Member-only story
Why does cloud make Data Lakes better?
A data lake is conceptual data architecture which is not based on any specific technology. So, the technical implementation can vary technology to technology, which means different types of storage can be utilized, which translates into varying features.
A data lake is a vast pool of raw data, the purpose for which is not yet defined. A data warehouse is a repository for structured, filtered data that has already been processed for a specific purpose. Data lakes allow for a combination of structured and unstructured data, which tends to be a better fit for healthcare companies.
The main focus of a data lake is that it is not going to replace a company’s existing investment in its data warehouse/data marts. In fact, they complement each other very nicely. With a modern data architecture, organizations can continue to leverage their existing investments, begin collecting data they have been ignoring or discarding, and ultimately enable analysts to obtain insights faster. Employing cloud technologies translates costs to a subscription-based model which requires much less up-front investment for both cost and effort.
The most of the organizations are enthusiastically considering cloud for functions like Hadoop, Spark, data bases, data warehouses, and analytics applications. This makes sense to build their data lake in the cloud…