Data Engineering — Construct a Standard Data Pipeline Design Pattern

Ryan Arjun
6 min read · Nov 3, 2023

One common challenge that developers and data professionals face in data processing and analysis is deciding whether to import data directly into a database or to store it as .csv files before loading. The decision is crucial because it affects data continuity, error logging, and data recovery in the event of a system failure.

The standard design pattern is to bring data into a blob storage area called raw, standardise it into Parquet or another columnar format, and then load it into a database system…
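The three stages described above (land raw, standardise, load) can be sketched roughly as follows. This is a minimal illustration, not a production design, and it makes several assumptions: local folders stand in for blob storage, a JSON Lines file stands in for Parquet so the example needs only the standard library, SQLite stands in for the database system, and the `sales` table with `id` and `amount` columns is hypothetical.

```python
from __future__ import annotations

import csv
import json
import sqlite3
from pathlib import Path


def land_raw(payload: str, raw_dir: Path, name: str) -> Path:
    """Stage 1: store the source data untouched so it can be replayed later."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    raw_path = raw_dir / name
    raw_path.write_text(payload, encoding="utf-8")
    return raw_path


def standardise(raw_path: Path, std_dir: Path) -> tuple[Path, list[str]]:
    """Stage 2: parse and type each row; log rejects instead of failing.

    Assumption: JSON Lines is used here only as a dependency-free
    stand-in for Parquet or another columnar format.
    """
    std_dir.mkdir(parents=True, exist_ok=True)
    std_path = std_dir / (raw_path.stem + ".jsonl")
    errors: list[str] = []
    with raw_path.open(newline="", encoding="utf-8") as src, \
            std_path.open("w", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        for row in reader:
            try:
                record = {"id": int(row["id"]), "amount": float(row["amount"])}
            except (KeyError, TypeError, ValueError) as exc:
                # Keep a line-level error log; the raw file still holds the bad row.
                errors.append(f"line {reader.line_num}: {exc}")
                continue
            dst.write(json.dumps(record) + "\n")
    return std_path, errors


def load(std_path: Path, conn: sqlite3.Connection) -> int:
    """Stage 3: load the standardised rows into the database in one transaction."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL)"
    )
    with std_path.open(encoding="utf-8") as src:
        rows = [(rec["id"], rec["amount"]) for rec in map(json.loads, src)]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO sales (id, amount) VALUES (?, ?)", rows
        )
    return len(rows)


def run_pipeline(payload: str, base_dir: Path, conn: sqlite3.Connection):
    """Run the three stages in order and report rows loaded plus rejects."""
    raw_path = land_raw(payload, base_dir / "raw", "sales.csv")
    std_path, errors = standardise(raw_path, base_dir / "standardised")
    return load(std_path, conn), errors
```

Because the raw file is written before any parsing happens, a failure in the later stages does not lose data: the standardise and load steps can be re-run from the raw copy, and rejected rows are reported with their line numbers rather than aborting the whole load.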

Ryan Arjun

BI Specialist || Azure || AWS || GCP — SQL|Python|PySpark — Talend, Alteryx, SSIS — PowerBI, Tableau, SSRS