Should I prefer ELT over ETL?

Should I prefer ELT over ETL?

ETL = Extract Transform Load

ELT = Extract Load Transform

BI = Business Intelligence

Over last two decades, ETL process was core of every BI solution. Using ETL once can simply retrieve the data from source database, transform it based on how downstream system wants to consume it and finally load the data in target database.

This process worked certainly, when you know well in advance how you want to consume data or in which format/structure you would like to keep into target data warehouse. Also when you had to keep database and ETL infrastructure resources separately.

Now a days when we are moving towards data intensive applications, each and every bit of data could be useful for organization. Without determining in advance how to consume and persist the data in data warehouse, businesses are interested in dumping the data (from wherever they can get) in centralize repository (sometimes called as Data Lake). Later figuring out how to consume it in meaning full manner. In order to resonate with this thought, focus is shifting from ETL to ELT

Another catalyst is the growing popularity of cloud data warehouses like Snowflake,Redshift which can scale on demand and provide extremely cheaper storage, one need not to think twice storing(load) the data and later provisioning(Transform) to support different use cases. On other hand these cloud data warehouses provides ease of querying semi structured data like JSON/XML

No alt text provided for this image


When to prefer ELT over ETL

  • When you want to avoid uncertainty of determining downstream use cases well in advance
  • To achieve faster implementation cycle and reduce operation overhead of ETL tool vendor management and support
  • Leverage offerings of cloud databases such as extremely cheaper storage, compute and storage resource segregation etc.
  • Reduce change management, defect fixing or application enhancement time cycles
  • No dedicated ETL Infrastructure required: Savings in Total Cost of Ownership


Amandeep Singh

Senior Machine Learning Engineer (Google Cloud Platform | AZURE | AWS | Data Science | AI | ML | MLOPS | Python | Spark | Kafka | Java | Scala | Spring Boot | DBT| Cloud Data Fusion| DataForm | Azure Synapse | SnowPark)

4 年

Agreed !! But for real-time streaming analytics, we have to choose ETL on top of ELT.

要查看或添加评论,请登录

Rahul Jain ????的更多文章

社区洞察

其他会员也浏览了