THE PROBLEM WITH ELT

By W H Inmon

A long time ago the world woke up with a problem with data. There were all of these systems that had similar data and there was no way to connect and analyze the data. No one knew what data was right and what data was wrong. Data could not be shared among all the different and diverse sources with any degree of confidence.

Into that world came the data warehouse. With the data warehouse came the recognition that data needed to be transformed before it could be shared across the corporation. Someone needed to go into the many disparate systems of data and find out what was right and what was wrong. Determining the system of record in this way depended on a process called ETL – extract/transform/load.

In ETL data is extracted. It is transformed. And only after it is transformed is that data then loaded into the corporate data base – the data warehouse.

Loading only transformed data formed a fundamentally sound foundation for corporate analysis. Now you could start to look across data in the corporation in a unified, believable manner.
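The ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular ETL product: the two source systems, their field names, and the `CustomerRevenue` record are all hypothetical. What it shows is structural: `load` only ever receives records that `transform` has already reconciled.

```python
from dataclasses import dataclass

# Hypothetical rows from two source systems that describe
# the same customer with different names, codes, and units.
CRM_ROWS = [{"cust_id": "17", "gender": "M", "revenue_usd": "1200.50"}]
BILLING_ROWS = [{"customer": 17, "sex": "male", "revenue_cents": 45000}]

GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


@dataclass(frozen=True)
class CustomerRevenue:
    """The single agreed-upon shape stored in the warehouse (assumed)."""
    customer_id: int
    gender: str
    revenue_usd: float


def extract():
    """E: pull rows out of each source system unchanged."""
    return list(CRM_ROWS), list(BILLING_ROWS)


def transform(crm_rows, billing_rows):
    """T: reconcile keys, codes, and units into one shape."""
    unified = []
    for row in crm_rows:
        unified.append(CustomerRevenue(
            customer_id=int(row["cust_id"]),
            gender=GENDER_CODES[row["gender"].lower()],
            revenue_usd=float(row["revenue_usd"]),
        ))
    for row in billing_rows:
        unified.append(CustomerRevenue(
            customer_id=int(row["customer"]),
            gender=GENDER_CODES[row["sex"].lower()],
            revenue_usd=row["revenue_cents"] / 100,  # cents to dollars
        ))
    return unified


def load(warehouse, records):
    """L: append already-transformed records to the warehouse."""
    warehouse.extend(records)


def run_etl(warehouse):
    """Run E, then T, then L; there is no path that skips T."""
    crm_rows, billing_rows = extract()
    load(warehouse, transform(crm_rows, billing_rows))
```

Because `run_etl` passes the output of `transform` straight into `load`, skipping the transformation would mean rewriting the pipeline rather than merely neglecting a later step.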

The process of transformation was never an easy process. It was always complex and time consuming. It required a four-letter word that is universally hated around the world. It required – work.

Then one day someone invented the process known as ELT – extract/load/transform. In this process we first extract the data. We load it in. And then we transform it. It sounds simple.

ELT sounds good as a theory. But it is woefully inadequate in practice. In fact, ELT leads to a really nasty problem. What happens is that people extract data. They load it into wherever it is going. And then they conveniently forget to transform it. They have a thousand excuses why they can’t transform the data –

• There is too much data to consider transformation
• Transformation is a cumbersome, complex process. It requires work
• Transformation requires a lot of resources
• We don’t have time for transformation
• We don’t really know how to do transformation

So what happens is that transformation is forgotten and just does not occur. E occurs. L occurs. But T is conveniently tucked away and is never done.
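The failure mode is easy to picture in code. The sketch below is an illustration under stated assumptions, using Python's built-in `sqlite3` module; the `raw_` and `curated_` table prefixes, the column names, and the helper functions are hypothetical and not taken from any specific ELT tool. Raw data lands first, transformation is a separate later step, and a small check lists the sources for which that step never happened.

```python
import sqlite3


def extract_and_load(conn, source, rows):
    """E and L: land source rows, untouched, in a raw_<source> table.

    Table names are interpolated directly, so `source` must be a
    trusted identifier; this is a sketch, not production code.
    """
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS raw_{source} (customer TEXT, amount TEXT)"
    )
    conn.executemany(f"INSERT INTO raw_{source} VALUES (?, ?)", rows)


def transform(conn, source, amount_in_cents):
    """T: build curated_<source> with typed keys and amounts in dollars."""
    divisor = 100.0 if amount_in_cents else 1.0
    conn.execute(
        f"CREATE TABLE curated_{source} AS "
        f"SELECT CAST(customer AS INTEGER) AS customer_id, "
        f"CAST(amount AS REAL) / {divisor} AS amount_usd "
        f"FROM raw_{source}"
    )


def untransformed_sources(conn):
    """Return sources that were loaded but never transformed."""
    tables = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    prefix = "raw_"
    return sorted(
        name[len(prefix):]
        for name in tables
        if name.startswith(prefix)
        and "curated_" + name[len(prefix):] not in tables
    )
```

If two sources are loaded with `extract_and_load` and `transform` is called for only one of them, `untransformed_sources` returns the name of the other. Nothing in the load step itself objects to the omission, which is the gap described above.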

It reminds me of a story (a true story). When I was young, my mother assigned my older sister to make lunch for us. My older sister (God love her) made soup. She made sandwiches. She scooped some ice cream into a bowl. Then she ate the ice cream and threw the soup and sandwiches away. As a young child I just knew this wasn’t right. You could not eat your ice cream until you had your nutrition.

The people that want to do ELT want the easy way out. They don’t want to do the hard work of transformation. Transformation is complex. It is messy. It requires resources. But it also means that data across the enterprise is usable. Without transformation, there is no corporate understanding of data.

So what do these geniuses that gave us ELT end up with? They end up with a myriad of data bases that can’t be related to each other. They end up with a big mess on their hands. There is no corporate data when you do ELT.

Doesn’t that sort of sound like where we started this whole commotion? ELT is a blind alley that has enormous negative consequences because it is almost never implemented properly.

Bill Inmon is the father of the data warehouse. Bill’s company – Forest Rim Technology – creates software that does textual disambiguation. With textual disambiguation you can read raw text and turn that raw text into a data base that can then be analyzed. Bill is in Denver, Colorado.

Zoran Milokanović

Power BI Trainer and Consultant at BitanBit

1 year ago

Mi?a Pavlović

Empowering Enterprises with Scalable, AI-Driven Data Platforms

2 years ago

Actually, in my experience this is only one aspect of the problem. When you do a layered transformation, e.g. core layer to core layer, you are creating a derived value which is then stored in the DWH. I would say that such data must be subject to Change Data Capture to ensure lineage, as some regulators require (see BCBS 239, 249, etc.). The exact same principle should apply to ELT transformations. So it is actually much less convenient to manage your single source of truth with ELT.

Simon Wilson

Contractor Providing Data Architecture, Governance, Migration and Strategy Services.

2 years ago

I think the basic point you're all missing here is that the ETL system is more reliable, because you have to do the transformation work before you can do the loading. Therefore, the transformation work in an ETL system is always going to get done, whereas the transformation work in an ELT system can be forgotten (whether by accident or on purpose), leaving you with an incomplete process and a load of garbage in your database. I completely agree with you, Bill!

Joni Ahola

Senior Manager, SAP Analytics Country Lead (FIN) at Accenture

3 years ago

I am a believer in both ways of extracting and integrating data (in the context of ETL vs ELT). I also fully agree that the hard part is often forgotten in ELT, as Bill describes in the article, which results in a poor state of affairs in your data platform.

Looking at this from a slightly wider point of view, we are facing a similar issue with enterprise data platforms vs embedded, or operational, analytics capabilities from ERP and other operational applications. I work a lot with SAP Analytics. We are often challenged on why companies should use SAP Analytics (e.g. S4 Embedded, SAC, DWC...BW4) if their strategy is to load all data into enterprise platforms such as Azure, GCP, or AWS and make it available to everyone. Of course there is no one answer to this, but here are a few points to consider:

1. Try to use the tools available (and often included in the licence) as close to the source data and application as possible to support business operations and decision making. This makes information available in a timely way, often reusing the logic embedded in the process, and requires minimal transformation (or extraction).

2. Find the best fit for the circumstance. Don't be bullheaded and stubbornly dump everything into the enterprise platform if it adds no value. Look at the tools and processes that best fit the use or business case. I am not saying to add tools and applications to the landscape as you please, but to work with the standards and principles of your enterprise architecture strategy.

3. You can do both. You can use the tools close to the source solution to build analytics that support the scenario in focus, and at the same time integrate the same data into the enterprise platform. Be careful, however, to apply the same rules and transformations as exist in the business application ...and this draws us back to the question in hand, ETL vs ELT.

Ajay Tripathy

Architecting Scalable Data Solutions | Fostering BI & Analytics Cultures

3 years ago

It depends on how you design your system. Today we have powerful databases (e.g. #Snowflake, #Redshift, #Bigquery etc.), and we should not look at ETL as a single process; rather, we should break it up. Bringing data from different sources into the PSA or RAW layer is purely data ingestion, and we don't need any transformation there. When we start curating the data for our EDW layer or for a data mart with specific business logic, we apply transformation. I believe that's the advantage. We don't need to transform everything we ingest. Transformation can happen on an as-needed basis.
