DATA ARCHITECTURE: A BRIEF HISTORY
Bill Inmon
Founder, Chairman, CEO, Best-Selling Author, University of Denver & Scalefree Advisory Board Member
A BRIEF HISTORY OF DATA ARCHITECTURE
(Part 3 of 4)
By W H Inmon
The data architect lives in the world of today. And that world is constantly changing. To understand the data architect, it makes sense to stand back and see how data architecture has progressed over the years.
That progression of data architecture can be traced as follows.
In the beginning there were applications. These applications were new to the world and covered all sorts of territory: human resources, engineering, inventory management, and so forth. These applications were supplied by many different vendors and built on many different technologies. Each of these applications was designed to operate independently.
Some of these applications were very useful and some were not. Some of these applications stood the test of time and some didn’t. Many of these applications were written by vendors who are not in business today.
The technicians earned high salaries from these applications because corporations needed the applications and needed people who understood them and could run them.
A byproduct of the execution of the applications is data. This data is often known as the legacy data environment. The legacy data is the data that spewed out as a result of the processing of the legacy applications. The legacy data was very disorganized. It came from different vendors and was never designed to be integrated into a cohesive structure. But it was collected in any case.
Yet the legacy data was the data that formed the early foundation for analytical processing in the corporation.
From the miasma of the legacy environment came the data warehouse. The data warehouse mandated that the legacy data be integrated and organized. To this end came ETL (extract/transform/load). ETL served the purpose of integrating the legacy data so that the organization could have a data warehouse.
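As a rough illustration of what ETL does, the following Python sketch integrates records from two legacy applications that describe the same employee in different layouts. Everything in it is an assumption made for the example: the sample rows, the field names, the date formats, and the rule that the first record loaded for a key wins. It is a minimal sketch of the idea, not a description of any particular ETL product.

```python
from datetime import date, datetime

# Hypothetical extracts from two legacy applications that were never
# designed to work together (field names and formats are invented).
HR_ROWS = [{"EMP_NO": "0042", "NAME": "SMITH, JANE", "HIRED": "03/15/1998"}]
PAYROLL_ROWS = [{"employee": 42, "full_name": "Jane Smith", "start": "1998-03-15"}]


def transform_hr(row):
    """Convert an HR record into the common warehouse layout."""
    last, first = [part.strip() for part in row["NAME"].split(",")]
    return {
        "employee_id": int(row["EMP_NO"]),
        "name": f"{first.title()} {last.title()}",
        "hire_date": datetime.strptime(row["HIRED"], "%m/%d/%Y").date(),
        "source": "hr",
    }


def transform_payroll(row):
    """Convert a payroll record into the common warehouse layout."""
    return {
        "employee_id": int(row["employee"]),
        "name": row["full_name"],
        "hire_date": date.fromisoformat(row["start"]),
        "source": "payroll",
    }


def load(warehouse, records):
    """Load records keyed by employee_id; the first record for a key wins."""
    for record in records:
        warehouse.setdefault(record["employee_id"], record)
    return warehouse


def run_etl():
    """Extract from both sources, transform to one layout, load one store."""
    warehouse = {}
    load(warehouse, (transform_hr(row) for row in HR_ROWS))
    load(warehouse, (transform_payroll(row) for row in PAYROLL_ROWS))
    return warehouse
```

Calling `run_etl()` yields a single integrated record for employee 42, even though the two sources disagreed on key format, name format, and date format.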
The data warehouse contained the integrated, detailed data – the bedrock data – that depicted how the corporation was being run. That detailed data could be reshaped to meet the needs of different organizations. Yet there was still reconcilability of data as long as it came from the data warehouse because there was a solid, believable foundation of bedrock data.
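The point about reconcilability can be sketched in a few lines of Python. The detail rows, the column names, and the two departmental views below are hypothetical and exist only for illustration. The idea shown is that different reshapings of the same bedrock detail still add up to the same total.

```python
from collections import defaultdict

# Hypothetical detailed (bedrock) rows held in the data warehouse.
WAREHOUSE_SALES = [
    {"order_id": 1, "region": "East", "month": "2023-01", "amount": 100},
    {"order_id": 2, "region": "West", "month": "2023-01", "amount": 250},
    {"order_id": 3, "region": "East", "month": "2023-02", "amount": 75},
]


def reshape(detail_rows, key):
    """Summarize the detail rows by one column, as a department might."""
    totals = defaultdict(int)
    for row in detail_rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# Two departments look at the same detail in different ways.
marketing_view = reshape(WAREHOUSE_SALES, "region")
finance_view = reshape(WAREHOUSE_SALES, "month")

# Because both views come from the same detail, their totals reconcile.
views_reconcile = sum(marketing_view.values()) == sum(finance_view.values())
```

Had each department summarized its own unintegrated copy of the data instead, nothing would guarantee that the two totals agree.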
Soon it was noticed that other architectural structures were needed, principally the ODS (operational data store) and data marts. Each department needed its own interpretation of the data found in the data warehouse. That is where data marts fit.
Some organizations needed real-time integrated data. The ODS served that purpose.
Then, over time, a whole host of other types of data started to appear. There was textual data. There was analog data. There was data from the Internet. At first, these different forms of data did not fit well into the data architecture.
But over time these different types of data were absorbed into a larger structure, the data lakehouse. The data warehouse grew to accept all sorts of different kinds of data and became the data lakehouse.
The architectural rendition presented here has omitted many details in order to keep the article short and readable. The purpose of the discussion is to describe the general flow of data architecture, not to be a tutorial on data architecture.
The progression that has been described is the progression that the data architect has lived through. Understanding that progression helps one to understand the data architect and the challenges faced by the data architect.
Bill Inmon lives in Denver with his wife and his two Scotty dogs – Jeb and Lena. Lena was groomed a few days ago and she shows off, like a Scotty dog. But tomorrow she is having her teeth cleaned and she doesn’t like that.