Sustainable Data Architectures Through Data Architecture Automation
Countless organizations are developing new data platforms with modern architectures. Some are based on data warehouses, others on data lakes, data hubs, or data lakehouses. Regardless of the type of architecture, sustainability has become the key requirement for every new data architecture: organizations demand data platforms with sustainable data architectures.
A sustainable data architecture does not merely support current data requirements and newly identified ones; it is also adaptable and extensible enough to support yet unknown future requirements. The latter is important for dealing with completely unexpected and unplanned changes that lead to new requirements. For example, an organization may acquire a company that operates in a different industry with totally different data requirements. Likewise, urgent market developments resulting from business opportunities, disasters, or aggressive new competitors can lead to new data requirements.
A sustainable data architecture is one that can survive for a long time, because it is easy to adapt and extend. A sustainable data architecture enables an organization to quickly implement current, known upcoming and yet unknown future data requirements. When data consumption requirements change, the data architecture adapts accordingly without the need for major reengineering and redevelopment exercises. Sustainable data architectures are nimble.
But this is all easier said than done. Data warehouse automation tools can come to the rescue. Basically, data warehouse automation tools are generators: they transform higher-level specifications into lower-level specifications that are then executed by specific runtime technologies, such as database servers, messaging products, or ETL engines. Compilers are a familiar example of generators.
In the world of data warehouses and other data architectures, we have been using generators for a long time. However, most of those generators produce one component of an entire data platform, such as an application or database. For example, an ETL tool generates ETL programs, BI tools generate SQL statements, and data modeling tools generate data structures.
This means that multiple, independent generators are required to generate an entire data platform. Since these generators require similar specifications, the same specifications are defined multiple times; in other words, they are duplicated. The challenge is to keep all those specifications consistent, to make sure they work together optimally, and to guarantee that when one specification changes, all its duplicates are changed accordingly.
Many of the tasks involved in designing and developing data platforms are quite repetitive and formalizable, which makes them well suited to generators. For example, when an enterprise data warehouse uses a data vault design technique and the physical data marts use star schemas, both can be generated from one central data model, including the ETL code that copies the data from the warehouse to the data marts.
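To make the idea concrete, here is a minimal sketch of model-driven generation. All names (`hub_customer`, `sat_customer`, the `MODEL` spec format) are invented for illustration; no real automation tool's API or templates are implied. One central model drives the data-vault hub in the warehouse, the star-schema dimension in the mart, and the ETL statement between them:

```python
# Hypothetical central model specification for one business entity.
MODEL = {
    "entity": "customer",
    "business_key": "customer_id",
    "attributes": ["name", "city", "segment"],
}

def generate_hub(model):
    """Data-vault hub: the business key plus load metadata."""
    return (
        f"CREATE TABLE hub_{model['entity']} (\n"
        f"  {model['business_key']} VARCHAR PRIMARY KEY,\n"
        f"  load_ts TIMESTAMP,\n"
        f"  record_source VARCHAR\n"
        f");"
    )

def generate_dimension(model):
    """Star-schema dimension for the data mart, from the same model."""
    cols = ",\n".join(f"  {a} VARCHAR" for a in model["attributes"])
    return (
        f"CREATE TABLE dim_{model['entity']} (\n"
        f"  {model['entity']}_key INT PRIMARY KEY,\n"
        f"  {model['business_key']} VARCHAR,\n"
        f"{cols}\n);"
    )

def generate_etl(model):
    """ETL statement copying warehouse data into the mart dimension.
    Assumes a satellite table sat_<entity> holds the attributes."""
    cols = ", ".join(model["attributes"])
    return (
        f"INSERT INTO dim_{model['entity']} ({model['business_key']}, {cols})\n"
        f"SELECT h.{model['business_key']}, {cols}\n"
        f"FROM hub_{model['entity']} h\n"
        f"JOIN sat_{model['entity']} s\n"
        f"  ON h.{model['business_key']} = s.{model['business_key']};"
    )

if __name__ == "__main__":
    for part in (generate_hub(MODEL), generate_dimension(MODEL), generate_etl(MODEL)):
        print(part, "\n")
```

The point of the sketch is not the SQL itself but the single source of truth: change the model once, regenerate, and the hub, the dimension, and the ETL stay consistent by construction.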
The principles that apply to generators of individual platform components can also be applied to generators of entire data architectures. This category of generators operating on the architectural level is called data warehouse automation tools. They do not generate code for one component of the architecture, but for several and sometimes for the entire architecture. Traditional data warehouse automation tools generate, for example, staging areas, enterprise data warehouses, physical data marts, the ETL solutions that copy data from one database to another, and metadata. Several of these tools have been on the market for many years and have proven their worth.
A limitation of various data warehouse automation tools is that they only generate data platforms with traditional data warehouse architectures. Such platforms are only suitable for a limited set of data consumption forms. In other words, they are single-purpose data platforms. That does not make them very sustainable.
To develop sustainable data architectures, generators are required that can generate other data architectures in addition to the more traditional data warehouse architectures, such as data lake and data hub architectures. Such generators can produce data platforms for forms of data consumption beyond those supported by data warehouse architectures. If architects want to replace physical data marts developed with SQL databases by virtual data marts implemented with SQL views, the generator should support this. Or, if the central data warehouse needs to be replaced with a more data hub-like solution, or the ETL solution with a streaming solution, the generator should make this possible by simply regenerating the platform.
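As an illustration of that regeneration idea, the following hypothetical sketch derives either a physical data mart (a table plus its load statement) or a virtual data mart (a SQL view) from the same specification. The spec format and all names are invented for the example; switching the target is a regeneration flag, not a redevelopment project:

```python
# Hypothetical mart specification, independent of the physical target.
SPEC = {
    "mart": "sales_by_region",
    "source": "dw.fact_sales",
    "select": "region, SUM(amount) AS total_amount",
    "group_by": "region",
}

def generate_mart(spec, virtual=False):
    """Generate DDL for the same mart as a table or as a view."""
    query = (
        f"SELECT {spec['select']}\n"
        f"FROM {spec['source']}\n"
        f"GROUP BY {spec['group_by']}"
    )
    if virtual:
        # Virtual data mart: a view that always reflects the warehouse.
        return f"CREATE VIEW {spec['mart']} AS\n{query};"
    # Physical data mart: a table plus the ETL statement that loads it.
    return (
        f"CREATE TABLE {spec['mart']} (region VARCHAR, total_amount DECIMAL);\n"
        f"INSERT INTO {spec['mart']}\n{query};"
    )

if __name__ == "__main__":
    print(generate_mart(SPEC))                # physical mart
    print(generate_mart(SPEC, virtual=True))  # virtual mart
```

Because the consumption logic lives in the specification rather than in hand-written DDL and ETL, moving from physical to virtual marts is a matter of regenerating with a different flag.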
These generators exist. For them, the term data warehouse automation is probably a misnomer; data architecture automation tool is more appropriate.
The whitepaper ‘Sustainable Data Architectures Using Data Warehouse Automation’ describes the need for such automation tools in more detail and explains how WhereScape qualifies as a data architecture automation tool: https://www.wherescape.com/resources/whitepaper-sustainable-data-architectures-using-data-warehouse-automation/
Reader comment (author of 'Enterprise Architecture Fundamentals', founder and owner of Caminao, 3 years ago): For enterprises immersed in competitive digital environments, "Sustainable Data Architectures" can be compared to dry docks. https://caminao.blog/enterprise-architecture-fundamentals-the-book/book-pick-data-information-knowledge/