
Data Vault Modeling

Murali Krishna Vysyaraju (TOGAF Certified)

Assistant Vice President - Genpact

Published: March 13, 2016

The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. (This is the formal definition as written by its inventor, Dan Linstedt.)

It is a hybrid approach that combines the best aspects of third normal form (3NF) and star schema. The design is flexible, scalable, consistent, and adaptable to the needs of the enterprise. It is a data model architected specifically to meet the needs of today’s enterprise data warehouses.

The main point here is that Data Vault (DV) was developed specifically to address the agility, flexibility, and scalability issues found in the other mainstream data modeling approaches used in the data warehousing space. It was built to be a granular, non-volatile, auditable, historical repository of enterprise data.

At its core is a repeatable modeling technique that consists of just three main types of tables:

Hubs = Unique list of business keys
Links = Unique list of associations / transactions
Satellites = Descriptive data for Hubs and Links (Type 2 with history)
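
As a rough illustration of the three table types, here is a minimal sketch using Python's built-in sqlite3 module. The customer/order domain, every table and column name, and the use of MD5 hash keys are assumptions made for this example; they are not taken from the article's (missing) diagram.

```python
import hashlib
import sqlite3


def hash_key(*business_keys: str) -> str:
    """Derive a deterministic key from one or more business keys (assumed MD5)."""
    normalized = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Hypothetical schema: two hubs, one link between them, one satellite.
DDL = """
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,      -- hash of the business key
    customer_id   TEXT NOT NULL UNIQUE,  -- the business key itself
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      TEXT PRIMARY KEY,
    order_id      TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_order (
    customer_order_hk TEXT PRIMARY KEY,  -- hash of both business keys
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    order_hk      TEXT NOT NULL REFERENCES hub_order (order_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    UNIQUE (customer_hk, order_hk)       -- each association stored once
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_date     TEXT NOT NULL,         -- one row per version (history)
    record_source TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the example tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn
```

Note that the hubs hold only keys, the link holds only references to hubs, and all descriptive attributes live in the satellite, which is keyed by hub key plus load date so that several versions can coexist.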

Hubs make it business driven and allow for semantic integration across systems.

Links give you the flexibility to absorb structural and business rule changes without re-engineering (and therefore without reloading any data).

Satellites give you the adaptability to record history at any interval you want plus unquestionable auditability and traceability to your source systems.
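
To make the history-tracking idea concrete, here is a small, hedged sketch of an insert-only satellite load in Python. The `SatelliteRow` class, the `hash_diff` and `load_satellite` helpers, and the MD5-based change detection are illustrative assumptions, not a prescribed implementation.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SatelliteRow:
    hub_key: str
    load_date: str      # ISO-8601, so string order matches time order
    record_source: str  # where the row came from (traceability)
    hash_diff: str      # fingerprint of the descriptive attributes
    attributes: tuple   # sorted (name, value) pairs


def hash_diff(attributes: dict) -> str:
    """Fingerprint the attributes so a change can be detected with one comparison."""
    payload = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def load_satellite(satellite: list, hub_key: str, attributes: dict,
                   load_date: str, record_source: str) -> bool:
    """Append a new version only if the attributes changed; never update in place."""
    new_diff = hash_diff(attributes)
    history = [row for row in satellite if row.hub_key == hub_key]
    if history:
        latest = max(history, key=lambda row: row.load_date)
        if latest.hash_diff == new_diff:
            return False  # nothing changed, so no new row is written
    satellite.append(SatelliteRow(hub_key, load_date, record_source,
                                  new_diff, tuple(sorted(attributes.items()))))
    return True
```

Because existing rows are never modified or deleted, every earlier state remains queryable, and the `record_source` column ties each version back to the system that supplied it.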

Here is a simple example of what a Data Vault 2.0 model looks like:

Reference article: https://www.snowflake.net/blog/ability-to-connect-to-snowflake-with-jdbc/

Data Vault reference materials

Anand R Rao

9 years ago

Great article, Murali. In fact, we experimented with Data Vaults; they are really useful when you want to advance or extend a data warehouse for future BI analytics. It would be really great if you could share some insights on "What next after Data Vault 2.0" and how we can extend it further for future implementations ...
