Data as a Product - Data architecture: the management blueprint
Today, every organisation wants to ensure that its data is managed properly and meets its business needs for information and more. To that end, organisations invest in creating a strong data architecture, a discipline that documents an organisation's data assets, maps how data flows through its systems and provides a blueprint for managing data.
While data architecture can support operational applications, its output includes a multilayer framework for data platforms and data management tools, as well as specifications and standards for collecting, integrating, transforming and storing data. It most prominently defines the underlying data environment for business intelligence (BI) and advanced analytics initiatives.
Ideally, data architecture design is the first step in the data management process. But that usually isn't the case, which creates inconsistent environments that need to be harmonised as part of a data architecture. Also, despite their foundational nature, data architectures aren't set in stone and must be updated as data and business needs change. That makes them an ongoing concern for data management teams.
Data architecture goes hand in hand with data modeling, which creates diagrams of data structures, business rules and relationships between data elements. They're separate data management disciplines, though.
This article on data architecture further explains what it is, why it's important and the business benefits it provides. You'll also find information on data architecture frameworks, best practices and more.
Evolution of data architectures
In the past, most data architectures were less complicated than they are now. They mostly involved structured data from transaction processing systems that was stored in relational databases. Analytics environments consisted of a data warehouse, sometimes with smaller data marts built for individual business units and an operational data store as a staging area. The transaction data was processed for analysis in batch jobs, using traditional extract, transform and load (ETL) processes for data integration.
Starting in the mid-2000s, the adoption of big data technologies in businesses added unstructured and semistructured forms of data to many architectures. That led to the deployment of data lakes, which often store raw data in its native format instead of filtering and transforming it for analysis upfront -- a big change from the data warehousing process. The new approach is driving wider use of ELT data integration, an alternative to ETL that inverts the load and transform steps.
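The difference between ETL and ELT comes down to when the transform step runs relative to the load. The following is a minimal sketch in Python, assuming plain in-memory lists stand in for a warehouse and a lake; the function names and sample records are hypothetical and only illustrate the ordering of steps.

```python
def extract() -> list[dict]:
    # Hypothetical raw records as they might arrive from a source system.
    return [
        {"customer": " alice ", "amount": "10.50"},
        {"customer": "BOB", "amount": "3"},
    ]

def transform(records: list[dict]) -> list[dict]:
    # Clean up names and convert amounts from text to numbers.
    return [
        {"customer": r["customer"].strip().title(), "amount": float(r["amount"])}
        for r in records
    ]

def run_etl(warehouse: list[dict]) -> None:
    # ETL: transform before loading, so only cleaned data reaches the store.
    warehouse.extend(transform(extract()))

def run_elt(lake: list[dict]) -> list[dict]:
    # ELT: load the raw data first, then transform it later from the store.
    lake.extend(extract())
    return transform(lake)

warehouse: list[dict] = []
lake: list[dict] = []
run_etl(warehouse)
curated = run_elt(lake)
```

Both paths end with the same cleaned records, but in the ELT path the store still holds the untouched raw data, which is the property the data lake approach relies on.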
The increased use of stream processing systems has also brought real-time data into more data architectures. Many architectures now support artificial intelligence and machine learning applications, too, in addition to the basic BI and reporting driven by data warehouses. The shift to cloud-based systems further adds to the complexity of data architectures.
Another emerging architecture concept is the data fabric, which aims to streamline data integration and management processes. It has a variety of potential use cases in data environments.
Why are data architectures important?
A well-designed data architecture is a crucial part of the data management process. It supports data integration and data quality improvement efforts, as well as data engineering and data preparation. It also enables effective data governance and the development of internal data standards. Those two things, in turn, help organisations ensure that their data is accurate and consistent.
A data architecture is also the foundation of a data strategy that supports business goals and priorities. In an article on key data strategy components, Donald Farmer, principal of consultancy TreeHive Strategy, wrote that "a modern business strategy depends on data." That makes data management and analytics too important to leave to individuals, Farmer said. To manage and use data well, an organisation needs to create a comprehensive data strategy, underpinned by a strong data architecture.
What are the characteristics and components of a data architecture?
Modern data architecture principles call for building in data governance and regulatory compliance processes, and for supporting the growing use of multi-cloud environments. In my observation, data's potential business value is wasted if a data architecture doesn't make it available for analytics uses.
"It's a cliché of modern data management that data is a business asset." But data that just sits there is only a cost center, requiring maintenance without providing any business benefits.
From a purist's point of view, data architecture components don't include platforms, tools and other technologies. Instead, a data architecture is a conceptual infrastructure that is described by a set of diagrams and documents. Data management teams then use them to guide technology deployments and how data is managed.
Some examples of those components, or artifacts, are as follows:
What are the benefits of a data architecture?
Ideally, a well-designed data architecture helps an organisation develop effective data analytics platforms that deliver useful information and insights. In companies, those insights improve strategic planning and operational decision-making, potentially leading to better business performance and competitive advantages.
Data architectures also aid various other applications, such as the diagnosis of medical conditions and scientific research. They also help improve data quality, streamline data integration and reduce data storage costs, among other benefits. They do so by taking an enterprise view, in contrast to domain-specific data modeling or a focus on architecture at the database level.
A well-constructed data architecture can offer businesses a number of key benefits, which include:
What are the risks of bad data architecture design?
One data architecture pitfall is too much complexity. The dreaded "spaghetti architecture" is evidence of that, with a tangle of lines representing different data flows and point-to-point connections. The result is a ramshackle data environment with incompatible data silos that are hard to integrate for analytics uses. Ironically, data architecture projects often aim to bring order to existing messy environments that developed organically. But if not managed carefully, they can create similar problems.
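A rough count shows how quickly point-to-point connections turn into a tangle. The sketch below assumes the worst case, in which every system exchanges data directly with every other one, and compares it with routing everything through a single shared integration layer. That comparison is an assumption added for illustration, not something this article prescribes, and the function names are hypothetical.

```python
def point_to_point_links(n_systems: int) -> int:
    # Every pair of systems needs its own connection: n * (n - 1) / 2.
    return n_systems * (n_systems - 1) // 2

def shared_layer_links(n_systems: int) -> int:
    # Each system connects once to a shared integration layer.
    return n_systems

for n in (5, 10, 20):
    print(n, point_to_point_links(n), shared_layer_links(n))
# prints:
# 5 10 5
# 10 45 10
# 20 190 20
```

Doubling the number of systems roughly quadruples the pairwise connections, which is why an environment that grows organically becomes hard to draw, let alone integrate.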
Another challenge is getting universal agreement on standardised data definitions, formats and requirements. Without that, it's hard to create an effective data architecture. The same goes for putting data in a business context. Done well, data architecture "captures the business meaning of the data required to run the organisation." But failing to do so may create a disconnect between the architecture and the strategic data requirements it's supposed to meet.
Data architecture vs. data modeling
Data modeling focuses on the details of specific data assets. It creates a visual representation of data entities, their attributes and how different entities relate to each other. That helps in scoping the data requirements for applications and systems and then designing database structures for the data, a process that's done through a progression of conceptual, logical and physical data models.
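To make the progression from logical to physical models more concrete, here is a minimal sketch that describes two entities and one relationship as Python data classes and then derives table definitions from them, run against an in-memory SQLite database. The "customer" and "purchase" entities, their attributes, the type mapping and the rule that the first attribute is the primary key are all hypothetical choices for illustration.

```python
import sqlite3
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    logical_type: str     # "integer", "text" or "decimal"
    references: str = ""  # entity this attribute points to, if any

@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)

# Logical-to-physical type mapping for SQLite (an illustrative choice).
SQL_TYPES = {"integer": "INTEGER", "text": "TEXT", "decimal": "REAL"}

def to_ddl(entity: Entity) -> str:
    # Sketch assumption: the first attribute is the primary key,
    # and referenced entities use "id" as their key column.
    columns = []
    for position, attr in enumerate(entity.attributes):
        column = f"{attr.name} {SQL_TYPES[attr.logical_type]}"
        if position == 0:
            column += " PRIMARY KEY"
        if attr.references:
            column += f" REFERENCES {attr.references}(id)"
        columns.append(column)
    return f"CREATE TABLE {entity.name} ({', '.join(columns)})"

# Logical model: entities, attributes and one relationship.
customer = Entity("customer", [Attribute("id", "integer"), Attribute("name", "text")])
purchase = Entity("purchase", [
    Attribute("id", "integer"),
    Attribute("customer_id", "integer", references="customer"),
    Attribute("total", "decimal"),
])

# Physical model: the same structures as tables in a concrete database.
conn = sqlite3.connect(":memory:")
for entity in (customer, purchase):
    conn.execute(to_ddl(entity))
```

The entity definitions stay independent of any database product; only `SQL_TYPES` and `to_ddl` would change if the physical target were something other than SQLite.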
Data architecture takes a more global view of an organisation's data to create a framework for data management and usage. Data models are a crucial element in data architectures, and an established data architecture simplifies data modeling. Below are a few recommendations for data modeling:
Data architecture vs. information architecture and enterprise architecture
The difference between data architecture and information architecture in enterprise applications starts from the idea that "information is data in context." Put another way, "an information architecture defines the context that an enterprise uses for its business operations and management." A data architecture that delivers high-quality, reliable data is the foundation for the information architecture.
Meanwhile, data architecture is commonly viewed as a subset of enterprise architecture (EA), which aims to create a blueprint for an organisation across four or more domains. EA also encompasses the following:
What data architecture frameworks are available?
Organisations can use standardised frameworks to design and implement data architectures instead of starting completely from scratch. These are three well-known framework options:
DAMA-DMBOK2. The DAMA Guide to the Data Management Body of Knowledge is a data management framework and reference guide created by DAMA International, a professional association for data managers. Now in its second edition and commonly known as DAMA-DMBOK2, the framework addresses data architecture along with other data management disciplines. The first edition was published in 2009, and the second one became available in 2017.
TOGAF. Created in 1995 and updated several times since then, TOGAF is an enterprise architecture framework and methodology that includes a section on data architecture design and roadmap development. It was developed by The Open Group, and TOGAF initially stood for The Open Group Architecture Framework. But it's now referred to simply as the TOGAF standard.
The Zachman Framework. This is an ontology framework that uses a 6-x-6 matrix of rows and columns to describe an enterprise architecture, including data elements. It doesn't include an implementation methodology; instead, it's meant to serve as the basis for an architecture. The framework was originally developed in 1987 by John Zachman, an IBM executive who retired from the company in 1990 and founded a consulting firm called Zachman International.
Key steps for creating a data architecture
Data management teams must work closely with business executives and other end users to develop a data architecture. If they don't, it may not be in tune with business strategies and data requirements. Engaging with senior execs to get their support and meeting with users to understand their data needs are two of the nine data architecture planning steps.
Developing a full-scale enterprise data architecture starts with several important steps that data architects must follow when devising a solid data architecture plan.
1. Socialise with senior leaders
As with any strategic technology initiative, the value of developing a data architecture must be effectively articulated and continually communicated to C-suite executives. Craft a message that demonstrates the benefits a data architecture brings to the enterprise. Identify and engage key stakeholders to gain their support.
2. Identify the data personas
An organisation's technology environment is driven by the information needs of data consumers. Application system custodians are accountable for the data sets their applications produce and use. Ascertain the people who create, store, update, read and otherwise touch data within the enterprise. Identify stereotypical personas and characterise them according to their data touch points.
3. Determine information requirements
Engage the data consumers to understand their business strategy and solicit their business requirements for data. Document how those requirements relate to the abstract data domains, such as "customer" or "product" data, and the discrete data sets these consumers currently use or anticipate needing.
4. Evaluate information risks
Identify and interpret data governance directives and how they relate to the handling, management and protection of data.
5. Assess the data landscape
Survey and document the name, location, owner, producer, consumers and contents of enterprise data sets. Classify each data set according to usage scenarios and sensitivity and collect this information in a data catalog.
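As one way to picture the result of this step, the sketch below models a catalog entry with the fields named above and keeps entries in a simple dictionary. The field values, the usage and sensitivity labels and the helper names are hypothetical; a real data catalog tool would have its own schema.

```python
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    name: str
    location: str
    owner: str
    producer: str
    consumers: list[str]
    contents: str
    usage: str        # usage scenario, e.g. "operational" or "analytics"
    sensitivity: str  # e.g. "public", "internal" or "restricted"

# The catalog itself: data set name -> entry.
catalog: dict[str, CatalogEntry] = {}

def register(entry: CatalogEntry) -> None:
    catalog[entry.name] = entry

def find_by_sensitivity(level: str) -> list[str]:
    # List the names of all data sets classified at the given level.
    return sorted(e.name for e in catalog.values() if e.sensitivity == level)

register(CatalogEntry(
    name="crm_customers_raw",
    location="on premises, CRM database",
    owner="Sales operations",
    producer="CRM application",
    consumers=["Marketing analytics", "Finance"],
    contents="Customer names, contact details, account status",
    usage="operational",
    sensitivity="restricted",
))
register(CatalogEntry(
    name="product_reference",
    location="cloud object storage",
    owner="Product management",
    producer="Product information system",
    consumers=["E-commerce site", "BI reporting"],
    contents="Product codes, descriptions, categories",
    usage="analytics",
    sensitivity="internal",
))

print(find_by_sensitivity("restricted"))  # prints ['crm_customers_raw']
```

Keeping the classification fields explicit makes it straightforward to answer questions such as which data sets need the strictest handling.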
6. Analyse the data lifecycles
Evaluate how data sets flow from their origination points to their final destinations. Document the data lineage mapping of data pipelines.
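A lineage mapping can be recorded as a simple graph in which each data set points to the data sets it is derived from. The sketch below, with hypothetical data set names, walks that graph to find everything upstream of a given data set and the points where its data originates.

```python
# Each data set maps to the data sets it is derived from (hypothetical names).
lineage: dict[str, list[str]] = {
    "sales_dashboard": ["sales_mart"],
    "sales_mart": ["warehouse_orders", "warehouse_customers"],
    "warehouse_orders": ["crm_orders_raw"],
    "warehouse_customers": ["crm_customers_raw"],
}

def upstream_sources(dataset: str) -> set[str]:
    # Walk the graph depth-first and collect every ancestor of the data set.
    seen: set[str] = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def origination_points(dataset: str) -> set[str]:
    # Ancestors with no recorded parents are where the data originates.
    return {d for d in upstream_sources(dataset) if d not in lineage}

print(sorted(origination_points("sales_dashboard")))
# prints ['crm_customers_raw', 'crm_orders_raw']
```

The same structure answers the reverse question too: inverting the mapping shows which downstream data sets are affected when a source changes.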
7. Appraise the data infrastructure
Document the current state of data management in the enterprise and capture the current technology infrastructure -- what systems, database structures, data warehouses, data marts and operational data stores are used, whether they're on premises or in the cloud and, if the latter, the cloud service providers.
8. Do a SWOT analysis
Synthesise the knowledge that has been collected and analyse the strengths, weaknesses, opportunities and threats. Identify the greatest opportunities for improvement.
9. Create a blueprint and roadmap
Devise a blueprint for framing the enterprise data architecture that summarises the collected knowledge and highlights proposed deployment projects. Scope out a roadmap for the proposed projects across the near-, medium- and longer-term horizons.
Among other steps, I also recommend organisations do the following:
What are the different roles in data architecture design and development?
The lead role in data architecture initiatives typically goes to data architects. They need a variety of technical skills, as well as the ability to interact and communicate with business users. A data architect spends a lot of time working with end users to document business processes and existing data usage, as well as new data requirements.
On the technical side, data architects create data models themselves and supervise modeling work by others. They also build data architecture blueprints, data flow diagrams and other artifacts. Other duties may involve outlining data integration processes and overseeing the development of data definitions, business glossaries and data catalogs. In some organisations, data architects also are responsible for designing data platforms and evaluating and selecting technologies.
Other data management professionals who often are involved in the data architecture process include the following: