Data Fabric – Collaborating and sharing across a Data Network
An artistic view of a Hyper Connected Data Network


In the latest blog in our “Architecting a Data Network” series, we will discuss how we connect our growing number of data assets, customers, and users across and beyond the organisation.

Digital organisations have more data sources and more data consumers than ever, many of them from non-traditional sources and places. As consumers and organisations become more interconnected, we become more interdependent. The more of us that are connected to each other, the more intensely the effects are felt everywhere, increasing the pace of change, which is at the heart of complex adaptive systems.


Fig 1 – Information Architecture foundation

An increasingly connected world has many more moving parts and is inherently more unpredictable. Data can help us better understand what is happening in that connected world and predict what could happen next. We need a way of managing data with agility and at ecosystem scale. In short, how do we make trusted data available and mobilise it across the growing ecosystem?


How do we combine Agility and Scale for Enterprise Data Management?

Over the last couple of decades, the industry has shifted back and forth between centralised data sharing (e.g., enterprise data hubs, service buses, data warehouses) and decentralised approaches (e.g., big data, schema-on-read, point-to-point architectures). We need to disrupt and evolve these data management patterns to cope with the emerging growth in data creation, the increasing velocity of data consumption, and a convergence that requires a more joined-up data view of our connected world. To organise data management at scale, we need to re-balance the responsibilities between data producers and data consumers so that distributed data sharing becomes possible.
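To make that re-balancing concrete, here is a minimal sketch of a data contract that a producing domain could publish and a consuming domain could check before subscribing. It is illustrative only; the field names and the `validate_consumer_needs` helper are assumptions, not part of any product referenced in this article.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class DataContract:
    """Minimal producer-side contract for a shared data asset."""
    name: str                    # logical name of the data product
    owner_domain: str            # producing domain accountable for the data
    schema: Dict[str, str]       # field name -> data type
    freshness_sla_minutes: int   # maximum tolerated staleness
    classification: str          # e.g. "public", "internal", "restricted"

def validate_consumer_needs(contract: DataContract,
                            required_fields: List[str],
                            max_staleness_minutes: int) -> List[str]:
    """Return the reasons this contract cannot serve the consumer (empty = fine)."""
    issues = [f"missing field: {f}" for f in required_fields
              if f not in contract.schema]
    if contract.freshness_sla_minutes > max_staleness_minutes:
        issues.append("freshness SLA is weaker than the consumer requires")
    return issues

orders = DataContract(
    name="orders",
    owner_domain="sales",
    schema={"order_id": "string", "amount": "decimal", "created_at": "timestamp"},
    freshness_sla_minutes=60,
    classification="internal",
)
# An empty list means the producer's published contract satisfies this consumer.
print(validate_consumer_needs(orders, ["order_id", "amount"], max_staleness_minutes=120))
```

The point of the sketch is that the producer states what it guarantees and the consumer checks its own needs against that statement, rather than a central team brokering every exchange.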

What we need is a data fabric architecture. Gartner defines a data fabric as "enabling frictionless access and sharing of data in a distributed data environment". We have extended this definition to:

"An on-demand, unified set of data management and integration tools and standards for integrating, sharing, and managing data."

Fig 2 – Data Fabric Architecture

Piethein Strengholt, in his book Data Management at Scale, describes a vision for enterprises in a hyper-connected world based on a scalable and highly distributed architecture that can easily connect data providers and data consumers while providing flexibility, control, and insight. This book (along with the other references below) inspired this data management article and is a recommended read for all data architects.

The data fabric works in tandem with data management and integration services. Together they make data quickly available and usable throughout the organisation and beyond, covering application-to-application, Analytics, AI, and all other data exchanges.

The data fabric approach provides a logical architecture layer for all data flows across data domains. It creates a holistic picture of those flows: we know which applications are connected, what data is exchanged, how it is routed, what the data means, what its lineage is, what the data quality is, and what consuming or providing role each data domain plays.
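As an illustration of that holistic picture, a single data flow could be captured as a metadata record along the lines of the sketch below. This is a simplification of what a real catalogue would hold, and the field names are assumptions rather than a defined standard.

```python
from dataclasses import dataclass
from enum import Enum

class DomainRole(Enum):
    PROVIDER = "provider"
    CONSUMER = "consumer"

@dataclass
class DataFlow:
    """One edge in the fabric's holistic picture of data movement."""
    source_application: str
    target_application: str
    dataset: str
    routing: str            # e.g. "streaming", "api", "batch"
    meaning: str            # business description of the exchanged data
    quality_score: float    # 0.0 - 1.0, as reported by data quality services
    source_role: DomainRole
    target_role: DomainRole

flows = [
    DataFlow(
        source_application="crm",
        target_application="analytics-platform",
        dataset="customer_profile",
        routing="streaming",
        meaning="Customer record published on every profile change",
        quality_score=0.97,
        source_role=DomainRole.PROVIDER,
        target_role=DomainRole.CONSUMER,
    ),
]

# Lineage for a dataset is simply the set of catalogued flows that touch it.
lineage = [f for f in flows if f.dataset == "customer_profile"]
print(f"{len(lineage)} flow(s) recorded for customer_profile")
```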

An important architectural consideration is that the data fabric is designed as a 'data fabric as a platform' that is domain-agnostic and abstracts all the underlying complexity, so that data management and integration components can be provided in a self-service style for distributed consumption. This is key to providing scale and to allowing squads within the data domains to consume the services with autonomy.
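A minimal sketch of what that self-service style could look like from a domain squad's point of view, assuming a hypothetical platform facade; the `FabricPlatform` class and its methods are illustrative and do not describe a real product interface.

```python
class FabricPlatform:
    """Hypothetical self-service facade over the fabric's shared services."""

    def __init__(self):
        self._catalogue = {}

    def register_data_product(self, domain: str, name: str, output_port: str) -> str:
        """A domain squad publishes a data product without central intervention."""
        product_id = f"{domain}.{name}"
        self._catalogue[product_id] = {"output_port": output_port, "status": "published"}
        return product_id

    def request_access(self, product_id: str, consumer_domain: str) -> bool:
        """Access is decided against catalogued policy rather than by a central team."""
        entry = self._catalogue.get(product_id)
        return entry is not None and entry["status"] == "published"

platform = FabricPlatform()
pid = platform.register_data_product("payments", "settled-transactions", "kafka://settled-tx")
print(platform.request_access(pid, "fraud-analytics"))  # True: granted in self-service style
```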

Another key architectural consideration is setting up and monitoring architecture guard rails; here we borrow from our Ecosystem Platform Thinking article on how we measure the health of our data fabric. The data fabric is expected to continue to evolve as we move towards a set of augmented data management services, so we will use these feedback loops to learn and adjust the data fabric based on what is working well and what is not.
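Those guard rails can be expressed as measurable checks over the fabric's own metadata. The sketch below shows one way to do that; the metric names and thresholds are assumptions, not the article's actual guard rails.

```python
def fabric_health(catalogued_datasets: int, total_datasets: int,
                  flows_with_lineage: int, total_flows: int,
                  average_quality: float) -> dict:
    """Turn the fabric's own metadata into simple pass/fail guard-rail checks."""
    checks = {
        "catalogue_coverage": catalogued_datasets / total_datasets >= 0.90,
        "lineage_coverage": flows_with_lineage / total_flows >= 0.80,
        "quality_floor": average_quality >= 0.95,
    }
    checks["healthy"] = all(checks.values())
    return checks

# Example feedback loop: publish these numbers regularly and adjust the fabric accordingly.
print(fabric_health(catalogued_datasets=940, total_datasets=1000,
                    flows_with_lineage=410, total_flows=500,
                    average_quality=0.96))
```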


Fig 3 – Managing health of a Data Fabric


Data fabric services

Data fabric services are a core architecture building block in a data strategy, complementing the enterprise data platforms and the Analytics and AI services. Their purpose is to provide agile data management for fast access to, and sharing of, the vast data assets of different shapes and sizes across the extended organisation. They build on and align with key architectural trends such as service architectures, Data Mesh, Domain-Driven Design, and streaming integration.

Metadata capabilities are a key architectural consideration for becoming data-driven, spanning the governing, managing, using, accessing, and optimising of data and of the systems and processes that employ it. The data fabric will be metadata-driven, and this demands that a broad range of metadata is collected, managed, and used, going well beyond what is traditionally needed for administration.
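To illustrate how far this goes beyond traditional administration, a single catalogue entry might carry technical, business, and operational metadata side by side. The groupings and fields below are illustrative assumptions.

```python
catalogue_entry = {
    "technical": {    # traditionally captured for administration
        "schema": {"customer_id": "string", "email": "string"},
        "location": "s3://data-lake/customer_profile/",
        "format": "parquet",
    },
    "business": {     # needed to make the data meaningful to consumers
        "glossary_term": "Customer Profile",
        "owner_domain": "customer",
        "classification": "restricted",
    },
    "operational": {  # needed to trust and govern the data at run time
        "last_refreshed": "2022-04-01T06:00:00Z",
        "quality_score": 0.97,
        "consuming_domains": ["marketing", "fraud-analytics"],
    },
}

# A metadata-driven fabric makes decisions from this record, e.g. simple access control:
can_share_externally = catalogue_entry["business"]["classification"] == "public"
print(can_share_externally)  # False for this restricted dataset
```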


Fig 4 – Data Fabric Services

The data fabric's next-generation enterprise information management and integration services include:

  • The Data Marketplace service provides rapid access to authorised data and is also a social space for data collaboration.
  • Metadata Services unify and manage metadata for exchange and use.
  • Data & Information Modelling Services are a set of business-as-usual capabilities, supporting tooling and repositories, plus the standards that underpin the ongoing architecture of data.
  • Data Handling Services solve the problem of authorising and constraining data in accordance with policies. This includes privacy and customer data handling, but also extends to all the other reasons why data's location, use, visibility, and the like may need to be constrained.
  • Master Data Management (MDM) provides a trusted, authoritative source for important data used widely across the organisation.
  • Data Quality Services profile and classify data holdings and identify their movement, flow, and lineage throughout the organisation's infrastructure (including cloud and SaaS).
  • The Interoperability Service unifies the distribution and integration of data between domains, in part by orchestrating the other EIM services. It is a highly governed and secure architecture that helps distribute data using a variety of patterns.
  • Data Retention & Archival Management Services support the retention and disposal management of data.
  • API integration services connect services and distribute smaller amounts of data for real-time and low-latency use cases. They facilitate create, update, and delete operations for microservices.
  • Batch data integration processing handles high-volume, slower-moving data and complex processing such as aggregation, multi-domain integration, and grouping.
  • Streaming integration services provide real-time, high-volume event and message streaming for asynchronous communication. This is the fastest-growing integration service, driven by the growth in machine-based data sources (a streaming sketch follows this list).
  • Data virtualisation integration services provide a logical data layer for light integration of data across disparate systems, manage the unified data for centralised security and governance, and deliver it to business users in real time.
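As a concrete illustration of the streaming integration service above, the sketch below publishes and consumes a single asynchronous event. It assumes a Kafka broker reachable at localhost:9092 and the open-source kafka-python client; the topic name and payload are made up for the example.

```python
# pip install kafka-python; assumes a Kafka broker is reachable at localhost:9092
import json
from kafka import KafkaProducer, KafkaConsumer

# The producing domain publishes an event without knowing who will consume it.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
producer.send("customer-profile-changed", {"customer_id": "42", "changed_field": "email"})
producer.flush()

# Any number of consuming domains subscribe independently and asynchronously.
consumer = KafkaConsumer(
    "customer-profile-changed",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
    consumer_timeout_ms=5000,   # stop polling after 5 s of silence, for this example only
)
for event in consumer:
    print("received:", event.value)
```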


Do you want to know more?

This is the fourth article in our Data Network blog series. It will continue to evolve as we learn more about architecting a Data Network, and we welcome feedback. Check out these brilliant resources if you want to know more:

References

  1. Architecting a Data Network to Connect our World – Jason B Perkins
  2. Data Management at Scale – Piethein Strengholt
  3. Enterprise Information Management architecture – Paul Gallop
  4. The Art of Enterprise Information Architecture – Mario Godinez
  5. Deviate: The Creative Power of Transforming Your Perception – Beau Lotto

Swapnonil Mukherjee

Solutions Architect specialising on Digital Sovereignty, Data & Analytics, and Public Cloud Adoption.


It's great to see that you have put "Data Marketplace Services" at the top. Everything else in the diagram is just a supporting act geared towards establishing this Market Place. Just to add: my strong belief is that products hosted through internal/private Data Market Places will act as a fundamental unit of interaction and transaction between systems in a modern enterprise. I see a near-future state where the economics and demands of the marketplace determine which Data APIs or Data Extracts get built or deprecated, and where customers (other business divisions, systems, product teams) finance the building and maintenance of such products. The other part is the organisational transformation, in terms of teaming, roles, and the way we carry out software delivery. I am convinced that traditional methods of software delivery cannot support a dynamic Data Market Place.

Marc P.

Senior Manager @ Lloyds Banking Group


Thanks for sharing! A great read.

Sarit Bose

Data & AI enthusiast | Driving innovation & growth


Loved reading it Jason. Many thanks
