Data Fabric - Let’s Wrap Our Siloed And Legacy Technology In A Blanket To Make Access Easier
Do you have multiple sources of data (we all do)? Do you have a distributed landscape of technology, all holding bits of data, some of it duplicated, with few or none of the systems linked to an AI-enabled data management solution? Are you struggling to integrate your data? Are you considering copying your data into a central location (a lake or warehouse) to make analytics and insights easier to create?
Think again!
Think Data Modernisation!
Data modernisation is more about interoperability (a word I don’t like, as it’s not business friendly).
Suggestions for a more business-friendly word are welcome.
Let’s use it in this case, as it does work for data fabric.
Interoperability is:
The interaction between two or more separate systems, products or groups. Think of your Fitbit: health portals can be configured to allow patients to upload data from their Fitbit seamlessly, regardless of brand.
Or
The ability of apps, equipment, products, and systems from different companies to seamlessly communicate and process data in a way that does not require any involvement from end-users. For example, our banking information and credit card transactions are available across the financial industry to provide insights on our spending and borrowing habits.
Data fabric is designed to help organizations solve complex data problems and use cases by managing their data—regardless of the various kinds of applications, platforms, and locations where the data is stored. Data fabric enables frictionless access and data sharing in a distributed data environment.
Taking it one step further gives us a Smart Data Fabric.
A smart data fabric incorporates AI (Artificial Intelligence) and ML (Machine Learning) to enable self-monitoring and self-optimising, which helps deal with the issues inherent to complex, multi-factor situations. At the same time, a smart data fabric also uses automation wherever possible to reduce manual effort and to accelerate and enhance data management processes.
It’s always good to have that future vision in mind, though let’s not get carried away just yet.
What Is Data Fabric
Well, let’s start by saying it’s not Data Mesh!
Data Fabric and Data Mesh are very different concepts. That’s not to say they can’t complement each other or work together. THEY ARE DIFFERENT AND THE TERMS SHOULD NOT BE USED INTERCHANGEABLY!
I always think of data fabric as wrapping a blanket around your existing technology, not to keep it warm, but to enable it to work seamlessly and give access to the data regardless of where it is.
Data fabrics continuously identify and connect diverse forms of data from disparate sources to reveal unique relationship points between available data that's relevant to businesses.
A good way to think about data fabric would be to think about the human brain - it stores information in different parts and is also able to connect the right dots between different kinds of information stored in different places.
I like this image. It clearly shows that data fabric pulls data from all your sources through the fabric, which does a number of things (the horizontal lines), making it available for users to consume for their various needs.
Personally I would add Data Quality and Data Governance to the horizontal lines. You need data management which includes data quality on more than your master and reference data. Think of all that transactional data that needs checking for quality to avoid mistakes and improve reporting and insight accuracy.
Then all you need is Data Literacy to support the maturity of your Data Culture, and you have everything you need to utilise your data for increased value, optimisation & efficiency, and risk reduction.
Why Is Data Fabric Important Now?
Data Fabric has been around for a number of years; I wrote about it back in 2017 (link to original blog below). Although it was potentially unrealistic back then, the technology needed to make fabric successful is much more mature now, making it a realistic option for the many, not just the few. I would add the warning, though, that it’s not easy to implement successfully. Nothing involving data ever is!
We are in a world with many, many systems: legacy; distributed technology from acquisitions and mergers; cloud and on-prem; and general growth, with every department wanting a different system. I’ve worked in organisations where, for whatever you wanted to do, there was always a choice of four different systems, because no control had been exercised from above on buying decisions or direction for tool usage. Think of all those duplicated license fees and support contracts.
Although Data Fabric won’t help with your multiple license fees, it will enable you to connect to all your different and diverse systems to access the data with privacy and security managed and governed appropriately.
Benefits of a Data Fabric
The overarching nature of a data fabric makes it ideal for incorporating, accommodating, and enforcing policy and governance on an organisation’s data. This means that data professionals can incorporate compliance and governance requirements into data fabric policy. Thus, such things are baked into the data management and handling environment, so they can be neither ignored nor abandoned. Data Management and Governance become part of the DNA of the organisation, operating without needing to be considered an extra activity. The panacea for data leaders!
This provides an excellent way for organisations to mitigate substantial legal, financial, and reputation risks they might otherwise incur.
Beyond governance, another benefit is semantic enrichment. This is primarily about analyzing data to add or update meaning (metadata) to the content in a data fabric.
Interestingly, the data fabric can use what it already knows about existing data under its purview and control to enrich incoming data during the onboarding process. Likewise, the fabric can use what it learns from incoming data during that process to enrich existing data as well.
Semantic enrichment can even use AI and ML to better identify, tag, and label its data (both incoming and existing) as new insights and patterns emerge from its ongoing and never-ending analysis of its own data holdings.
For example, this identifying and tagging would help data be discovered in a data catalog, which is a key component in Data Mesh.
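To make the onboarding idea a little more concrete, here is a minimal sketch of semantic enrichment. It is not how any particular data fabric product works: simple regular-expression rules stand in for the AI and ML a real fabric might use, and the names TAG_RULES, infer_tags, enrich and the sample catalog are all hypothetical. It shows incoming columns being tagged from their values and then picking up tags the catalog already holds for same-named columns. The reverse direction (incoming data enriching existing entries) is left out for brevity.

```python
import re

# Hypothetical tagging rules: tag name -> pattern matched against sample values.
# A real fabric might use ML classifiers here; regexes are only a stand-in.
TAG_RULES = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}


def infer_tags(values):
    """Return the sorted tags whose pattern matches every non-empty sample value."""
    samples = [v for v in values if v]
    if not samples:
        return []
    return sorted(
        tag for tag, pattern in TAG_RULES.items()
        if all(pattern.match(v) for v in samples)
    )


def enrich(catalog, dataset, columns):
    """Onboard a dataset: tag its columns, then record them in the catalog."""
    entry = {}
    for name, values in columns.items():
        tags = set(infer_tags(values))
        # Reuse what the catalog already knows: copy tags from same-named columns.
        for known in catalog.values():
            tags.update(known.get(name, []))
        entry[name] = sorted(tags)
    catalog[dataset] = entry
    return entry


# Existing knowledge: a steward has already marked this column as personal data.
catalog = {"crm.customers": {"contact": ["email", "pii"]}}

new_entry = enrich(catalog, "web.signups", {
    "contact": ["ann@example.com", "bob@example.org"],
    "signed_up": ["2024-01-05", "2024-02-11"],
})
print(new_entry)
```

The new contact column inherits the pii tag from the existing catalog entry without anyone tagging it by hand, which is the kind of discoverability a data catalog relies on.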
Self-Service Analytics
It’s hard to overstate the value of self-service from the data fabric for any organization. Here, self-service means that users with the right access to the fabric (especially its catalog, analytics tools, and data holdings) can find data, create and run their own analyses, and request canned analyses for the data they select.
No filing job requests with IT and waiting for your turn to come before work gets going! It’s a dream come true that opens the door to experimentation and innovation. The art of the possible!
A data fabric helps organizations solve complex data problems by eliminating inefficient and manual data integration processes, and it provides business-ready data for analytics. It enables users to access and share data seamlessly, regardless of where it is stored.
"A data fabric utilizes continuous analytics over existing, discoverable and inferenced metadata assets to support the design, deployment and utilization of integrated and reusable data across all environments, including hybrid and multi-cloud platforms." Gartner
Data Fabric Vs. Data Virtualization
Data virtualization is one of the technologies that enables a data fabric approach. Rather than physically moving the data from various on-premises and cloud sources using the standard ETL (Extract, Transform, Load) processes, a data virtualization tool connects to the different sources, integrating only the metadata required and creating a virtual data layer. This allows users to leverage the source data in real-time.
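As a rough illustration of the virtual data layer described above, the sketch below uses two in-memory SQLite databases to stand in for an on-premises source and a cloud source. The VirtualLayer class, its method names and the sample tables are all hypothetical, and a real data virtualization tool does far more (query pushdown, caching, security). The point is only that the layer stores pointers to the sources, not copies of the rows.

```python
import sqlite3


class VirtualLayer:
    """Holds only metadata (virtual name -> connection and table), never row copies."""

    def __init__(self):
        self._sources = {}

    def register(self, virtual_name, connection, table):
        # Only the pointer to the source is stored, not its data.
        # Table names are trusted here because they come from our own setup code.
        self._sources[virtual_name] = (connection, table)

    def rows(self, virtual_name):
        """Fetch rows from the underlying source at request time."""
        connection, table = self._sources[virtual_name]
        cursor = connection.execute(f"SELECT * FROM {table}")
        columns = [c[0] for c in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]


# Two independent databases stand in for an on-premises and a cloud source.
on_prem = sqlite3.connect(":memory:")
on_prem.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
on_prem.execute("INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob')")

cloud = sqlite3.connect(":memory:")
cloud.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
cloud.execute("INSERT INTO orders VALUES (1, 20.0), (1, 5.5), (2, 12.0)")

layer = VirtualLayer()
layer.register("customers", on_prem, "customers")
layer.register("orders", cloud, "orders")


def spend_by_customer(layer):
    """Combine the two sources through the virtual layer, reading both live."""
    names = {c["id"]: c["name"] for c in layer.rows("customers")}
    totals = {}
    for order in layer.rows("orders"):
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0.0) + order["total"]
    return totals


print(spend_by_customer(layer))

# A new order in the source shows up on the next read, because nothing was copied.
cloud.execute("INSERT INTO orders VALUES (2, 8.0)")
print(spend_by_customer(layer))
```

Contrast this with an ETL approach, where the second result would stay stale until the next load job ran.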
The main difference between data virtualization and data fabric lies in their use cases. Data virtualization is used for reports, business analytics, and visualization. Data fabric is used to analyse huge amounts of data, including IoT analytics, data science, real-time analytics, global analytics, and fraud detection.
In short, a data fabric is a powerful abstraction that covers the whole lifecycle for data — from intake, to enrichment, to application delivery, to storage and archiving, to retirement or deletion — within a single, consistent, policy-driven platform. Aren’t you just dying to have one of your very own?
Highlights Of Data Fabric
Here are a few quick points you must know about this emerging design concept:
Examples Of Data Fabric Use Cases
Creating a holistic customer view. Data fabric can enable organizations to bring together data from all interaction points with the customer. It develops better knowledge of the customer, thus propelling seamless real-time personalization and customization initiatives.
For regulatory compliance, the data fabric’s robust data governance capabilities let you track where data comes from, how it was aggregated, who viewed it, when, and so on. With AI-enabled enforcement of data governance policy, data fabric supports automated classification of data assets, and sensitive data detection and masking.
For enhancing enterprise intelligence, a data fabric architecture streamlines the consolidation of information from internal and external sources, giving organisations a bird’s-eye view of their business with the possibility of drill-down and drill-through. This can improve self-service dashboard usage. For example, given an overview of enterprise-wide sales during the last quarter, a sales manager can spot a sudden drop in sales last month and, in several clicks, identify that the reason lies in shipment delays due to a new carrier performing poorly. This way, without turning to IT teams, business users can analyze corporate performance, identify the departments, teams or employees with the highest and lowest KPIs, run risk analysis, work out detailed budget plans, and more.
Data fabrics are still at a relatively early stage of adoption, but their data integration capabilities aid businesses in data discovery, allowing them to take on a variety of use cases. While the use cases that a data fabric can handle may not be extremely different from those of other data products, it differentiates itself by the scope and scale that it can handle as it eliminates data silos.
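The regulatory compliance use case mentions sensitive data detection and masking, so here is a small sketch of what that could look like in its simplest form. This is an assumption-heavy illustration: the two regular expressions in PATTERNS are hypothetical and deliberately naive, where a real fabric would rely on much more robust (often AI-enabled) classifiers, and the function names detect and mask are my own.

```python
import re

# Hypothetical detection patterns; real classifiers would be far more robust.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "card_number": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def detect(text):
    """Return the sorted names of the sensitive data types found in the text."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(text))


def mask(text):
    """Replace every detected sensitive value with a labelled placeholder."""
    for name, pattern in PATTERNS.items():
        text = pattern.sub(f"<{name}>", text)
    return text


sample = "Contact ann@example.com about card 4111 1111 1111 1111 today"
print(detect(sample))
print(mask(sample))
```

In a fabric, the output of something like detect would feed the automated classification of data assets, while mask would be applied at the point of access according to policy.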
Key Components Of A Data Fabric
A data fabric abstracts away the technological complexities normally encountered during data movement, transformation and integration, making all data available across the enterprise.
A fabric is made up of components that can be selected and collected in various combinations. Therefore, the implementation of the data fabric may differ significantly depending on your needs. Let’s take a look at the main components of a data fabric. By leveraging data services and APIs, data fabrics pull together data from legacy systems, data lakes, data warehouses, SQL databases, and apps, providing a holistic view.
Data fabric architectures operate around the idea of loosely coupling data in platforms with applications that need it. One example of data fabric architecture in a multi-cloud environment may look like the following, where one cloud, like AWS, manages data ingestion and another platform, such as Azure, oversees data transformation and consumption. Then, you might have a third vendor providing analytical automation such as Alteryx. The data fabric architecture stitches these environments together to create a unified view of data.
That said, this is just one example. There isn’t one single data architecture for a data fabric, as different businesses have different needs. The sheer number of cloud providers and data infrastructure implementations ensures variation across businesses. However, businesses utilizing this type of data framework exhibit commonalities across their architectures, which are unique to a data fabric.
More specifically, they have six fundamental components, which Forrester describes in the “Enterprise Data Fabric Enables DataOps” report. These six layers include the following:
Data Management: This layer is responsible for data governance and security of data.
Data Ingestion: This layer begins to stitch cloud data together, finding connections between structured and unstructured data.
Data Processing: This layer refines the data to ensure that only relevant data is surfaced for data extraction.
Data Orchestration: This critical layer conducts some of the most important jobs for the data fabric: transforming, integrating, and cleansing the data, making it usable for teams across the business.
Data Discovery: This layer surfaces new opportunities to integrate disparate data sources. For example, it might find ways to connect data in a supply chain data mart and customer relationship management data system, enabling new opportunities for product offers to clients or ways to improve customer satisfaction.
Data Access: This layer allows for the consumption of data, ensuring the right permissions for certain teams to comply with governance & regulations. Additionally, this layer helps surface relevant data through the use of dashboards and other data visualization tools.
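To picture how the access layer might ensure the right permissions for certain teams, here is a minimal sketch. It is purely illustrative and not taken from the Forrester report or any product: the AccessPolicy class, its allow and read methods, and the sample sales data are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical policy store: permitted columns per (team, dataset) pair."""

    grants: dict = field(default_factory=dict)

    def allow(self, team, dataset, columns):
        """Grant a team read access to the named columns of a dataset."""
        self.grants[(team, dataset)] = set(columns)

    def read(self, team, dataset, rows):
        """Return rows cut down to permitted columns; refuse teams with no grant."""
        allowed = self.grants.get((team, dataset))
        if allowed is None:
            raise PermissionError(f"{team} may not read {dataset}")
        return [{k: v for k, v in row.items() if k in allowed} for row in rows]


sales = [
    {"region": "North", "total": 1200, "customer_email": "ann@example.com"},
    {"region": "South", "total": 950, "customer_email": "bob@example.org"},
]

policy = AccessPolicy()
policy.allow("marketing", "sales", ["region", "total"])

# Marketing sees regions and totals, but never the customer email column.
print(policy.read("marketing", "sales", sales))

# A team with no grant is refused outright.
try:
    policy.read("facilities", "sales", sales)
except PermissionError as error:
    print(error)
```

Because every read goes through the policy, governance is applied as part of access itself and does not depend on each consumer remembering to apply it.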
A data fabric is a composable, flexible and scalable way to maximize the value of data in an organization. It's not one tool or process, but rather an emerging design concept that gives a framework for thinking about how to stack existing tools, resources, and processes.
Being composable means that there's no fixed architecture specific to data fabrics; a data fabric can be designed as a response to the priority data needs of an organization. Just like the visual imagery that the name evokes, we can imagine data fabric as a fluid piece of shapeless cloth touching all your data sources, types, and access points – or a blanket!
Without a data fabric, can your organisation manage the business demands of real-time connectivity, self-service analytics, automation, and universal transformations, given the massive amounts of data that businesses can access and exploit to derive unique insights?
Although I remain an advocate of Data Mesh, for many reasons, you need to weigh its benefits against the current level of maturity of your organisation and the effort required to develop Data Products. Data Mesh and Data Fabric are not opponents, but they are different and solve different problems.
A Data Mesh can work alongside a Data Fabric, and the Fabric can be implemented either before or after the Mesh.
Risks With Data Fabric
A rising concern for organizations is the threat to data security when data is being transported from one point to another in the data fabric. It is mandatory that the infrastructure for the transport of data embeds security firewalls and protocols to ensure safety from security breaches. With an increasing number of cyber attacks hitting organizations, security of data at all points in the data cycle is paramount.
Conclusion
In this article, we have explored what a data fabric is, its main components, and the benefits of fashioning one for your organization.
A data fabric should be used when an organization requires a centralized platform to access, manage and govern all data. The first step is to design a framework that makes sense for your organization. The next step is implementation, which involves deploying a platform that:
Take it one step at a time, start small, prove value and demonstrate success, before moving to the next. Rome was not built in a day and neither is data success!
References:
Search: HitachiVantara.com
Image credit:
https://health.clevelandclinic.org/brain-teasers-infographic/