How can you address data silos?
Banner was created using canva.com

You know that data silos aren’t desirable for an organisation. But do you know why? And how would you address them? Let’s discuss both in this article.

1. Drawbacks of data silos:

Quality is determined by accuracy and completeness. - Larry Sanger

1.1 Holistic view: Data held in a silo isn’t shared with the rest of the applications in the organisation. You therefore miss the enterprise-wide view of your business, which impedes decision-making.

1.2 Data integrity: When the same data is maintained separately in several silos, the copies drift apart. As a result, you will have issues with the accuracy and consistency of the data at the enterprise level.

Culture is what people do when no one is looking. - Gerard Seijts

1.3 Culture silos: “My business is very demanding. I can’t go at the pace of my organisation. In an agile world, I can meet my business demands quickly by doing things alone. I am comfortable doing it alone as I get a lot of autonomy.” Have you heard remarks like these from your colleagues? The question is: does the data silo create this culture, or does the culture create the data silo? The answer is that each causes the other, and it becomes a vicious circle.

1.4 Resources: You will incur additional upfront capital expenditure and a growing maintenance cost for your data silo.

Bargaining has neither friends nor relations. - Benjamin Franklin

1.5 Bargaining power: You will lose the bargaining power with product vendors that organisations otherwise enjoy through economies of scale.
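To make the data integrity point in 1.2 concrete, here is a minimal sketch in Python. It assumes two hypothetical silos (a CRM and a billing system) that each keep their own copy of customer records. The record layout and the `find_mismatches` helper are invented for illustration only; they are not part of any product.

```python
# Hypothetical records for the same customers, held in two separate silos.
crm_silo = {
    "C001": {"email": "asha@example.com", "city": "Chennai"},
    "C002": {"email": "ben@example.com", "city": "London"},
}
billing_silo = {
    "C001": {"email": "asha@example.com", "city": "Chennai"},
    "C002": {"email": "ben.old@example.com", "city": "Leeds"},
    "C003": {"email": "chen@example.com", "city": "Singapore"},
}


def find_mismatches(left, right):
    """Return (keys missing in left, keys missing in right, conflicting fields per key)."""
    missing_in_left = sorted(set(right) - set(left))
    missing_in_right = sorted(set(left) - set(right))
    conflicts = {}
    for key in sorted(set(left) & set(right)):
        # Compare every field that appears in either copy of the record.
        fields = sorted(
            field
            for field in set(left[key]) | set(right[key])
            if left[key].get(field) != right[key].get(field)
        )
        if fields:
            conflicts[key] = fields
    return missing_in_left, missing_in_right, conflicts


missing_in_crm, missing_in_billing, conflicts = find_mismatches(crm_silo, billing_silo)
print("Missing in CRM:", missing_in_crm)
print("Missing in billing:", missing_in_billing)
print("Conflicting fields:", conflicts)
```

Even in this toy case, one customer exists in only one silo and another has two different emails and cities, so neither silo alone can be trusted as the enterprise-level record.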

2. Approaches to minimise data silos:

You cannot eliminate data silos entirely. Teams don’t set out to create a data silo as the primary outcome of a project; silos emerge as a by-product of a bigger transformational initiative.

Sometimes, you may have made a conscious decision to address data integration as a separate project, but never got funding for it because of other business priorities, and so the data silos became inevitable. The good news is that we can certainly minimise them.

Technically, there are two medium-term to long-term approaches to address data silos: Data Integration and Data Virtualisation. A change in culture plays an important role regardless of which technical approach you take. Let's discuss all of them.

2.1 Data integration:

Data integration is the process of combining data from different sources into a single, unified view. The consolidated data is physically stored in another data store.

2.1.1 ETL: Traditionally, an ETL (Extract, Transform, Load) process has been used to bring data into a single, centralised physical data store called an operational data store (ODS), which is then used as input to the data warehouse. Typically, transformation in the ODS is limited to what is needed to match and integrate data from different sources.

2.1.2 On-prem data lakes: On-prem big data platforms are used to ingest data from all sources into one data lake and integrate it to provide a consolidated enterprise-level view.

2.1.3 Cloud solutions: More recently, organisations have started adopting cloud solutions for data integration, given their advantages.

There is no right or wrong approach, and likewise there is no universal solution for data integration. The choice of approach and tooling broadly depends on the enterprise-wide data strategy, whether the data is needed in real time or in batch, whether the platform is on-prem or in the cloud, and the ongoing maintenance costs.
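The ETL flow described in 2.1.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source systems, their rows and the table names are hypothetical, and an in-memory SQLite database stands in for the ODS. The transformation does just enough (trimming, casing, type conversion) to make the two sources match on a customer id.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems.
crm_rows = [
    ("C001", "Asha Rao", "asha@example.com"),
    ("C002", "Ben Lee", "BEN@EXAMPLE.COM "),
]
billing_rows = [
    {"cust": "c002", "balance": "120.50"},
    {"cust": "c001", "balance": "0"},
]


def transform_crm(row):
    """Normalise the id and email so records can be matched across sources."""
    customer_id, name, email = row
    return customer_id.upper(), name.strip(), email.strip().lower()


def transform_billing(row):
    """Normalise the id and convert the balance from text to a number."""
    return row["cust"].upper(), float(row["balance"])


def load(conn):
    """Load the transformed rows into the stand-in ODS."""
    conn.execute(
        "CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.execute("CREATE TABLE balance (customer_id TEXT PRIMARY KEY, balance REAL)")
    conn.executemany(
        "INSERT INTO customer VALUES (?, ?, ?)", [transform_crm(r) for r in crm_rows]
    )
    conn.executemany(
        "INSERT INTO balance VALUES (?, ?)", [transform_billing(r) for r in billing_rows]
    )
    conn.commit()


ods = sqlite3.connect(":memory:")  # in-memory database standing in for the ODS
load(ods)

# The consolidated data is now physically stored in one place and can be joined.
unified = ods.execute(
    "SELECT c.customer_id, c.name, c.email, b.balance "
    "FROM customer c JOIN balance b USING (customer_id) "
    "ORDER BY c.customer_id"
).fetchall()
print(unified)
```

Note that the data is copied: after the load, the ODS holds its own physical version of the customer and balance records, which is what distinguishes data integration from the virtual approach in 2.2.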

2.2 Data Virtualisation:

Data Virtualisation is the process of combining data virtually from different sources into a single, unified view. The data remains in the source system itself and is not replicated anywhere else. We will discuss data virtualisation in future editions.
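As a rough illustration of the idea, the sketch below builds a unified customer record at query time instead of storing it. It is a toy under stated assumptions: the two dictionaries stand in for hypothetical source systems, and `customer_view` is an invented helper. Real virtualisation products do far more (for example, querying the sources remotely), which this sketch does not attempt to show.

```python
# Two hypothetical source systems; each keeps and owns its own data.
crm_source = {"C001": {"name": "Asha Rao"}, "C002": {"name": "Ben Lee"}}
billing_source = {"C001": {"balance": 0.0}, "C002": {"balance": 120.5}}


def customer_view(customer_id):
    """Build the unified record when it is requested; nothing is stored centrally."""
    record = {"customer_id": customer_id}
    record.update(crm_source.get(customer_id, {}))
    record.update(billing_source.get(customer_id, {}))
    return record


before = customer_view("C002")
billing_source["C002"]["balance"] = 80.0  # the change happens in the source system
after = customer_view("C002")

print(before)
print(after)
```

Because no copy is kept, the second call reflects the updated balance without any reload step; the trade-off is that every query depends on the source systems being available.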

2.3 Culture:

2.3.1 An enterprise-wide data strategy, a road map and governance around data initiatives are essential for changing the culture at the organisation level.

2.3.2 A reporting structure that empowers the Chief Data Officer and the data teams to work jointly with the business helps them understand regulatory constraints, agree on data sharing and secure funding quickly.

2.3.3 The business often finds IT processes too complex to navigate. Staying in regular contact with the business and sharing the IT processes helps bridge the gap and plan accordingly. This minimises the need to create data silos as part of bigger initiatives.

I hope this gives a high-level understanding of the pitfalls of data silos and how you can minimise them. I have tried to minimise jargon and focus on concepts. Thanks for reading. If you find this article useful, please like, share and comment.

Views are personal and in no way reflect those of my current or previous organisations and vendor partners.

Image credit:

  1. Photo by Andrea Piacquadio from Pexels
  2. Photo by Ivan from Pexels
  3. Photo by Andrea Piacquadio from Pexels

References & Additional Reading:

  1. https://www.dhirubhai.net/pulse/six-reasons-organisations-have-data-silos-sujithkumar-chandrasekaran/
  2. https://www.talend.com/resources/what-is-data-integration/
  3. https://www.techtarget.com/searchdatamanagement/definition/data-silo
  4. https://www.techtarget.com/searchoracle/definition/operational-data-store
  5. https://www.javatpoint.com/cloud-computing-data-virtualization
Abhilash Neeraty

Senior Solutions Architect at Wipro Limited

2 years ago

Sujit, a very good summary as always! One of the most important questions we need to ask as data architects is what can be brought under one governance umbrella, and how strong the enterprise-wide data strategy is, if one is defined. To implement this vision we usually scramble over data sources, meetings with the business, and understanding various flavours of databases (NoSQL, MySQL, big data); a good deal of time and money goes into this effort. To me this is like swimming against the tide. Instead, we need to accept data in silos and look at the problem through a different lens. A 'Data Catalog' is the first step in this process; to a large extent, a catalog can drive data strategy at each business unit level in the first place.
