Lifting the Data Curtain and the Necessity of Organization-wide Data Visibility

This post is my personal view of the current Data Landscape, shaped by conversations and mentorship experiences with a diverse array of stakeholders, peers, friends, and leaders in the field. It does not, in any shape or form, point to specific challenges faced by individual companies. Instead, it reflects my perspective on the common challenges that leaders in Data and Analytics are encountering today, and on what can be done to overcome them. The “Data Curtain” in the title is the curtain that separates activities, processes, and data between the data producers (such as products and users) on one side and the data consumers (decision-makers and business functions, including Analytics, Data, and AI teams) on the other.

In the contemporary discourse surrounding data, we often see articles, posts, and blogs proclaiming that “Data is the NEW oil” or “All companies are Data Companies.” While I wholeheartedly agree with these assertions, the perspective I would like to emphasize is Data as a “Company Asset.” Data has become an asset for a company in the same way as finances, staff, and physical resources. More importantly, a significant “Curtain” separates the data producers within an organization from the data consumers. This curtain blurs the line between what the product and engineering teams, who produce the data, perceive and what the BI, Analytics, AI, and Business teams see as they use that data.

This separation creates a gap filled with numerous assumptions and a certain level of imagination that both sides of the data stream, upstream and downstream, rely on as we navigate our operations. I am fully cognizant that engineering, business (including marketing, finance, sales, operations, etc.), and product teams have their own primary focuses and objectives. However, it is crucial to recognize that data is the enabler that can bridge these gaps. As the saying goes, “In God we trust; all others must bring data.” Therefore, as Data Strategists and leaders who aspire to holistically enable companies to thrive, we must take proactive steps to remove this curtain. The good news is that most solutions to these challenges are not magical or out of reach; rather, they stem from fostering collaboration and using existing solutions effectively.

On the Data producers’ side, for instance, engineering and product teams are demonstrating tremendous capability and dedication as they work diligently to create the best products that meet the needs of the organization and its customers. This responsibility means that, in addition to dealing with the intricate technical details, these teams must also recognize their role in the Data Asset conversation. This includes taking ownership of aspects such as Data Quality. It is essential to understand that Data Quality does not merely refer to ensuring data validation—such as checking data types, values within defined ranges, and so forth—but also encompasses the need for these teams to take ownership of some level of data verification.

Data verification implies that all data produced or captured must not only serve the immediate function of the product but also align with and support downstream needs, such as business enablement and organizational goals. This alignment is critical when defining the data, datasets, architectures, and other related components. By ensuring that the data produced is of high quality and relevant to the business objectives, the engineering and product teams can significantly enhance the overall effectiveness of the data ecosystem within the organization.
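To make the distinction between validation and verification concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `CheckoutEvent` fields, the `DOWNSTREAM_NEEDS` mapping, and the team names are hypothetical and not taken from any specific organization or tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class CheckoutEvent:
    """Hypothetical event emitted by a product feature (illustrative fields)."""
    order_id: str
    amount: float
    currency: Optional[str] = None      # assumed to be needed by finance
    campaign_id: Optional[str] = None   # assumed to be needed by marketing


def validate(event: CheckoutEvent) -> List[str]:
    """Validation: check types and ranges, regardless of who consumes the data."""
    problems = []
    if not isinstance(event.order_id, str) or not event.order_id:
        problems.append("order_id must be a non-empty string")
    if not isinstance(event.amount, (int, float)) or event.amount < 0:
        problems.append("amount must be a non-negative number")
    return problems


# Hypothetical downstream requirements, agreed with the consumer teams.
DOWNSTREAM_NEEDS: Dict[str, List[str]] = {
    "finance": ["currency"],
    "marketing": ["campaign_id"],
}


def verify(event: CheckoutEvent,
           needs: Dict[str, List[str]] = DOWNSTREAM_NEEDS) -> Dict[str, List[str]]:
    """Verification: report which consumer teams are missing fields they need."""
    gaps = {}
    for team, fields in needs.items():
        missing = [f for f in fields if getattr(event, f, None) is None]
        if missing:
            gaps[team] = missing
    return gaps


event = CheckoutEvent(order_id="A-1", amount=19.99, currency="USD")
validation_problems = validate(event)   # the event is well-formed
verification_gaps = verify(event)       # but marketing cannot use it yet
```

In this sketch the event passes validation and still fails verification, which is exactly the situation that stays hidden behind the curtain when producers check only types and ranges.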

Moreover, it is vital for data producers to engage in open communication with data consumers to ensure that the data being generated is not only useful but also actionable. This collaboration can help to demystify the data landscape and foster a culture of transparency and trust between the two sides. By breaking down the barriers created by the curtain, organizations can create a more cohesive and integrated approach to data management, ultimately leading to better decision-making and improved business outcomes.

The current Data Landscape presents both challenges and opportunities for organizations. By recognizing the importance of data as a company asset and addressing the separation between data producers and consumers, we can work towards creating a more unified and effective data strategy. It is through collaboration, communication, and a shared commitment to data quality that we can truly harness the power of data to drive innovation and success in our organizations. As we move forward, let us strive to remove the curtain that separates us and embrace a more integrated approach to data that benefits everyone involved.

On the Data consumer side, business teams and the BI/Data enablement functions are expending tremendous effort and making significant investments to translate, rework, and often re-engineer data in order to answer critical questions in a trusted and reliable manner. This work is often undertaken on the assumption that the data is clean and accurate. That assumption can lead to a culture where we do not look behind the curtain and simply accept the data we have or receive without questioning its integrity or relevance. At the same time, the consumers of this data are acutely aware of the business needs that must be addressed, as well as the longer-term strategic steps that would enable the organization to achieve its goals. This wealth of insight into organizational operations remains firmly on the consumer side, often unshared and unexamined.

To summarize, the “curtain” I am alluding to is the gap created when critical contextual information sits partitioned on either side of this divide, compounded by a lack of sharing or by misaligned timing. To set the stage, a high-level data flow, in its simplest form, starts with the large amounts of data that are produced or collected through the consumption and operations of an organization. It broadly consists of these steps: a feature or product is built to meet a specific need; once deployed, it generates data; this data is then transferred downstream and consumed, ideally driving informed actions and decisions. However, this simple flow is missing a crucial piece: beyond meeting product needs and goals, producers often lack awareness of what business needs the data could fulfill, while consumers may be unaware of what data is available at the producer's end.

The main way, in my view, to effectively remove the curtain that separates these two sides is through the concept of “Data Visibility.”

“Data Visibility” very simply means the ability for every authorized and approved stakeholder within an organization to consistently answer the following critical questions at all times:

  1. What data is available, what is its source, and where is it stored?
  2. How is that data defined, and what does it mean in the context of our operations?
  3. Where is the data utilized, and how is it being used to drive decisions?
  4. Who are the experts within the organization that can assist in answering data-related questions?

When EVERY stakeholder in an organization can confidently answer these questions, we have effectively removed the curtain and achieved a state of data visibility. So, how do we go about achieving this level of data visibility?

Addressing the above questions is a substantial topic that hinges on the existing infrastructures and practices within an organization. I am open to engaging in detailed discussions with individuals who are interested in exploring this further offline. For now, I would like to propose two key approaches. Those who are familiar with my work know that I will examine the solution from multiple perspectives: the data itself, the technologies we can leverage, the people and skill sets that need to be utilized, and the processes, strategies, and governance required to make this happen.

On the process side, when a new feature or product is being defined, it is essential to include an “Analytics/Data” section in the product specification. This section should capture not only the measures that the product would utilize but also the measures that business teams could employ to ensure the success of the feature or product. By doing so, we create a framework that encourages collaboration between data producers and consumers from the outset, ensuring that both sides are aligned in their understanding of the data's purpose and potential applications.
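As one possible shape for such an “Analytics/Data” section, here is a small Python sketch. The class names, fields, and example measures are hypothetical and chosen only for illustration; a real specification template would follow whatever format the organization already uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Measure:
    """One measure captured in the spec (hypothetical structure)."""
    name: str
    definition: str
    owner: str  # team accountable for the measure


@dataclass
class AnalyticsDataSection:
    """Hypothetical 'Analytics/Data' section of a product specification."""
    product_measures: List[Measure] = field(default_factory=list)
    business_measures: List[Measure] = field(default_factory=list)

    def is_complete(self) -> bool:
        """True once both producers and consumers have contributed measures."""
        return bool(self.product_measures) and bool(self.business_measures)


# The product team fills in its own measures first.
section = AnalyticsDataSection(
    product_measures=[
        Measure("checkout_latency_ms", "Time from click to confirmation", "engineering"),
    ],
)
ready_before = section.is_complete()

# A business team then adds the measure it needs to judge success.
section.business_measures.append(
    Measure("repeat_purchase_rate", "Share of buyers who order again within 30 days", "marketing")
)
ready_after = section.is_complete()
```

The design choice here is that the section counts as complete only when both lists are populated, so a specification cannot be signed off with the producer view alone.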

Furthermore, it is crucial to foster a culture of continuous communication and collaboration between data producers and consumers. This can be achieved through regular meetings, workshops, and collaborative platforms where both sides can share insights, challenges, and opportunities.

This proactive contribution by consumers, even before the product is fully developed, will empower producers to gain a much deeper understanding of broader needs and expectations. This collaborative approach allows them to work towards the production of data that is not only relevant but also insightful. As consumers engage in this process, they contribute valuable business know-how and become increasingly sensitive to the technical challenges that producers face during the production phase. This synergy between consumers and producers will yield more holistic datasets, which can be leveraged to ensure data quality right at the source. By addressing potential issues early on, organizations can significantly reduce the expensive efforts and rework that often arise after downstream identification of necessary enhancements.

Another effective solution is to deploy advanced technologies that serve as the single source of truth for data visibility, such as data catalogs and data observability tools. These tools, when populated in a disciplined, routine, and structured manner, provide a robust platform that can be utilized to collate and document essential data, sources, metadata, and more. They also facilitate the identification of data stewards—individuals who can represent and advocate for the data—and offer interfaces for search and interrogation, which are crucial for answering the aforementioned questions. Therefore, if organizations are not investing in such transformative technologies, it does not matter how proficient their downstream teams are; the overall data maturity of the organization will inevitably stall. Moreover, we have yet to address the compliance and regulatory nightmares that can emerge downstream, which can further complicate the data landscape.
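To illustrate how a catalog can be organized around the four visibility questions listed earlier, here is a minimal in-memory Python sketch. It is not modeled on any particular catalog product; `CatalogEntry`, `DataCatalog`, and the sample dataset are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Hypothetical catalog record, with one group of fields per question."""
    name: str
    source: str                                         # Q1: where it comes from
    location: str                                       # Q1: where it is stored
    definition: str                                     # Q2: what it means
    used_by: List[str] = field(default_factory=list)    # Q3: where it is used
    stewards: List[str] = field(default_factory=list)   # Q4: who the experts are


class DataCatalog:
    """Toy catalog: register entries, search them, and list open questions."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> List[CatalogEntry]:
        """Case-insensitive match on the entry name or definition."""
        term = term.lower()
        return [e for e in self._entries.values()
                if term in e.name.lower() or term in e.definition.lower()]

    def open_questions(self, name: str) -> List[str]:
        """Return the visibility questions this entry cannot answer yet."""
        entry = self._entries[name]
        checks = {
            "What is available, from where, and where is it stored?":
                entry.source and entry.location,
            "How is it defined?": entry.definition,
            "Where is it used?": entry.used_by,
            "Who are the experts?": entry.stewards,
        }
        return [question for question, answered in checks.items() if not answered]


catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="orders",
    source="checkout service",
    location="warehouse.sales.orders",
    definition="One row per confirmed customer order",
    used_by=["weekly revenue dashboard"],
))  # no steward has been assigned yet

hits = catalog.search("order")
gaps = catalog.open_questions("orders")
```

A check like `open_questions` is one way to keep the catalog populated in a disciplined manner: an entry with no steward or no documented usage is flagged instead of silently accepted.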

While there are numerous steps required to successfully lift the curtain and achieve true data visibility within an organization, it fundamentally necessitates the most challenging aspect of any change initiative—the Cultural Change. The entire organization must be committed to playing their respective roles and owning their part of the data asset responsibility on a daily basis. This principle applies universally, regardless of the organization's size, from nimble startups to sprawling enterprises. The era of data being treated as a stepchild or an afterthought is long gone. Data enablement teams, while they play a crucial role in presenting data to support decision-making, cannot bear the entire responsibility alone. While Data Science may be an exciting field, without proper data visibility, it will ultimately run out of clean data and deliver diminishing returns on impact.

To truly harness the power of data, organizations must embrace a culture that prioritizes data visibility and collaboration. By fostering an environment where every stakeholder understands their role in the data ecosystem, organizations can ensure that data becomes a valuable asset rather than a hindrance. This cultural shift is essential for driving innovation and achieving sustainable success in an increasingly data-driven world. As we move forward, let us commit to removing the curtain that separates us and work towards a more integrated approach to data that benefits all stakeholders involved. By doing so, we can transform data into a bridge that connects insights and actions, enabling organizations to thrive in the modern landscape.

LOUIS HAUSLE

Sales Director - Launching MetaKarta - Data Catalog|Data Governance|Data Lineage

3 months

Great insights, Amit! How do you envision data quality and verification evolving to better serve data consumers in the future?

Tim Frenzel

Investment Management | Data Science | Applying AI capabilities | Professor in ML & Finance

3 years

Amit Shivpuja! You’re (hopefully) knocking on open doors when it comes to data producers and data product managers. I enjoyed reading your suggestions regarding the single source of truth of data and defining an analytics-data section in each product cycle. You also brought up data quality and verification in the context of meeting downstream needs to achieve the "Holy Grail" of data visibility. I’d love to hear your thoughts on this in more detail, particularly for the data consumer.

Alex Salazar

Co-Founder/CEO Arcade.dev, Helping AI Agents Take Real Action

3 years

Great post!

Adhar Walia

AI Product Management Leader | Agentic AI | Generative AI | Artificial Intelligence | Digital Transformation | Building and Scaling B2C and B2B Gen-AI and Agentic AI products

3 years

Great post, Amit Shivpuja. Including Analytics/Data in product specifications is a must for Product Managers.

Pallavi Karanth

Passionate about Semantics, Symbolic AI, Analytics, Competitive Intelligence

3 years

Great read Amit Shivpuja!
