What does it take to be a data-driven organization?
Rahul Pandey
Unlocking Business Potential with AI Solutions | Senior Solutions Architect @ adidas | Certified Expert in Databricks, AWS & GCP | Writer & Speaker | MLflow Ambassador
According to PwC, AI could contribute up to $15.7 trillion to the global economy by 2030. The potential of AI to revolutionize how people live and work is enormous, but realizing it requires effective strategies for managing the complexity that comes with it. This means addressing governance challenges, building a solid data strategy, and handling scaling and model management, all while considering social and environmental impacts. Without strong foundations in these areas, organizations risk setbacks such as delays in delivering business value and increased costs.
Let's delve into strategies that can help companies become genuinely data-driven by leveraging AI technologies to drive scalable and well-governed businesses.
Promote a data-first culture
Imagine you’re trying to turn a boat around. Simply focusing on the engine won’t work. You need to consider all the factors, like wind, current, and direction of travel. Similarly, investing in the right tools or technologies is just the beginning. The key to success is an organization-wide shift in culture, with change management playing a critical role in becoming a truly data-driven company.
Companies with data-driven cultures expect decisions to be anchored in data, and they treat this as normal rather than novel or exceptional. This requires employees to be data-literate and to have access to the data they need to do their jobs effectively. It also requires a commitment to continuous learning and experimentation.
Fostering a data-first culture requires the following steps:
It's important to know how to measure the success of an organization-wide data-driven culture initiative. One way to do this is to analyze the teams' approaches over time to drive business value. The figure below illustrates this concept.
At a minimum, organizations should provide their employees with the tools and support they need to operate at an insight level.
Platform architecture
Data is the core of digital transformation and should be treated as a valuable strategic asset that can help to improve processes, identify seasonal trends, enhance customer experience, detect unexpected spikes in sales, and much more.
Data-driven companies tend to innovate faster and gain a competitive advantage, and data platform architecture plays an important role in making that possible. As data volumes have grown exponentially, new platform architectures have been introduced, as shown below [2].
A lakehouse is a unified data architecture that combines the flexibility and scalability of a data lake with the performance and ACID transactions of a data warehouse. This architecture enables organizations to store, manage, and analyze all data types in a single location, including structured, semi-structured, and unstructured data. It's important to note that there is no one-size-fits-all solution when choosing a data platform architecture. Experts must assess an organization's data landscape to identify the best solution. However, it's crucial that the architecture design selected can effectively store, manage, and accelerate advanced analytics, as this remains a critical aspect of any successful AI strategy.
Data and AI governance framework
Generative AI and Large Language Models (LLMs) are changing how organizations create content, simulate scenarios, and make decisions. However, these advanced technologies raise concerns about data privacy, bias mitigation, and ethical considerations. Having solid data and AI governance is crucial when using these technologies. Gartner predicts that by 2025, 80% of organizations expanding digitally will face obstacles due to outdated data and analytics governance practices. The figure below displays the key areas for tackling data and AI governance.
Data is the key to digital transformation, so data governance is essential for any organization that wants to protect and use data responsibly. Many organizations are considering a unified platform architecture such as a lakehouse to simplify governance [3]. The aim is to move away from segregated environments with separate governance controls and towards unified platforms that make it easier to understand and protect data and AI models. An effective data governance framework should be able to answer questions about:
Organizations must apply the guiding principles of accountability, standardization, compliance, quality, and transparency to govern AI effectively. AI governance frameworks should address all the potential risks and allow an organization’s teams to leverage AI’s full potential to drive innovation and achieve their business goals. An effective AI governance framework should be able to answer questions about:
Creating a robust data and AI governance framework is crucial to achieving success in advanced analytics while mitigating future safety, privacy, and compliance risks.
Treating data as a product
Since data is consumed across organizations by several teams, it is beneficial for organizations to treat datasets as products. This involves having a product owner who understands the customers and users, identifies the problems they are trying to solve, decides how to market the product, and ensures that it is reliable and valuable. The data product owner aims to deliver data that meets the highest standards and has the following characteristics:
In addition to these characteristics, data should also be:
One way to achieve this is to implement a Data Mesh on Lakehouse to remove the need to copy data to multiple analytical systems and integrate multiple analytical workloads [4].
Data mesh is a decentralized architecture approach that produces trusted, reusable data products. The four fundamental principles of data mesh are:
Data mesh helps organizations get the most out of their data and move toward a data economy. Teams can improve and enrich existing data, train machine learning models on more diverse datasets, and combine data from across the organization to find new insights, build better products, and drive innovation.
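To make the idea of a dataset with a named product owner and a reliability guarantee more concrete, here is a minimal Python sketch. It is an illustration only: the `DataProduct` class, its fields, and the `validate` check are hypothetical names chosen for this example and do not come from any data mesh or lakehouse library.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for a dataset that is managed as a product."""
    name: str
    owner: str                  # the accountable data product owner (assumed field)
    schema: dict                # expected column name -> Python type
    consumers: list = field(default_factory=list)

    def validate(self, rows):
        """Return a list of human-readable problems found in rows (empty if none)."""
        problems = []
        for i, row in enumerate(rows):
            for column, expected in self.schema.items():
                if column not in row:
                    problems.append(f"row {i}: missing column '{column}'")
                elif not isinstance(row[column], expected):
                    problems.append(f"row {i}: '{column}' is not {expected.__name__}")
        return problems


# Illustrative product and sample rows; all values are made up.
orders = DataProduct(
    name="orders_daily",
    owner="sales-data-team",
    schema={"order_id": int, "amount": float},
    consumers=["finance", "marketing"],
)

good_rows = [{"order_id": 1, "amount": 19.99}]
bad_rows = [{"order_id": "2"}]  # wrong type for order_id, amount missing

print(orders.validate(good_rows))
print(orders.validate(bad_rows))
```

In practice a check like this would probably live in the platform's data quality tooling rather than in hand-written code. The point of the sketch is only that ownership and expectations are declared alongside the data, so consuming teams know who is accountable and what they can rely on.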
Decouple platform and data ownership
The traditional approach of a single centralized team owning all of an organization's data analytics has several points of failure. For instance, the team could face resource constraints, making it hard to deliver data requests on time. This may lead other teams to build their own infrastructure, which can compromise data governance and collaboration. The role of data platform teams should therefore be more central: instead of being tightly coupled with product teams, they should operate independently and focus on providing central support for data tools, templates, support systems, and governance practices.
Product teams should focus on their core competencies to achieve effective and efficient data product development while letting data platform teams handle the platform requirements and user journeys. Data platform teams can focus on creating an environment that supports data product teams by providing a portfolio of tools and reusable assets to improve efficiency and effectiveness.
Establish a Center of Excellence (CoE) for AI
Organizations adopting AI can significantly benefit from establishing AI Centers of Excellence (CoE). By bringing together AI experts and creating a collaborative environment, AI CoEs can help develop and implement best practices and provide training and support to accelerate the adoption of AI.
An AI CoE can be a central team that provides a clear overview of the organization's AI landscape. It offers recommendations on tools, technologies, required skills, and strategies. However, every investment in AI requires key performance indicators (KPIs) to measure success. In this case, the KPIs could include use case identification, delivery time, and the impact of AI solutions. An AI CoE should focus on key areas and KPIs to ensure the success of the organization's AI initiatives.
Key areas:
KPIs:
It is essential that the AI CoE does not operate in isolation. It should collaborate closely with other departments and teams to create and execute AI solutions that meet the business's requirements. The AI CoE should also remain transparent and keep the organization informed about its activities.
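As a rough illustration of how the KPIs mentioned above (use case identification, delivery time, and impact) could be tracked, here is a minimal Python sketch. The `UseCase` record, its fields, and the sample figures are assumptions made up for this example, not a prescribed KPI model.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class UseCase:
    """Hypothetical record of one AI use case tracked by the CoE."""
    name: str
    delivery_days: Optional[int]   # None while the use case is still in progress
    estimated_impact: float        # e.g. estimated yearly value, in any currency


def summarize(use_cases):
    """Aggregate the three example KPIs over a list of use cases."""
    delivered = [u for u in use_cases if u.delivery_days is not None]
    return {
        "identified": len(use_cases),
        "delivered": len(delivered),
        "avg_delivery_days": mean(u.delivery_days for u in delivered) if delivered else None,
        "total_impact": sum(u.estimated_impact for u in delivered),
    }


# Made-up sample data for illustration only.
portfolio = [
    UseCase("demand forecasting", delivery_days=30, estimated_impact=100_000.0),
    UseCase("support chatbot", delivery_days=60, estimated_impact=50_000.0),
    UseCase("churn prediction", delivery_days=None, estimated_impact=80_000.0),
]

summary = summarize(portfolio)
print(summary)
```

Impact is counted only for delivered use cases here, which is one possible design choice. An organization might equally decide to report the expected impact of the whole pipeline.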
Promote low-cost and sustainable development
Organizations that depend on data increasingly turn to artificial intelligence (AI) to enhance their decision-making and streamline operations. However, creating and implementing AI solutions requires a lot of computational power and energy, which can have a significant environmental impact.
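A back-of-the-envelope estimate can make this energy cost tangible. The Python sketch below multiplies average hardware power draw by runtime and a data center overhead factor (PUE, power usage effectiveness) to get energy, and then by the grid's carbon intensity to get emissions. The function name and all input values are illustrative assumptions, not measurements of any real workload.

```python
def training_footprint(avg_power_kw, hours, pue, grid_kg_co2_per_kwh):
    """Estimate energy (kWh) and emissions (kg CO2e) of a compute job.

    energy    = average power draw * runtime * data center overhead (PUE)
    emissions = energy * carbon intensity of the electricity grid
    """
    energy_kwh = avg_power_kw * hours * pue
    emissions_kg = energy_kwh * grid_kg_co2_per_kwh
    return energy_kwh, emissions_kg


# Illustrative inputs only: a 2 kW server running for 10 hours,
# an assumed PUE of 1.5 and an assumed grid intensity of 0.5 kg CO2e per kWh.
energy, emissions = training_footprint(
    avg_power_kw=2.0, hours=10, pue=1.5, grid_kg_co2_per_kwh=0.5
)
print(f"{energy:.1f} kWh, {emissions:.1f} kg CO2e")
```

Even a simple estimate like this lets teams compare options, for example a smaller model or a region with a cleaner grid, before committing compute budget.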
Here are some ways that data-driven organizations can think more sustainably and be cost-effective:
Here are some fundamental strategies to promote sustainable AI in a large organization:
References
[1] Tekiner, F., & Bak, J. (2023, March 23). New whitepaper explores three pillars of a modern data strategy. Google Cloud Blog. https://cloud.google.com/blog/products/data-analytics/new-whitepaper-explores-three-pillars-of-a-modern-data-strategy/
[2] Armbrust, M., Ghodsi, A., Xin, R., & Zaharia, M. (2021). Lakehouse: A new generation of open platforms that unify data warehousing and advanced analytics. CIDR 2021.
[3] A comprehensive guide to data and AI governance. (n.d.). Databricks. https://www.databricks.com/resources/ebook/data-analytics-and-ai-governance
[4] Best practices for implementing data mesh on the lakehouse. (n.d.). Databricks. https://www.databricks.com/resources/whitepapers/best-practices-implementing-data-mesh-lakehouse
[5] AI and compute. (n.d.). OpenAI. https://openai.com/research/ai-and-compute
[6] Gupta, A. (2021, December 6). The imperative for sustainable AI systems. The Gradient. https://thegradient.pub/sustainable-ai/