Hot off the Presses - Metadata Management, Creating Data Products, Responsible AI Ethics, and AI Data Pipelines

1. An Architectural View of Metadata Management

By Dave Wells

Most organizations today recognize the importance of active and disciplined data management. They view data as an asset and manage it with governance and architectural standards and controls. The problem is that, in contrast, most organizations are passive and casual about metadata management.

Data teams typically look to data catalogs as the answer to metadata needs. Looking from an architectural perspective, it is clear that data catalogs are only part of the solution, and frequently they are also part of the problem. Organizations manage data as an asset, but view metadata simply as a by-product of data management processes. This “data is managed, metadata happens” approach is fraught with risk. As data management complexity continues to increase, metadata management has become an essential discipline.

In this article I present an early draft of Metadata Management Architecture co-developed with my eLearningCurve colleague, Olga Maydanchik. I offer this architectural view as a thinking tool – that is, a means to begin to understand the scope and complexity of metadata management. It doesn’t provide solutions for all of the metadata management challenges. It is a beginning – not an end – and a tool to start finding solutions to metadata challenges such as silos, disparities, self-service difficulties, and poor data catalog adoption.

A Macro View of Metadata Management Architecture

Let’s begin with a big picture view of metadata management architecture. (See figure 1.) At the macro level, metadata management comprises three broad topics:

  • Metadata Subjects and Sources are things that are described by metadata (the subjects) and the things from which metadata is derived or created (the sources). These include the inventory of data that is managed by the organization and the processes by which it is managed.
  • Metadata Lifecycle is the path that metadata follows from inception, through various stages of processing and management activities, to the point of consumption and use.
  • Metadata Management Processes and Products are the tasks and activities performed to manage metadata and the tangible results of those tasks and activities.
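
As a thinking aid, the lifecycle topic above can be modeled as a simple data structure. The following Python sketch is purely illustrative and not part of the architecture itself; the stage names and record fields are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class LifecycleStage(Enum):
    """Illustrative stages along the metadata lifecycle."""
    CAPTURED = auto()    # derived or created from a source
    CURATED = auto()     # processed and managed
    PUBLISHED = auto()   # available for consumption and use

@dataclass
class MetadataRecord:
    subject: str               # the thing described (e.g., a dataset)
    source: str                # where the metadata came from
    stage: LifecycleStage = LifecycleStage.CAPTURED
    attributes: dict = field(default_factory=dict)

    def advance(self) -> None:
        """Move the record to the next lifecycle stage, if any."""
        stages = list(LifecycleStage)
        idx = stages.index(self.stage)
        if idx < len(stages) - 1:
            self.stage = stages[idx + 1]

record = MetadataRecord(subject="sales_orders", source="warehouse ETL job")
record.advance()
print(record.stage.name)  # CURATED
```

Even a toy model like this makes the point that metadata has a managed path from inception to use, rather than simply "happening."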

Figure 1. Macro View of Metadata Management Architecture

>Continue reading here

2. How to Create, Govern, & Manage Data Products: Practices and Products You Need to Know (Special event for data leaders)

Data products promise to deliver high-quality data sets to business users on demand, fostering greater trust in data and higher levels of empowerment and self-service. But many companies struggle to understand not only what data products are, but how to create, govern, and manage them.

This CDO TechVent will dive into the practical implications of running an organization using data products. It will describe how a data product is different from a data asset and then describe how to create data products from data assets using a variety of tools and techniques. It will address organizational, architectural, and process considerations for delivering data products at scale. It will also review commercial products that help data producers create, publish, and monitor data products and data consumers browse, evaluate, and subscribe to data products.

You Will Learn:

  • The key components of a data product
  • How to build data products from data assets
  • How to create a data product mindset in your organization
  • How to select a platform to publish and consume data products
  • How to create data contracts and deliver on promises
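
On the last point, a data contract is, at minimum, a machine-checkable promise about the shape of the data a producer publishes. Here is a minimal Python sketch of a producer-side contract check; the field names and types are hypothetical, and real contract tooling covers far more (freshness, semantics, SLAs).

```python
# Minimal data-contract check: a producer validates each record against
# the schema promised to consumers before publishing.
CONTRACT = {
    "order_id": int,
    "customer_id": int,
    "amount": float,
}

def violates_contract(record: dict) -> list[str]:
    """Return a list of contract violations for one record."""
    problems = []
    for field_name, field_type in CONTRACT.items():
        if field_name not in record:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(record[field_name], field_type):
            problems.append(f"wrong type for {field_name}")
    return problems

good = {"order_id": 1, "customer_id": 42, "amount": 9.99}
bad = {"order_id": "1", "amount": 9.99}
print(violates_contract(good))  # []
print(violates_contract(bad))   # ['wrong type for order_id', 'missing field: customer_id']
```

The design point is that the contract lives with the producer and runs before publication, so consumers can trust what arrives rather than defensively re-validating it.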

>Register Now

3. The Opportunity and Risk of Generative AI, Part III: Responsible AI Ethics

By Daniel O'Brien

Ethics are meant to be good, yet when you talk about them people react like you’re a villain. You can almost hear the echoes of groans from the last time ethics was brought up in a meeting. This does not come as a surprise. Ethics involve ambiguous questions and morally loaded consequences. That’s why people tend to shy away from ethics and focus on easier questions. “What is the expected return on this project?” and “How accurately will this model classify winners and losers?” are easier to answer than “How do we make this AI better at helping people without infringing on personal rights?”

Responsible AI seeks to simplify things. It offers ethical frameworks and tools that operationalize abstract ethical principles to help create great AI solutions. This blog is the third in the series, “The Opportunity and Risk of Generative AI.” The first blog discusses the incredible potential of artificial intelligence, specifically the most recent developments in generative AI. The second blog delves into the legal considerations of AI and discusses how responsible AI can help manage regulatory compliance. I recommend you read both, but especially the second as it covers the legal aspects of responsible AI. This blog will consider the related ethical aspects of artificial intelligence through a responsible AI lens.

To recap, we currently face an arms race similar to the development of nuclear weapons and energy in the 20th century. With remarkable speed and little regulation, companies are developing a new technology with incredible potential for both positive and negative impact around the world. Accepting that we are unlikely to stop this development, we must understand and manage the risks of AI using the best available tools and systems. Responsible AI encompasses both the regulatory and ethical considerations of artificial intelligence and provides the framework to tackle the risks in each domain.

Let’s consider how responsible AI ethics bring big, abstract goals to earth and align them with technical and business goals.

Ethical Principles of Responsible AI

A common set of principles helps align business and engineering decisions. The ethical principles of responsible AI tie these two viewpoints together with social goals. From a business perspective, these principles demonstrate a commitment to customer interests and increase customer trust. From a product development perspective, ethical requirements should be considered an enhancement or value-add to an AI system. From a regulatory viewpoint, these principles align with and strengthen compliance programs. The benefits are substantial and real, going far beyond the marketing strategy of saying “we are a good business because we make ethical products.” Some may say this taints the intent of a noble endeavor, but that’s fine. More importantly, these ethical principles can appeal to all stakeholders including those with ignoble motives.

>Continue reading here

4. Weighing the Risk and Reward of AI: A Non-Technical Guide for Business Leaders

By David Hendrawirawan

During CDM Media’s September 2023 Houston CDO and CIO/CISO Summit, I joined a group of business and IT leaders across various industries to share perspectives and best practices. I also participated in an executive dinner and roundtable to focus on Artificial Intelligence (AI), hosted by Advansappz. There is a general sense that while most organizations have not yet fully embraced or understood AI, we are at an inflection point. Some organizations want to become AI change leaders, some innovators, and others cautious optimists, but everyone expects that they will have to do something eventually to move up the adoption curve.

Thanks to the diversity of experts in the audience, we heard many points of view that we hadn’t considered. Since most attendees had IT or analytics backgrounds, we were supposedly better informed than our business peers. Yet it was concerning to observe that even among this group, we still grapple with the same fundamental questions, issues, and challenges concerning AI. Everyone is looking for a common, practical framework that helps non-technical business leaders make fully informed decisions about the risks and rewards of AI.

AI Risk and Reward

An organization should not be quick to adopt AI, especially Gen AI, for many reasons. This new type of AI, perhaps more than its predecessors, can inadvertently violate privacy, amplify bias, and lead to incorrect conclusions or information. Given its novelty, not all risks are yet known and measurable, and there is no clear regulation to govern it. Two risk characteristics of Gen AI are particularly problematic: the difficulty of explaining models and the problem of hallucination (where an AI output seems believable to humans even when it is false). Both raise the question of AI trustworthiness.

Others believe that they should fully embrace Gen AI because the potential benefits will be significant and disrupt traditional business models. Gen AI can improve internal process efficiencies by an order of magnitude. It will unlock new insights to increase market reach and improve product quality exponentially. And even though there are still significant risks and compliance issues to consider, the ease of access and public fascination with Gen AI creates a fear of missing out (FOMO) effect.

Whether, when, and how an organization should allow or adopt AI are not one-size-fits-all questions. The answer depends highly on industry, organizational maturity, risk appetite, and applicable regulations. Regardless of the company's current position on AI, everyone agrees that, eventually, every company will come to adopt AI and that the most prudent thing to do is to prepare by investing in data governance, security, and privacy capabilities.

Senior Management and Board Questions

The consensus among business leaders is that AI, especially Gen AI, is beneficial to nearly every business function in every industry, although the use cases and degree of benefit will vary. Despite the immense AI risks with privacy, bias, and security, the labor savings and economic benefits of large language models are real and readily demonstrable. As a result, there is a growing sense that the commoditization of Gen AI is inevitable. What top-of-mind questions are executive teams and their boards asking, or should they be asking, as they try to determine the strategic impact of AI for their organizations? Here are the common themes that we gathered from the roundtable group.

1. Can AI bring transformational benefits to my organization? How will it change my industry and competitive landscape?

Because AI needs a significant amount of training data as fuel, it will benefit the companies that have the scale and capability to manage big data. These include FAANG (Facebook / Meta, Amazon, Apple, Netflix, and Google / Alphabet) and other technology leaders. Even for large enterprises outside of technology, this is a tall order. For small and medium companies, it seems even less likely that they can reap enormous benefits from AI. Furthermore, AI may push the competitive landscape toward a monopolistic or oligopolistic structure. In every industry segment, it will favor the few large firms that can gain dominance by consolidating data and analytical capabilities.


AI benefits will favor the few larger firms that can gain dominance in consolidating data and analytical capabilities

Hence, the more relevant question for the board and executives is this: In a future where AI will become pervasive, what business model and operating framework will enable us to capitalize on it? What can companies do to stay competitive? One area worth considering is "Privacy-Preserving Data Sharing and Analytics" (PPDSA), a national strategy proposed by the White House Office of Science and Technology Policy in March 2023.

PPDSA includes techniques such as differential privacy, homomorphic encryption, synthetic data, secure multiparty computation, and federated learning. These techniques allow companies to explore, use, and share data securely and privately, without giving it away in a raw, readable, and reusable form. Organizations can create partnerships and data marketplaces to enrich their training data, enabling them to produce AI models with far greater stability, accuracy, and confidentiality. PPDSA allows all organizations to increase the speed and scale at which they discover and access training data for AI. It has the potential to help small and medium enterprises compete against the big data incumbents.
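
To make one of these techniques concrete, here is a minimal sketch of differential privacy using the Laplace mechanism: a counting query is released with calibrated noise so that the contribution of any single record is masked. The dataset, the query, and the epsilon value are illustrative assumptions only.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise via inverse transform sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_count(values, predicate, epsilon: float = 1.0) -> float:
    """Differentially private count: the true count plus Laplace(1/epsilon)
    noise. A counting query has sensitivity 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(7)  # deterministic demo
ages = [23, 35, 41, 29, 52, 60, 33]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=0.5)
print(noisy)  # true count is 3; the released value is perturbed by noise
```

A smaller epsilon means more noise and stronger privacy; the trade-off between accuracy and privacy is exactly what these PPDSA techniques let organizations tune.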

>Continue reading here

5. Analyst Series: Should AI Bots Build Your Data Pipelines?

By Daniel O'Brien

Summary

  • Kevin Petrie, the Vice President of Research at Eckerson Group, and Dan O’Brien, research analyst, discussed large language models (LLMs), which are neural networks that analyze text to predict the next word or phrase. These models use training data, often from the internet, to understand word relationships and provide accurate answers to natural language questions.
  • Dan and Kevin discussed the use of LLMs as assistants in various data engineering tasks. They found that LLMs were most useful in tasks such as writing documentation, building sequences of tasks, and assembling starter code for data pipelines, but emphasized the need for careful inspection of their outputs.
  • Kevin discussed the costs and benefits of large language models. He mentioned that while the productivity benefits were significant, the costs included risks such as lack of explainability, privacy concerns, data quality issues, handling of intellectual property, and potential bias.
  • Dan and Kevin discussed the concept of small language models (SLMs) compared to large language models. They concluded that small language models can be fine-tuned on domain-specific data, enriched with context and information, and augmented with outputs from other models to achieve accurate and efficient results for specific tasks.
  • Dan thanked Kevin for his insights and recommended that readers check out Kevin's blog series on their website. Kevin suggested following him on LinkedIn for daily updates and expressed interest in hearing the stories of software startup founders in the space.

>Listen to the podcast episode here


About Eckerson Group

Eckerson Group is a global research and consulting firm that focuses solely on data analytics. Our experts have substantial experience in data analytics and specialize in data strategy, data architecture, data management, data governance, data science, and data analytics.

Our clients say we are hard-working, insightful, and humble. These qualities stem from our love of data and our desire to help organizations optimize their data investments. We see ourselves as a family of continuous learners, interpreting the world of data and analytics for you.

Get more value from your data. Put an expert on your side. Learn what Eckerson Group can do for you!
