Enterprise Data World 2024 Takeaways: Trending Topics in Data Management. Part 3
Dr. Irina Steenbeek
Data Management Practitioner & Coach | Data Management and Governance Frameworks | DM Maturity Assessment | Data Lineage | Metadata | Keynote Speaker | Author: The O.R.A.N.G.E. Data Management Framework & 4 books
I was privileged to deliver a workshop at Enterprise Data World 2024.
Publishing this review is a way to express my gratitude to the fantastic team at DATAVERSITY and Tony Shaw personally for organizing this prestigious live event.
Part 2 of this article discussed the trending topics in data architecture and modeling.
Part 3 will discuss the key trends in applying artificial intelligence to data management practices and developments in other areas of data management, such as data quality, master and metadata management, and data visualization.
Artificial Intelligence in Data Management
Artificial intelligence (AI), generative AI, and machine learning have become buzzwords in the data management community. First, let's look at their definitions. Many approaches exist to define these disciplines and their relationships. In this article, I will use the definitions from reliable sources. However, they may not be the “only correct ones.”
Gartner stipulates, “Artificial intelligence (AI) applies advanced analysis and logic-based techniques, including machine learning, to interpret events, support and automate decisions, and take actions.”
“Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio, and synthetic data,” as TechTarget defines.
IBM defines machine learning (ML) as “a branch of artificial intelligence (AI) and computer science that focuses on the using data and algorithms to enable AI to imitate the way that humans learn, gradually improving its accuracy.”
These three technologies are interrelated. First, it is essential to realize that GenAI and ML are types of artificial intelligence.
AI includes all forms of computational techniques that mimic human abilities.
Machine learning is a technique through which systems learn and make data-based decisions. ML is based on statistical methods and is used in other AI types, including GenAI.
GenAI is a specialized application of ML in which the algorithms do not just make decisions or classifications but produce new data instances that mimic the characteristics of the training data they have learned from. These models capture the essential patterns of the data and utilize this knowledge to generate synthetic data samples similar to the original dataset.
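To make the idea of "learn the patterns, then generate similar data" tangible, here is a deliberately tiny sketch. It is not a GenAI model: it assumes the training data is roughly normally distributed, "learns" only two parameters (mean and standard deviation), and samples new values from them. The function names `fit_gaussian` and `generate_synthetic` and the sample values are my own illustrative assumptions.

```python
import random
import statistics


def fit_gaussian(samples):
    """'Learn' the pattern of the data: here, only its mean and standard deviation."""
    return statistics.mean(samples), statistics.stdev(samples)


def generate_synthetic(mean, stdev, n, seed=None):
    """Produce n new values drawn from the learned normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical training data, e.g. order amounts.
training = [102.0, 98.5, 101.2, 99.8, 100.4, 97.9, 103.1, 100.0]
mean, stdev = fit_gaussian(training)

# Synthetic samples that resemble, but do not copy, the training data.
synthetic = generate_synthetic(mean, stdev, n=1000, seed=42)
```

Real generative models learn far richer patterns than two summary statistics, but the principle is the same: fit a model to existing data, then sample new instances from it.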
According to Roochir Purani, GenAI's key use cases and deliverables are “written content augmentation and creation, questions answering and discovering, language tone, summarization, simplification, classification of content for specific use cases, chatbot performance improvement, software coding, and synthetic data.”
Somehow, we have created a fetish for GenAI, hoping it will solve our challenges in managing data properly. Let me demonstrate a simple example: I want to visualize the relationship between AI, GenAI, and ML described above.
I checked various graphical options on Google and then asked ChatGPT to visualize this simple relation. Figure 1 compares the results of human and GenAI work.
I am curious: which of the images appeals to you? I think you get my point about fetishizing GenAI capabilities.
So, let us be realistic about the role of AI and its sub-areas in data management. AI is a technology that can be used in different data management IT tools. AI is not a unique feature that can be used independently from other functionalities. AI empowers different data management capabilities and processes.
So, let me share some takeaways:
Data management capabilities can be strengthened by using AI technologies.
Several experts, Stephanie Paradis, Gretchen Burnham, CDMP, Dr. Gurpinder S. Dhillon, Aron Elston, and Yves Meier, shared their expertise in this area. According to them, AI can be used to empower the following DM capabilities:
AI can automate stewardship activities and assist in content analysis and summarization.
AI can assist in documenting and aligning technical and business metadata, identifying dependencies between different metadata objects, and generating metadata.
AI can help create business glossaries, document and link logical and technical data models, build data classifications, and integrate and aggregate data from various sources.
AI can be used in data profiling, translating data quality requirements written in business language into data quality checks in technical languages, recommending DQ rules based on data source systems scans, and cleansing data.
AI and ML can optimize data processing and storage of significant data volumes using distributed computing, data compression, and predictive caching techniques. These technologies can be used for real-time processing to ingest, analyze, and act upon data arrival.
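To illustrate the data quality item in the list above, here is a minimal sketch of profiling a column, recommending candidate DQ rules from the profile, and running a technical check for a business rule such as "every customer must have an email." It uses hand-written heuristics, not AI; an AI-assisted tool would infer far more. All function names, thresholds, and sample values are hypothetical.

```python
def profile_column(values):
    """Compute basic profile statistics for one column."""
    total = len(values)
    non_null = [v for v in values if v is not None and v != ""]
    return {
        "total": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "distinct": len(set(non_null)),
    }


def recommend_rules(profile, null_tolerance=0.05):
    """Suggest candidate DQ rules from a profile (simple heuristics, not AI)."""
    rules = []
    if profile["null_rate"] <= null_tolerance:
        rules.append("not_null")
    if profile["total"] and profile["distinct"] == profile["total"]:
        rules.append("unique")
    return rules


def check_not_null(values):
    """Technical check: return the row indexes that violate the not-null rule."""
    return [i for i, v in enumerate(values) if v is None or v == ""]


# Hypothetical source column scanned from a customer table.
emails = ["a@example.com", "b@example.com", "c@example.com", "d@example.com"]
profile = profile_column(emails)
suggested = recommend_rules(profile)

# Hypothetical new batch checked against the recommended rule.
failing_rows = check_not_null(["e@example.com", "", None])
```

In this sketch, a data steward would still review the suggested rules before they go into production.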
Roochir Purani predicts the following emerging GenAI use cases: “producing synthetic data helping augment scarce and incomplete data,” “enterprise applications proactively suggesting actions based on historical transaction data,” “encapsulating and modernization of legacy systems code,” “assistance role in managing complex projects,” “improving process workflows,” and “negotiating contracts and optimizing bids.”
AI requires governance.
According to Stephanie Paradis and Gretchen Burnham, CDMP, data governance should focus on defining business cases, curating data sets taken for training, controlling synthetic data sets for production, and controlling models’ governance.
Douglas R. Briggs has stressed that good governance for AI should “balance support for innovation with risk and impact,” take into account the concerns of a “broad spectrum of interested parties,” “provide clear and effective guidance for practitioners,” integrate with “existing organizational governance,” and “remain flexible and agile to adapt.”
Challenges in leveraging AI models exist.
Roochir Purani has mentioned the following challenges related to AI adoption: “AI projects are not always aligned with business strategy,” “failing to consider what tangible capabilities AI projects need,” “misunderstanding the probabilistic nature of AI,” and “articulating what business value they want AI to create, but not recalibrating organizational behavior in ways that will deliver value with the humans who interact with AI.”
Adopting AI impacts multiple business capabilities/operations.
These business operations include a business strategy, operating model, enterprise architecture, engineering and operations, and change management. It also impacts people, processes, and technology.
Future of AI in data management
Sonny Rivera believes, “We need to stop micro-optimizing archaic processes and start reimagining them in a GenAI world – across the whole data-to-insight value chain.”
I want to thank the leading experts in this area for their valuable input: Stephanie Paradis, Gretchen Burnham, CDMP, Jenna Alden and Alex Cruikshank from WestMonroe, Douglas R. Briggs, Dr. Gurpinder S. Dhillon, Aron Elston and Yves Meyer, Roochir Purani, Mike King, and Sonny Rivera.
Other Data Management Capabilities
Even when discussing fancy stuff about AI, we must still be down to earth and focus on applying and improving foundational data management (DM) capabilities.
Let me share with you some takeaways from the conference that relate to core DM areas of expertise.
Data quality
Master Data Management
The DAMA Dictionary defines master data as “The data that provides the context for business activity data in the form of common and abstract concepts related to this activity. It includes the details (definitions and identifiers) of internal and external objects involved in business transactions, such as customers, products, employees, vendors, and controlled domains (code values.)”
DAMA-DMBOK2 separates master and reference data. However, in practice, I’ve seen situations where professionals combine these two data types because distinguishing them is challenging.
In my practice, I have also experienced other challenges. Sometimes, the same data (e.g., a contract) can be identified as master or transactional, depending on an organization’s business model. I also struggle to see the difference between master data management and the management of other data. The data management techniques and required capabilities are the same as those applied to any data type. Of course, the outcomes, like data architecture, may differ.
Let’s come back to the key conference takeaways:
Metadata management
My general observation is that the topic of metadata was overlooked at the conference. This may be because people don’t realize the role and importance of metadata. One reason is the complexity of the metadata concept. For example, unlike other data, metadata can be represented by a single element (e.g., a data owner) or a complex construct (e.g., data lineage).
In my workshop, I demonstrated that metadata management has two key goals: enable the data lifecycle and manage the metadata lifecycle. Most data management capabilities, like data governance, enterprise architecture, data quality, etc., produce, exchange, and consume three key types of metadata: business, technical, and operational. Various metadata constructs combine different types of metadata. For example, data lineage combines business and technical metadata, sometimes enriched with operational metadata. The data observability concept combines all three metadata types.
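The difference between a single metadata element and a complex construct can be sketched as a small data structure. This is only an illustration under my own assumptions: the class names, field names, and sample values are hypothetical and do not come from any particular metadata tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class MetadataType(Enum):
    BUSINESS = "business"
    TECHNICAL = "technical"
    OPERATIONAL = "operational"


@dataclass
class MetadataElement:
    """A single metadata element, e.g. a data owner."""
    name: str
    value: str
    kind: MetadataType


@dataclass
class LineageStep:
    """One hop in a data lineage: a complex construct built from single elements."""
    source: str
    target: str
    elements: list = field(default_factory=list)

    def types_used(self):
        """Return the set of metadata types this lineage step combines."""
        return {e.kind for e in self.elements}


# Hypothetical lineage step from a CRM table to a warehouse dimension.
step = LineageStep(
    source="crm.customers",
    target="dwh.dim_customer",
    elements=[
        MetadataElement("data owner", "Sales Operations", MetadataType.BUSINESS),
        MetadataElement("transformation", "TRIM(name)", MetadataType.TECHNICAL),
        MetadataElement("last run", "2024-03-25T02:00:00Z", MetadataType.OPERATIONAL),
    ],
)
```

Here the lineage step carries all three metadata types; a lineage documented only at the business and technical levels would simply omit the operational element.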
I have included knowledge graphs, discussed in several presentations, as a metadata-related topic.
Let me share a couple of takeaways on this topic:
Knowledge graphs enable data integration via semantic layers.
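A minimal sketch of this idea, assuming a knowledge graph stored as plain subject-predicate-object triples: physical fields from different systems are mapped to one shared business concept, which then acts as the semantic layer. The table names, field names, and predicates below are hypothetical.

```python
# Triples: (subject, predicate, object). All names are hypothetical.
triples = [
    ("crm.customers.cust_nm", "represents", "Customer Name"),
    ("billing.accounts.holder_name", "represents", "Customer Name"),
    ("crm.customers.cust_id", "represents", "Customer Identifier"),
    ("Customer Name", "attribute_of", "Customer"),
    ("Customer Identifier", "attribute_of", "Customer"),
]


def fields_for_concept(graph, concept):
    """Return all physical fields mapped to a given business concept."""
    return sorted(s for s, p, o in graph if p == "represents" and o == concept)


# Fields from two different systems that carry the same business meaning.
same_meaning = fields_for_concept(triples, "Customer Name")
```

Because both source fields point to the same concept, they can be integrated without either system needing to know the other's naming conventions.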
Data visualization
Data visualization represents information in graphical format using charts, graphs, maps, and other visual tools.
Let's discuss a couple of takeaways.
He stated that graphics have several goals: “to figure out what is going on” and “to explain to decision-makers what they need to know about reality.” Data visualization helps “see things that other people cannot” and provides “unique insights and exclusive understanding of what’s happening.”
Knowing the audience and normalizing data to acknowledge context are the key success factors in expressing information.
According to them, data visualization is a dynamic, human-centered, analytical, and discovery process that requires multiple methods. Three factors ensure good data visualization: data, design, and function.
Advanced data visualization is focused on complex data and includes interactive dashboards, 3D visualization, augmented or virtual reality, etc.
The benefits of advanced data visualization are improved operational efficiency, enhanced risk management, increased collaboration, reduced costs, and improved decision-making.
Data Analytics
According to Amazon, “Data analytics converts raw data into actionable insights. It includes a range of tools, technologies, and processes used to find trends and solve problems using data. Data analytics can shape business processes, improve decision-making, and foster business growth.”
Gartner recognizes four maturity levels of data analytics: descriptive, diagnostic, predictive, and prescriptive. While many companies are still at the first and maybe second maturity level, the ultimate goal is to reach the upper levels. However, predictive and prescriptive analytics are linked to the AI capabilities we discussed before.
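The difference between the levels can be shown on a toy example. The sketch below covers three of the four levels (diagnostic analytics is omitted for brevity) and uses an ordinary least-squares line as a stand-in for a predictive model. The sales figures, the stock level, and the ordering rule are hypothetical assumptions.

```python
import statistics

# Hypothetical monthly sales (units) for months 1..6.
months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 120, 130, 140, 150]

# Descriptive: what happened?
average_sales = statistics.mean(sales)

# Predictive: what is likely to happen? Ordinary least-squares fit of a line.
mean_x = statistics.mean(months)
mean_y = statistics.mean(sales)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, sales)) / sum(
    (x - mean_x) ** 2 for x in months
)
intercept = mean_y - slope * mean_x
forecast_month_7 = intercept + slope * 7

# Prescriptive: what should we do? A simple rule on top of the forecast.
stock_on_hand = 145
units_to_order = max(0, round(forecast_month_7) - stock_on_hand)
```

Each level builds on the previous one: the recommendation is only as good as the forecast, and the forecast is only as good as the historical data behind it.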
Prashanth H Southekal, PhD, MBA shared his insights regarding predictive analytics in business. The key takeaways were:
I want to thank the leading experts in this area for their valuable input: C. Lwanga Yonke, Teri Hinds, CDMP, Gerard K., Missy Clymer & Nathan Schmitt, Christina Smith, Prakash Kewalramani, Donna Burbank, Louis Crook & Mike Overholt (Ashley), Elisa Kendall, Dan Collier, Jeremy Debattista, Michael Scofield, Kulev Rail & Dr. Deepak Singh & Dr. Arunkumar Ranganathan from Infosys, and Prashanth H Southekal, PhD, MBA.
Part 3 is the last part of this article. In conclusion, I strongly advocate for the invaluable experience of engaging in face-to-face interactions with leading data management experts at live conferences. These interactions are not only an excellent opportunity to acquire in-depth knowledge but also to stay abreast of the latest industry trends and innovations. Additionally, they offer a unique platform to gather fresh ideas that can be effectively implemented within your organization, driving growth and fostering innovation.