Data Nugget March 2023

31 March, 2023

DAMA Norway is back with a fresh episode of our monthly dose of data nuggets. So before you go on vacation, don't forget to check out our March newsletter to stay updated with interesting news from the #DataManagement world. This month, we have brought a mix of different topics surrounding data with a focus on #DataMesh.

First and foremost, we have a #BookReview that offers help in the discovery of data. Second, we have a read about the DMBOK framework. Third, we have some reflections around Data Mesh one year after its release. Next, we present to you the six principles for value creation through data from the series of data podcasts. And lastly, we have a nugget about understanding the Data Mesh in layman's terms.

So grab a chair and enjoy reading!

Let's grow Data Nugget together. Forward it to a friend. They can sign up here to get a fresh version of Data Nugget on the last day of every month.

The Enterprise Data Catalog: "The" book on discovering data

Nugget by Winfried Adalbert Etzel.

“The Enterprise Data Catalog” by Ole Olesen-Bagneux is a book not just about Data Catalogs. It addresses fundamental, structural issues in data that are directly or indirectly connected to the way we find data.


The author delivers a profound combination of Information and Library Science with Data Management and IT. This combination legitimizes:

  • A holistic view on data and information across fields of expertise. Searching for and finding data, information and, ultimately, knowledge is essential regardless of structure, format, storage, medium, etc.
  • Including knowledge from other fields in data. Data Management (and IT) are young professions that are still emerging, so a relation to a profession more than 1,000 years old, like Library Science, should be embraced. What better starting point to learn to organize, search, and ultimately find data?

How can you use the book? It has a broad application: from structuring, organizing and democratizing to envisioning the future of search. You can use the book:

  1. As a first step when planning a #DataCatalog implementation, for example: What are the fundamental considerations when planning for a Data Catalog? What can you use Data Cataloging for? What are the prerequisites before thinking of a functional Data Catalog for your company? What should you consider when gathering and designing your requirements?
  2. To understand search: Why do we search and what are the mechanisms of search? How do you get from searching to finding, and what does data discovery entail? How do we search for information internally in a company? What is the difference between 'search for' and 'search in' data?
  3. To argue for the value of a Data Catalog with your stakeholders. Who are your central stakeholders? How can you consider your Data Catalog as a social network? How does the Data Catalog fit into your system landscape?
  4. To discuss the data lifecycle. How can a Data Catalog support #DataLifecycle management? What is the value of the time factor? How are lifecycles connected?
  5. As a looking glass into the future of search. Will the Data Catalog in the long term become the internal company search engine? How can Data Catalogs integrate into the workday as a basic tool for all employees?

Data Catalog

Let’s go into a bit more detail on some of the use cases this book provides (and these are only a few, picked rather subjectively), starting with the Data Catalog itself.

“At its core, a Data Catalog is an organized inventory of the data in your company. That’s it!”

We are met with this short and precise definition quite early in the book. It is typical of the author’s style: clear in its message while still showing an academic and structured approach.

With this definition in mind, the author names three key features of a Data Catalog:

  1. It creates an overview of the data in your IT landscape,
  2. It organizes your data, and
  3. It allows you to search your data

And this is what the author uses as the basis to discuss all the topics mentioned above.


One of those topics is “Search”. This might be one of my biggest ‘eureka’ moments in the book, because it became so clear that there is a difference between ‘search for data’ and ‘search in data’. The author walks through the entire value stream of search, from question, to search, to find, to discover, while at the same time arguing for a clear and conscious differentiation between ‘searching for data’ and ‘searching in data’. You need to apply a Librarian mindset to find relevant data sources to search in.
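
To make the distinction concrete, here is a minimal Python sketch. It is not taken from the book: the `CatalogEntry` fields, the `DataCatalog` class and the sample data are all hypothetical and only illustrate the idea. `search_for` matches a term against metadata to locate a source ('search for data'), while `search_in` filters the records of a source that has already been found ('search in data').

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Metadata describing one data source (hypothetical fields)."""
    name: str
    owner: str
    description: str
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """An organized inventory of data sources, searchable via metadata."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def overview(self) -> list[str]:
        """List every registered source by name."""
        return sorted(self._entries)

    def search_for(self, term: str) -> list[str]:
        """'Search for data': match the term against metadata only."""
        term = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if term in e.description.lower()
            or term in {t.lower() for t in e.tags}
        )


def search_in(rows: list[dict], **conditions) -> list[dict]:
    """'Search in data': filter the records of one already-found source."""
    return [r for r in rows if all(r.get(k) == v for k, v in conditions.items())]


# Illustrative data only.
catalog = DataCatalog()
catalog.register(
    CatalogEntry("orders", "sales", "Customer orders per day", {"customer", "sales"})
)
catalog.register(
    CatalogEntry("suppliers", "procurement", "Supplier master data", {"vendor"})
)

found = catalog.search_for("customer")  # step 1: find relevant sources

orders_rows = [
    {"order_id": 1, "customer": "A"},
    {"order_id": 2, "customer": "B"},
]
matches = search_in(orders_rows, customer="B")  # step 2: query inside a source
```

In this sketch the catalog never touches the rows themselves; it only holds metadata, which is also why the same three features (overview, organization, search) can be offered without exposing the underlying data.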


Another fundamental element that recurs throughout the book is the importance of #Metadata. The DAMA DMBOK (Data Management Body of Knowledge) does not mention Data Catalogs at all but uses the term ‘Metadata Repository’. Why is this important? Because a considerable part of the value proposition for Data Catalogs is linked to Metadata. Data Catalogs reveal a company’s data on a metadata level: a level that makes data searchable, but also ensures accessibility and security.

Data Mesh

At a time when we need our data organizations to scale and be operational within many different domains, this book provides a key element for federated and decentralized architecture and organizational design approaches: a unified way of searching across domains.

Data Mesh has gained traction in recent years, and a domain-driven approach to data, with responsibility federated to those domains, might be the most discussed topic in the data world. The author has accounted for Data Mesh as well as ‘Data Management at Scale’. He discusses the different approaches to Data Catalogs, including #KnowledgeGraph driven approaches.

Data Lifecycle

Chapter 7 of the book made a great impression on me: ‘The importance of Data Lifecycle Management.’ First, the author acknowledges the importance of data lifecycles and their management. This topic has been under-communicated in our business for far too long, and I would love to see others follow the author’s example in putting this on our agenda. Second, the author shows the complexity of the time factor in data management, the interaction between different cycles, as well as the shortcomings in the early stages of data lifecycles. Thank you, Ole, for putting this back on our agenda.

Enterprise Search Engine

Chapter 8 gives the book perspective. The author creates a vision of what Data Catalogs can become and how we might search in the future. It does feel like the entire book is building up to this vision: Data Catalogs as company search engines. The chapter can be read on its own, but it really unfolds its value if you have read the entire book.


I was glad to see that the author discussed the ethical implications of Data Catalogs in the afterword. There are important considerations to make once you provide capabilities to uncover all knowledge in a company. I would like to see more on this topic, and perhaps some of the ethical reflections at the beginning of the book as well.

My Recommendation

With the complexity that this book provides and the many ways to use it, I consider “The Enterprise Data Catalog” a fundamental work towards a better understanding of the importance of search, accessibility, and structure of data. The title does not do the book justice; it is so much more than a book on Data Catalogs and, in my opinion, it belongs on the basic reading list of any data management professional. I would rate the book as ‘Highly recommended.’

DMBOK: Best before Data Mesh?

Nugget by Kjetil Eritzland. Citations from Zhamak Dehghani’s book Data Mesh and DMBOK.

The word on the street is that the #DAMA framework is irrelevant in this age of data engineers working in an agile fashion in autonomous domains. They say it is too rigid and represents shackles you need to free yourself of.

I have never understood exactly what the alternative is. Nor why they have this perception that the tenets of DMBOK no longer apply in the new, meshy world.

There may be some misconceptions about what #DMBOK is. It is not a cookbook. As written at the end of the Introduction:

“Most enterprises do not perform all of the activities described […] However, understanding the wider context of data management will enable organizations to make better decisions.”

I don’t think better understanding has ever hurt anyone.

In Chapter 1 of DMBOK, DAMA seeks to explain why we need a framework: “There is a lot to keep track of, which is why it helps to have a framework to understand the data management comprehensively and see relationships between its component pieces”. And I dare to say that the disciplines that constitute data management are still relevant in a data mesh context. We may not do things the same way that we did in the old data warehouses, but a data mesh is also not a free-for-all.

#DataGovernance is an area that many see as causing too much friction in the development process. It is easy to lose track of what data governance is with all the tools out there that call themselves data governance tools. If we turn to DMBOK, it has a very succinct definition of data governance in Chapter 1.3:

“Just as an auditor controls financial processes but does not actually execute financial management, data governance ensures data is properly managed without directly executing data management.”

And Dehghani’s #FederatedComputationalGovernance could very well be the way we ensure that the right things are done.

If we turn to Zhamak Dehghani’s book Data Mesh and look at her requirements for #DataAsAProduct, we see that there are quite a few familiar elements there. Take the usability requirement. Dehghani writes:

“A data user needs to understand this meaning: what kind of entities the data product encapsulates, what the relationships among the entities are, and their adjacent #DataProducts.”

I can think of no better way than a good old conceptual data modelling session to flesh out that information.

Another requirement is for a data product to be interoperable, and one of the things we need to do to accomplish this is, again according to Dehghani,

“[…] in order to get interoperability and linkage between data across domains, there are data entities in each domain that need to be modelled in a consistent fashion across all domains. Such entities are called polysemes. Standardizing how polysemes are modelled, identified, and mapped across domains is a global governance function.”

For us old-timers, “Standardizing how polysemes are modelled, identified, and mapped across domains” is better known as master and reference data management.
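
As a rough illustration of what mapping a polyseme across domains can look like, here is a minimal Python sketch. It is not from Dehghani's book or DMBOK: the domain names, local identifiers and the `POLYSEME_MAP` table are hypothetical. The idea is simply that each domain keeps its own local identifier for a customer, while a globally governed table resolves them to one shared identifier.

```python
# Hypothetical cross-domain identifier map: (domain, local id) -> global id.
# In practice, maintaining a table like this is the kind of global governance
# function that master and reference data management has long covered.
POLYSEME_MAP = {
    ("sales", "CUST-0042"): "customer:42",
    ("support", "u_42"): "customer:42",
    ("sales", "CUST-0099"): "customer:99",
}


def to_global_id(domain: str, local_id: str) -> str:
    """Resolve a domain-local identifier to the shared global identifier."""
    try:
        return POLYSEME_MAP[(domain, local_id)]
    except KeyError:
        raise LookupError(f"no global mapping for {domain}/{local_id}") from None


def same_entity(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """True if two (domain, local id) pairs refer to the same real-world entity."""
    return to_global_id(*a) == to_global_id(*b)


# The sales and support domains name the same customer differently.
linked = same_entity(("sales", "CUST-0042"), ("support", "u_42"))
```

Unmapped identifiers raise an error rather than silently passing through, which reflects the point that the mapping is a standard to be governed, not a best-effort lookup.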

Some things are definitely new in data mesh and can be considered great improvements upon how we used to do things before. But there is no need to throw all the knowledge we have gathered throughout the years overboard. When embarking on a data mesh journey, keeping DMBOK in your back pocket may be what saves you from ending up with a data mess.

One year after Data Mesh – some reflections

Nugget by Nora Skjelstad. Based on Zhamak Dehghani’s book Data Mesh.

March 2023 marks one year since the release of Zhamak Dehghani’s book on Data Mesh. Even though the concept itself was first defined and published in an article in 2019, the number of hits on “data mesh” on Google increased drastically after the release of the book. A number of businesses jumped on the trend, adding this new and modern approach to data architecture and analytics to their strategies and starting to ask “What is a data product?” or “How do we fit consolidated domains into our business structure?”. But what is really the status quo on Data Mesh in businesses, one year after it hit the fan?

The rise and use of machine learning and/or artificial intelligence made a lot of businesses aware of the need for good and usable data, and therefore also data management. It is safe to say that Data Mesh filled a void regarding management of #AnalyticalData, and one could argue that traditional data management doesn’t quite apply to data that is to be used in advanced analytics and machine learning. However, issues arise when Data Mesh is the only focus area in a business’s data management strategy. As mentioned, Data Mesh primarily addresses analytical data, which leaves the operational, traditional (and perhaps more stressful) data management hanging.

Implementing the concept of Data Mesh in an organization requires a certain data management maturity and a base to build #DomainDrivenDesign on. As a data professional, I would argue that a lot of businesses are trying to run before they can even walk. As with any data management initiative, Data Mesh is an evolution and not a revolution. Both implementing Data Mesh and creating a steady data management organization take a lot of time, effort, resources and change management consultants. Therefore, people are a bit sceptical whenever a business claims to have already implemented a fully functioning Data Mesh only four years after the concept was first introduced.

So, let’s say you are working in a business that has followed the DAMA framework, can confidently claim to have a stable and well-functioning data management organization, and is about to start its journey with Data Mesh. One pit that is easy to fall into is to interpret the book and its theory literally. Such businesses stress creating data products that are true to the description in the book and have the attributes that the book says ‘data as a product’ should have. They stress changing their matrix organization to create consolidated domains and making sure that no ‘old fashioned’ #DataWarehouse technology is used by those domains, only a single self-serve platform. It is important to keep in mind that no two organizations are alike, and there is no one-size-fits-all. So don’t worry if you make some adaptations to the concept of Data Mesh. That’s what you should do.

To sum up, Data Mesh is just what the data management industry needs for analytical data. But some organizations need to focus on traditional data management first before they start implementing Data Mesh. Also, have patience; both data management and Data Mesh are initiatives that take a lot of time. Make sure that you fit the theory to your organization; don’t interpret everything literally.

#MetaDAMA 2#8: The 6 principles for value creation through data

Nugget by Winfried Adalbert Etzel.

It is not just about the use of data, but the use of data in a cross-functional setting.

What a fantastic conversation with Nina Walberg. Nina has been with Oda since 2019 and has a background in Optimization and SCM from NTNU.

Oda strives to create a society where people have more space for life, making life as hassle-free as possible. And to achieve this with help from data, Oda has created its 6 principles for how they create value with data.

Here are my key takeaways:

Business model and use of data

  • Oda's business model allows for a better and more cautious way of thinking about sustainability. The quantity of products can be tailored to the actual need.
  • Climate recipe: Oda provides data on the climate footprint to its customers when they order products.
  • Use data to provide not just inspiration to your customers, but also help them create a complete basket of groceries for the week to avoid any additional trips to the supermarket.
  • The use of data needs to be combined with all functions, including business functions. All should be part of the development process.

Six Principles

  • The six principles were recently updated, but were originally created by the entire team in 2019.
  • Oda believes in autonomous teams and that trust and responsibility are given to these teams.

1. Domain knowledge and discipline expertise

  • 70% of Data and Insight people are embedded in cross-functional teams.
  • Data is connected across the company and across domains. So you cannot work exclusively in an embedded model. You need some central functionality.
  • Data maturity differs between domains. So how Oda applies the embedded model really depends on the circumstances.
  • Data Mesh has been an inspiration.

2. Distributed data ownership, shared data governance

  • Processes and parts of the product for Oda are developed by the domain teams. To federate the ownership is a natural move.
  • Domain boundaries need to be explicit. "Every model that we have is tagged with a team."
  • Ownership of certain products is harder, or sometimes not right, to distribute. These can be either core products that many depend on but that have no natural domain team to own them, or the core data platform itself.
  • Customer data first, but without proper product data you can’t live up to that.
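
The quote "Every model that we have is tagged with a team" suggests a simple check that can be automated. The Python sketch below is only an illustration under assumptions: the model names, the `team` key and the `MODELS` registry are hypothetical and do not describe Oda's actual tooling.

```python
# Hypothetical registry of data models and their metadata.
MODELS = {
    "orders_daily": {"team": "logistics"},
    "customer_basket": {"team": "growth"},
    "legacy_export": {},  # no owner tag: should be flagged
}


def untagged_models(models: dict[str, dict]) -> list[str]:
    """Return the names of models that have no owning team tag."""
    return sorted(name for name, meta in models.items() if not meta.get("team"))


def models_by_team(models: dict[str, dict]) -> dict[str, list[str]]:
    """Group tagged models by owning team to make domain boundaries explicit."""
    by_team: dict[str, list[str]] = {}
    for name, meta in models.items():
        team = meta.get("team")
        if team:
            by_team.setdefault(team, []).append(name)
    return by_team


missing = untagged_models(MODELS)
ownership = models_by_team(MODELS)
```

A check like this could run in a build pipeline so that a model without an owner is caught before it is published; whether to fail the build or only warn is a governance decision.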

3. Data as a product

  • Make sure you do not just deliver a product, but that it meets the customer needs, not just the customer demand.
  • Consistency matters, structure matters, and naming matters when it comes to data products.

4. Enablement over handovers

  • Enabling others to do what they should be able to do themselves.
  • Oda has established a segmentation model with five levels of self-service, with different expectations for different user groups.
  • Self-service needs to be tailored to different roles, maturity, and needs of the internal users.
  • Data University, Data Hours and many other initiatives help to create a learning culture and improve #DataLiteracy.

5. Impact through exploration and experimentation

  • It is important to test and see whether a solution actually delivers the expected value.
  • This provides insights and information you can act on.

6. Proactive attitude towards privacy and data ethics

  • Data ethics needs to be incorporated and can’t be an afterthought.
  • Company values can and should be directly linked to the work with ethics.

This episode was recorded in September 2022. Click here if you want to know more.

You can listen to the podcast here or on any of the common streaming services (Apple Podcast, Google Podcast, Spotify, etc.).

Understanding the Data Mesh

Nugget by Isa Oxenaar. Credits to Barr Moses, CEO and Co-founder, Monte Carlo.

The article “Decoding the Data Mesh” gives a solid understanding of the core issues that data mesh tackles. Moses spoke with the founder of Data Mesh, Zhamak Dehghani, who dispelled a couple of misunderstandings about Data Mesh.

Is data mesh a technical solution?

No. Data Mesh is defined as a “#SocioTechnical shift—a new approach in how we collect, manage, and share analytical data.”

A #DataArchitecture is a Data Mesh if it includes the following basic elements:

  1. Distributing ownership of data from a centralized team to a team at a business domain.
  2. The teams in these domains will have accountability over the data and treat data as a product.
  3. The teams will be empowered with a #SelfServeDataInfrastructure.
  4. New problems that might arise will be addressed with a model of federated data governance.

Is data mesh another word for data virtualization?

No, when you apply the #DataVirtualization approach to a data mesh, “you’re trying to expose a database that has been optimized for a transactional purpose for analytical purposes”.

Does each data product team manage their own separate data stores?

No. When applying Data Mesh, a data product developer wants to have autonomy over the data, but that does not require a separate storage layer. Most data meshes will have a single cloud structure and one storage layer with different autonomous schemas.

Is a self-serve data platform the same thing as a decentralized data mesh?

No, since most current self-serve data platforms are built to serve a centralized data team.

Is the data mesh right for all data teams?

Organizations that face the problem of scaling data reliability are the organizations where adopting the data mesh makes the most sense.

Does one person on your team “own” the data mesh?

Data mesh takes cultural buy-in across the organization. According to Zhamak, data mesh probably works best with top-down support when domains also take ownership of their data.

Does the data mesh cause friction between data engineers and data analysts?

Because of its decentralized nature, data mesh often leads to reconciliation in areas where there has usually been friction.

Read the full article here.

Thank you for reading this edition of Data Nugget. We hope you liked it.

Data Nugget was delivered with vision, zeal and courage by the editors and the collaborators.

You can visit our website here, or write us at [email protected].

I would love to hear your feedback and ideas.

Copyright © 2021 Data Management Association Chapter Norway, All rights reserved.

Our mailing address is:

Den Norske Dataforening, Rebel, Universitetsgata 2, 0164 Oslo



