Data Governance? Put a COAT on your Dirty Data First
Jennifer Moore
Creating connections in a disconnected world by meeting people where they are. Mental Health Advocate | Speaker | Curious | Superconnector
This article was originally published on the Minerva blog.
Recently, I met with the Classification Guru Susan Walsh to discuss the importance of using clean data on the path to Data Governance.
Susan got into the world of data by accident. Having spent over 10 years in various sales positions and entrepreneurship, Susan changed direction towards data quality assurance and discovered a new passion in her professional life.
In this episode, Susan shares how an organization can detect if there is an issue with the purity of their data and provides examples of “dirty data”, their symptoms, and root causes.
She introduces the COAT concept and explains how it can limit the challenge of dirty data and act as the first step on your path to data governance.
In this blog, I’ve captured a few pointers from our conversation.
You can also watch the complete Dirty Data episode on YouTube.
Every organization struggles with dirty data
Regardless of what industry you work in, the problem is very similar: dirty data. Within supply chain and manufacturing, missing product codes are a huge issue. You could also have typos, zeros replaced with the letter O, wrong descriptions assigned to product records, missing information, or even duplicate information.
Regardless of discipline and industry, Susan has seen firsthand that these are some of the problems manufacturing companies and their suppliers all over the world are struggling with.
One way the problem starts is when someone does not find what they were looking for in a system and creates a whole new customer or supplier record. That leads to duplicate information, where problems are bound to happen.
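The symptoms described above (missing product codes, a letter O typed in place of a zero, duplicate supplier records) can be checked for with a few lines of code. The following is only a minimal sketch: the product code format, the field name `code`, and all function names are hypothetical and chosen for illustration, not taken from Susan's own tooling.

```python
import re
from collections import defaultdict

# Hypothetical code format for illustration: two letters, a dash, four digits (e.g. "AB-1042").
CODE_PATTERN = re.compile(r"^[A-Z]{2}-\d{4}$")


def find_code_problems(records):
    """Return (index, reason) pairs for records whose product code looks dirty."""
    problems = []
    for i, record in enumerate(records):
        code = (record.get("code") or "").strip()
        if not code:
            problems.append((i, "missing code"))
        elif not CODE_PATTERN.match(code):
            # A letter O in the numeric part is a common typo for zero.
            _, _, digits = code.partition("-")
            if "O" in digits.upper():
                problems.append((i, "letter O in numeric part"))
            else:
                problems.append((i, "unexpected format"))
    return problems


def normalize_name(name):
    """Lowercase, drop punctuation, and collapse whitespace so near-identical names match."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())


def find_duplicate_suppliers(names):
    """Group supplier names that normalize to the same key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]
```

Running a check like this before someone creates a new record is one way to catch the duplicate at the point of entry rather than after it has spread into reports.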
How do you know if you have a data problem?
Usually, if you run reports and the numbers do not add up, or information is missing or cannot be found, you have a data problem.
Ok, problem identified. Then what do I do? You want to clean your data.
This is why Susan sometimes refers to the data cleanser as the trash lady of the data world. Just like you need to get your trash collected each week, you need to take the trash out of your data and clean it up. Depending on the size of your company and complexity of your data, you may need to clean your data more often. Ultimately, if you do not clean and maintain your data, your data problem will multiply.
Put a COAT on your data
Susan believes that most data problems could be solved at the point of entry. She created the acronym COAT to make solving data problems more relatable and accessible to as many people who work with data as possible. COAT stands for:
- (C)onsistent: Be consistent with your classification standards, like liter or ltr., oz or ounce, names, phone numbers, etc.
- (O)rganized: Think of that great top you know is in your closet but cannot find because the closet is disorganized. If you had organized your closet by tops, skirts, trousers, dresses, etc., you could just open it and find that top. It is the same with data: you are sure the data you are looking for exists, but to find it, it needs to be classified with relevant, repeatable categories like country, business unit, department, warehouse, supplier, material, unit, etc.
- (A)ccurate: If your data is not accurate, it is not useful. Now, accurate can mean different things to different people. 100% accuracy is the goal for some, while others are aiming for just enough accuracy to make things work.
- (T)rustworthy: This means you can rely on your data to help you make decisions. For instance, your inventory unit numbers show you have this much stock, you are producing X amount, and you need to have Y on hand after production to fulfill orders by a certain date. With trustworthy data you can accurately calculate how much additional material or parts you need, by when, to produce the end number of products on time.
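Two of the COAT points lend themselves to a small illustration: consistent unit spellings and the material calculation that trustworthy data makes possible. The sketch below rests on assumptions: the alias table, the function names, and the shortfall formula (open orders plus the quantity required on hand, minus current stock and planned production) are hypothetical simplifications, not a method described by Susan.

```python
# Hypothetical mapping from free-text unit spellings to one standard spelling.
UNIT_ALIASES = {
    "l": "liter", "ltr": "liter", "ltr.": "liter", "liter": "liter", "litre": "liter",
    "oz": "ounce", "oz.": "ounce", "ounce": "ounce", "ounces": "ounce",
}


def standardize_unit(raw_unit):
    """Map a unit spelling to its standard form, or raise so the entry can be fixed at the source."""
    key = raw_unit.strip().lower()
    if key not in UNIT_ALIASES:
        raise ValueError(f"Unknown unit: {raw_unit!r}")
    return UNIT_ALIASES[key]


def additional_units_needed(in_stock, planned_production, open_orders, required_on_hand):
    """Units still to be sourced so orders are filled and the required buffer remains.

    Assumed formula for illustration; real planning would also account for dates and lead times.
    """
    shortfall = open_orders + required_on_hand - (in_stock + planned_production)
    return max(shortfall, 0)
```

For example, `additional_units_needed(120, 300, 400, 50)` returns 30. The result is only as good as the inputs: if the stock figure is wrong or recorded in mixed units, the calculation will be wrong too.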
Additional Points of Consideration
Product Lifecycle Management (PLM) includes enterprise-wide Master Data. Master Data Management (MDM) requires Data Governance. Data governance (DG) refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise and should extend through PLM.
Nowadays, Data Governance is as widespread a topic as Digital Transformation. But Data Governance and a governance culture don’t happen on their own or overnight. A key step before moving to a master-data-based culture is to clean up existing data.
Keep the Conversation Going
When Susan started out advocating for data cleansing, nobody else was focusing on it. It was seen as an add-on, a service on top of something else. It was never seen as a service in its own right.
That is why she makes a lot of content related to dirty data, data cleansing, and the concept of COAT. It helps to attract business, of course, but she genuinely believes that broader awareness of cleaning dirty data brings cost-saving opportunities to manufacturing companies and makes their lives easier.
What are your takeaways? What questions do you have about data cleansing and moving to Data Governance? Share them with the community and keep the conversation going.
Want to share your story? Drop me a line or send me an email at [email protected]. I’d love to hear from you.