Signs You Have a Data Management Problem
Merrill Albert
Enterprise Data Leader, Data Governance Officer, Data Thought Leader, Chief Data Officer, Fractional Governance, Data Evangelist, LinkedIn Top Data Governance Voice, creator of #CrimesAgainstData
People don’t often recognize that they have a data management problem. They need data. There’s data. What could possibly be wrong? Well, just because you have data doesn’t mean it’s the right data. If you haven’t properly managed it, why would you expect that it has to be right? There might be signs telling you that there’s a problem. Follow the signs.
Looking at different disciplines of data management, I’ll share some stories indicating you might have a data management problem.
Data Governance
You either don’t know who’s leading data management, the structure is informal, or it’s at a low level.
When you have the wrong people involved in governing your data, you’re going to have unexpected results. Maybe you have a subset of company functions, but have forgotten some functions, so decisions are negatively impacting those forgotten functions. Maybe you have people running around trying to find the right person to talk to. Maybe you have junior people who can’t make decisions without asking for approval. Implementing data governance properly can make you efficient and effective.
There’s no dedicated data management leader.
With no one leading data management, it can’t be successful. If someone’s supposed to lead data management but is already fully committed to other work, there won’t be time available to give to data management. Time is needed to properly manage data and this can only happen with a dedicated leader. You can’t save an hour at the end of the day or the week for “data management stuff”.
You hired technology or BI resources, not data resources.
Mistakenly thinking technology or BI is the same as data, you might have the wrong people leading data management. Think of data as its own function sitting between business and technology. If you have business people and technology people, it stands to reason that you also need data people. Just because someone uses data doesn’t mean they know how to manage data – those are separate skills.
You implemented tools to solve all your data problems.
There are 3 things that act on data – people, process, and technology. In that order. If all you do is implement a tool, you’ve solved a technology problem and ignored the people and process aspects of the data problems. Think of technology more as an accelerator. It can speed things up and make them more efficient, but there are still people and processes that need to act on that tool to make it successful for you.
Education and communication are lacking in data management.
When people don’t understand data management and the business problems data can solve, the support and leadership will not be in place. There might be someone leading data management, but if company leadership doesn’t understand it, other employees won’t see a reason to engage and it will be first on the chopping block. Explaining the business problems that data management is eliminating will help it come alive and help build the support you need.
Data Architecture
Customer ID is not consistent between vendor applications, and is complicated by households and joint accounts.
We often have multiple vendor applications because we buy the best one for a problem, but then they don’t all communicate with each other. They’re built as if they’re the only application in the world. Then you also have customers who are related to each other and might live in the same home, so there are relationships to consider in the big picture of dealing with your customers. If you haven’t reconciled all these databases, it makes it difficult, if not impossible, to have a complete picture of your customer. Lacking that overall view can mean that your customer is not getting the products and attention necessary.
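One common way to reconcile customer IDs across applications is a crosswalk that maps each application's local ID to a single enterprise ID. Here is a minimal sketch of that idea. The application names, IDs, and the `customer_view` function are all hypothetical, and a real crosswalk would live in a governed table, not in code.

```python
# Hypothetical crosswalk: (application, local ID) -> enterprise customer ID.
CROSSWALK = {
    ("billing", "B-77"): "C001",
    ("crm", "4521"): "C001",
    ("support", "SUP-9"): "C002",
}

def customer_view(records):
    """Group records from different applications under one enterprise ID."""
    view = {}
    for app, local_id, payload in records:
        # IDs missing from the crosswalk are surfaced, not silently dropped.
        enterprise_id = CROSSWALK.get((app, local_id), "UNMATCHED")
        view.setdefault(enterprise_id, []).append((app, payload))
    return view

# Illustrative records as (application, local ID, data) tuples.
records = [
    ("billing", "B-77", {"balance": 120}),
    ("crm", "4521", {"email": "a@example.com"}),
    ("support", "SUP-9", {"ticket": 1}),
    ("crm", "9999", {"email": "x@example.com"}),
]
view = customer_view(records)
```

Anything that lands under "UNMATCHED" is a prompt for a person to investigate, which is where the people and process parts come back in.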
Separate applications spread out the customer data.
You might have conquered the different customer ID issue, but you still could have data about the customer spread out across multiple applications. This could be personal data that was entered in one application and then updated later through a Customer Care application, but that Customer Care application doesn’t update the other application that has everything else that didn’t need updating. Maybe a customer has a personal account and a business account with you, so understanding the whole picture can help you manage the accounts effectively.
Not knowing which customer address to use.
I’d say this is similar to the previous one where customer data is spread out across multiple databases, but it happens all the time with address. I mean, ALL the time. What is it about address? It seems that everyone asks for address, doesn’t care that there’s already an address on file, and doesn’t care that the address doesn’t match a prior address. In a properly designed data architecture, this is easily solved.
No data models or documented data architecture.
This is not a nice-to-have. This should be a requirement prior to anything going into production. Lacking documentation, the business is unable to validate that IT correctly implemented the business rules. Data models can look technical, but a logical data model represents how the business views the data and what the relationships are between the data. While you don’t just hand someone a data model and walk away, with the right explanation and walkthrough, the business can understand and be confident that the data is represented correctly so the business will function correctly.
Key values are concatenated and need to be manually broken apart to work with the individual data elements.
Concatenating data can be risky when people don’t understand the rules and the implications. Concatenating data is also often your guarantee that the data will change, meaning that the concatenated value needs to change. When you’re using that concatenated value to identify the data, you’re now attempting to change an identifying field, which will likely lose track of all history.
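To make the risk concrete, here is a minimal sketch of one alternative: identify the record with a meaningless surrogate key and derive the readable concatenated code only when someone needs to see it. The `Customer` class, its fields, and the code format are hypothetical, chosen just for illustration.

```python
from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: int   # surrogate key: assigned once, carries no meaning
    region: str
    segment: str
    sequence: int

    @property
    def display_code(self) -> str:
        # Derived on demand for people to read; never stored as the identifier.
        return f"{self.region}-{self.segment}-{self.sequence:05d}"

# History is keyed by the stable surrogate key, not by the concatenated code.
order_history = {1001: ["order-1", "order-2"]}

customer = Customer(customer_id=1001, region="NY", segment="RETAIL", sequence=123)
code_before = customer.display_code   # "NY-RETAIL-00123"

customer.region = "CA"                # the customer relocates
code_after = customer.display_code    # "CA-RETAIL-00123"

# The readable code changed, but the key and its history did not.
history = order_history[customer.customer_id]
```

Had the history been keyed by the concatenated code, the relocation would have orphaned both orders.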
Metadata
Not knowing who your customer is.
You think it’s obvious who your customer is until you try to define it. That’s when you find you’re questioning if someone buying something anytime is what matters, or buying within a certain timeframe, or never buying but attending a sales pitch, or buying but returning everything, etc. By not understanding those details, you don’t know who to treat as a customer and who to treat as a prospect. You don’t know how many customers to use in your counts. Your financials could get thrown off.
Not understanding all the different addresses you have.
A customer might have multiple addresses and you have to figure out which to use when. Some addresses you might encounter are mailing, shipping, residential, business, seasonal (which season?), temporary (what timeframe?), etc. Definitions need to be understood so that you’re using the data correctly. Definitions might even vary within a single department. There can also be a discrepancy with other departments, and if they need to be reconciled enterprise-wide, you will have difficulty.
Conflict between customer ID, person ID, and account ID.
You might choose to use all these terms or you might be forced to use them because of the vendor applications you use. You need to understand what the differences are and they need to be communicated to everyone who uses them. Reports and analytics will be delayed if you make the developer try to resolve the discrepancy to produce accurate content.
No centralized data dictionary, or definitions are hidden within vendor applications.
When people can’t easily look up what something means, they’re going to make an assumption that could be wrong. When that assumption is based on a field name of 8 characters, maybe even in another language, it’s even more likely something’s going to go wrong. Sometimes, definitions exist, but they’re buried in vendor applications that few people have access to. When that happens, they might as well not exist.
Code values change unexpectedly.
It can be helpful to use codes to define data. For instance, there are 3 types of products – type A, type B, and type C. If type A suddenly goes away or new type D is created, everyone using product data has to agree to that. When something as prominent as product, which is used by multiple departments, is impacted, the impacted departments have to be a part of deciding on the change, approving it, implementing it on a particular timeline, and using it properly. Finding surprises on reports or finding something no longer on a dropdown menu is not the approach to take.
Data Quality
Inconsistent data entry.
People can make spelling mistakes, such as an unusual city name. People can enter a bogus value because the application requires a value, such as entering all 9s for a social security number. That point of data entry is the perfect place to check data quality. You need to stop data quality issues at the point of entry so you don’t exacerbate the problem by working with poor quality data.
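As a rough illustration of checking at the point of entry, the sketch below rejects the "all 9s" style of placeholder before it is stored. The `validate_ssn` function and its two rules are assumptions made for this example. They are not a complete set of validity rules for social security numbers.

```python
import re

def validate_ssn(raw: str) -> list:
    """Return a list of problems; an empty list means the value passed."""
    problems = []
    digits = re.sub(r"[\s-]", "", raw)  # tolerate spaces and dashes in the input
    if not re.fullmatch(r"\d{9}", digits):
        problems.append("must contain exactly 9 digits")
    elif len(set(digits)) == 1:
        # Catches 999-99-9999, 000-00-0000, and similar filler values.
        problems.append("placeholder value (all digits identical)")
    return problems
```

Returning the reasons, not just a yes or no, lets the entry screen tell the person what to fix while they still have the right information in front of them.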
Not recognizing abbreviations.
People sometimes use abbreviations and sometimes don’t, which can create confusion when you’re trying to match data. Something as simple as “Street” and “St” should be recognized as the same. Treating them differently could result in multiple versions and in the address example, could introduce ineffective routing of deliveries. You need to make a decision whether you’re going to store abbreviations or not. For addresses, USPS makes it easy by offering address standardization.
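Here is a minimal sketch of what "recognized as the same" can look like in practice: normalize both values to one stored form before comparing them. The `SUFFIXES` mapping and the `normalize_address` function are hypothetical and deliberately tiny. A real implementation would rely on a maintained standard or an address standardization service, not a hand-built dictionary.

```python
import re

# Hypothetical, deliberately tiny mapping for illustration only.
SUFFIXES = {"street": "ST", "st": "ST", "avenue": "AVE", "ave": "AVE"}

def normalize_address(address: str) -> str:
    """Uppercase the address and map known suffix variants to one form."""
    words = re.sub(r"[.,]", "", address).split()  # drop periods and commas
    return " ".join(SUFFIXES.get(w.lower(), w.upper()) for w in words)

same = normalize_address("12 Main Street") == normalize_address("12 main St.")
```

The decision this forces is the one that matters: pick the stored form once, document it, and apply it everywhere.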
No regular data quality reporting.
While I advocate checking data quality at point of entry, the reality is that something’s going to get missed. By regularly checking data quality, you’ll be able to identify these problems before they go further. Assuming you have a level of data quality without checking could lead to questionable analytics and bad decisions.
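A regular data quality report does not have to start out sophisticated. The sketch below measures one dimension only, completeness, as the share of records with a non-blank value in each required field. The `completeness_report` function and the sample records are made up for illustration.

```python
def completeness_report(records, required_fields):
    """Return the share of records with a non-blank value for each field."""
    total = len(records)
    report = {}
    for field in required_fields:
        # Treat missing keys, None, and whitespace-only strings as blank.
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report

# Illustrative records only.
customers = [
    {"name": "Ana Ruiz", "city": "Boise", "email": "ana@example.com"},
    {"name": "Li Wei", "city": "", "email": None},
    {"name": "Sam Roy", "city": "Tulsa", "email": ""},
    {"name": "Jo Park", "city": "Omaha", "email": "jo@example.com"},
]
report = completeness_report(customers, ["name", "city", "email"])
```

Running the same check on a schedule and comparing the numbers over time is what turns it from a one-off query into reporting.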
Reactively waiting to hear about problems rather than proactively preventing them.
If you don’t bother to check for data quality, you risk your customers finding the data quality problems. When that happens, you risk your reputation and a loss of customers. Most customers can ignore some small issues from time to time, but make enough issues and they’ll notice. Depending on the type of business, such as financial services, it can be a bigger risk to your reputation when customers wonder if these little issues are an indication of how well you’re managing their money.
Not recognizing who is responsible for data quality, relying on tribal knowledge, or using a gut feel approach.
When you don’t know who is responsible for finding and solving data quality problems, it is inevitable that no one is. If you assume it must be right but don’t check, you run the risk of being surprised later. Working with data quality needs a disciplined approach using the right people to fix the problems once and for all at the root cause. Applying some spackle to cover up the problem just means that it will keep happening and you’ll keep applying spackle on a regular basis.
Data Privacy
Addressed security of databases, not privacy of data.
Many people hear the term “data privacy” and immediately think they heard “data security”. Data security is more about preventing outside sources from getting to your data, perhaps in the form of a data breach. Data privacy is more about protecting the data once you have it on the inside. Just because people are employees doesn’t mean they can access all the data. For instance, social security number is data that needs to be kept private.
Addressed physical records and documents only.
Days are long gone when it was all about paper. Keeping paper private is completely different from keeping data private. You need to understand the privacy restrictions on all your data elements. Some might be available for public use, but others are confidential or sensitive. If you have a database with 2 private data elements, you have to think through how you’re going to protect those 2.
Not keeping data privacy current with new regulations (e.g., GDPR, CCPA).
Regulations are constantly changing. In the US, states have privacy bills making their way through the legislative process. Not all states are doing it the same way, so you have to be aware of all the differences and what applies to you. Some regulations are written based on where your offices are and some are written based on where your customers live.
Not recognizing the impact on privacy of customers moving internationally.
Some regulations apply to where your customers live. If your customers are local and someone moves internationally but wants to continue doing business with you, you now have to check if any international privacy laws apply to you. Not staying current on regulations is not a legal excuse for violating them.
Assuming the company purchases industry data without checking the contract.
It’s not unusual for companies to use industry data, but the contract needs to be reviewed to determine what you can and can’t do with the data. While people often talk about buying data, the reality is that it’s usually leased. It’s not your data and you have to comply with any privacy rules dictated by the company you leased it from. You typically have use of it for a period of time and then you have to stop using it, perhaps even proving you deleted it from your databases. You might also have to ask permission before sharing the data with contractors working with you.
Data Retention and Destruction
Addressed data retention without thinking about data destruction.
Many people go straight to data retention and think about how long they’re required to keep records. That doesn’t necessarily mean, though, that keeping it longer is better. Your lawyers will often advise that once the retention period has ended, it’s time to get rid of that data. Anything you kept could be called into court.
Not understanding data retention rules.
You need to have rules around retaining data, such as how long to retain it and where. You need to know where your data is, such as operational databases, archive databases, Excel spreadsheets, and with contractors. You have to understand that not all data is retained in the same timeframe. You need to not just have the rules documented, but you also have to be following them.
Not understanding data destruction rules.
You need to have rules around destroying data, such as when to destroy and how to destroy. You need to know where your data is, such as operational databases, archive databases, Excel spreadsheets, and with contractors. You have to understand that not all data is destroyed in the same timeframe. You need to understand how to handle data you obtained from vendors and what to do if destroying the data also destroys referential integrity. You need to not just have the rules documented, but you also have to be following them.
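Documented rules are easier to follow when they can be checked mechanically. Here is a minimal sketch of that idea, where each data category has its own retention period and a record becomes due for destruction once that period has passed. The categories, the periods in `RETENTION_YEARS`, and the `destruction_due` function are hypothetical. Actual periods come from your legal and compliance teams.

```python
from datetime import date

# Hypothetical retention periods in years, per data category.
RETENTION_YEARS = {"invoice": 7, "marketing_contact": 2}

def destruction_due(category: str, closed_on: date, today: date) -> bool:
    """Return True once the retention period for this category has ended."""
    years = RETENTION_YEARS[category]
    try:
        expires = closed_on.replace(year=closed_on.year + years)
    except ValueError:
        # closed_on was Feb 29 and the target year is not a leap year.
        expires = closed_on.replace(year=closed_on.year + years, day=28)
    return today >= expires
```

A check like this only flags what is due. Finding every copy of the data and handling referential integrity still takes the rules and the people described above.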
Not searching all locations of data.
Data has a habit of moving around. You might initially load it into a single database, but before you know it, it’s in multiple databases, in multiple Excel spreadsheets, and has been shared to external parties. If you have to destroy the data, you have to identify all those places and follow through.
Assuming the same rules apply to all countries.
Data retention and destruction rules can vary by country. This would apply to a global company. You have to think about whether you sent data to another country or whether it got stored in another country.
Enterprise Data Management Expert | Data Governance | Metadata Management | Consultant | Doctoral Faculty Mentor | Curriculum Development | Ph.D.
7 months ago: I read each of these posts as they came out, and they were all "on target". Reading them here together conveys an even stronger message - effective #datamanagement needs the right people, with the real skills and competencies, and sufficient authority to implement the best practices. Not relying on technology to "solve" the issues; ensuring the organization has a formal data strategy developed by data professionals.
Marketing Operations Associate at Data Dynamics
8 months ago: I couldn't agree more. Data management isn't just about technology; it's about having the right expertise to navigate its intricacies. Investing in dedicated data resources is paramount for success. #DataSkills #DataProfessionals #DataStrategy