Years On - Still a Data Quality Focus

This is not a posting for Alation (my current employer) or any data quality vendor, but a personal reflection from a lifetime in data quality. In other words, these are thoughts based on a lifetime of data, and approximately 12,000 bottles of Advil.

Today marks the one-year anniversary of Alation rolling out the Open Data Quality Initiative (ODQI), and nearly 29 years since I was introduced to data quality tools. I wanted to mark that with a personal reflection.

First, I wanted to call out a few pioneers in the data quality space, most notably Larry English, Tom Redman, and Danette McGilvray. These three did a ton of evangelism two decades ago and really helped companies move forward. Larry did much to get people excited about data quality, but Danette is someone I continue to trust, and I have used her book to help companies really understand an approach to data quality that works. It is important to note that about 30 of my customers have bought Executing Data Quality Projects and have told me how much they appreciate the content. We have been able to enhance it lately with a lean approach that integrates data quality into catalog capabilities to get the best results with the least amount of effort. Without their industry influence, the knowledge of what data quality really is would not be where it is today.

We are currently in the third generation of data quality. The first came before Y2K, with the use of some tools on the mainframe to get data right. It was easier back then, with fewer systems and some good data quality capability. The second came in the early 2000s, when the first DQ programs and focus started; on occasion these programs became too big and lacked efficiency. DQ teams were able to make a huge difference, but only when they had good focus and leadership. The third is now. DQ has been reborn due to the importance of correct data for AI, digital transformation, and many other initiatives. I normally refrain from talking about data explosions, as data volumes have been growing for years. Still, I have found it vital to add a focus on data quality, because more data sets mean a greater need for efficiency in understanding when data is complete, conformed, accurate, consistent, free of duplicates, and timely.
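As a rough illustration of a few of the dimensions named above, here is a minimal Python sketch that counts incomplete, nonconforming, duplicate, and stale records. Everything in it is an assumption made for the example: the record layout, the field names, the allowed country list, and the one-year freshness threshold are hypothetical and do not come from any particular tool or from the ODQI.

```python
from datetime import date

# Hypothetical customer records; field names are illustrative only.
records = [
    {"id": 1, "email": "ann@example.com", "country": "US", "updated": date(2024, 5, 1)},
    {"id": 2, "email": None, "country": "US", "updated": date(2024, 5, 2)},
    {"id": 3, "email": "bob@example.com", "country": "usa", "updated": date(2023, 1, 15)},
    {"id": 3, "email": "bob@example.com", "country": "usa", "updated": date(2023, 1, 15)},
]

ALLOWED_COUNTRIES = {"US", "CA", "GB"}  # assumed reference list
AS_OF = date(2024, 5, 3)                # fixed "today" so results are repeatable
MAX_AGE_DAYS = 365                      # assumed freshness threshold


def profile(rows):
    """Return simple counts for four data quality dimensions."""
    seen, duplicates = set(), 0
    for row in rows:
        if row["id"] in seen:
            duplicates += 1  # same id already encountered
        seen.add(row["id"])
    return {
        # completeness: a required field is missing
        "incomplete": sum(1 for r in rows if r["email"] is None),
        # conformity: value is not in the agreed reference list
        "nonconforming": sum(1 for r in rows if r["country"] not in ALLOWED_COUNTRIES),
        # uniqueness: repeated ids
        "duplicates": duplicates,
        # timeliness: record not updated within the threshold
        "stale": sum(1 for r in rows if (AS_OF - r["updated"]).days > MAX_AGE_DAYS),
    }


report = profile(records)
print(report)
```

Accuracy and consistency are left out on purpose: checking them generally needs a trusted source or a second system to compare against, which a small sketch cannot stand in for.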

Second, I wanted to share in remembrance the influence of Tom Gebow, whom I worked with at Informatica. Tom died years ago of cancer, but he had a commanding influence on my approach to data quality. His notion that data quality is an art and not a science has stuck with me many years later. The idea that work goes into good data, and that you need to think, work hard, and focus, will never leave me. Tom was without question the smartest person I have ever worked with. He had a clear view of identifying, fixing, and following up to make sure data quality problems never return. We must not forget that.

Without these industry leaders I have learned from over the last 30+ years, data would not be where it is today.

Third, I wanted to share an appreciation for the data quality vendors I have spent time with over the last year who are participants in the ODQI, including Acceldata, Anomalo, Bigeye, Datactics, Experian, FirstEigen (DataBuck), Lightup, and Soda.IO. These are partners of Alation who signed on for this program and have invested in integrating with the Alation Catalog. They have a divergent set of ideas about what data quality and data observability tools are, and I am looking forward to what they do in the future.

Fourth, data observability vs. data quality. In my role I spend a lot of time with DQ vendors and educating our customers on data quality (I have spoken with over 100 different companies about DQ in the last year). The market is a bit messy at this point, as the new term Data Observability has confused things in the DQ space. Gartner continues to talk about DQ Solutions, but Data Observability is its own category. Gartner has defined this with five pillars: (1) Observing Data; (2) Observing Data Pipelines; (3) Observing Data Infrastructure; (4) Observing Data Users; (5) Observing Cost & Financial Impacts.

It is important to realize that the idea of DQ inside of Observing Data is not enough. That is, while it is great to do advanced profiling on huge amounts of data, more is needed. Gartner's view of observability is to: (1) see what went wrong; (2) share why it happened; (3) hypothesize possible impacts; and (4) make recommendations to fix. That simply isn't enough. We need to fix it as dynamically as possible, and fix it before the data gets used in any way, shape, or form.

To that end, I personally believe the focus of data quality inside of data observability tools needs to be not just to see, but to do. The data observability firms I speak with get very motivated to take this on and deliver this greater service. That is inspirational for me.
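One way to picture "not just see, but do" is a small gate that sits in front of downstream use: it repairs what a known rule can repair, quarantines what it cannot, and releases only clean rows. The Python below is a minimal sketch under stated assumptions, not how any vendor's product works. The `gate` function, the `COUNTRY_FIXES` mapping, and the record fields are all hypothetical names invented for this example.

```python
# Hypothetical mapping of known bad values to their corrected form.
COUNTRY_FIXES = {"usa": "US", "u.s.": "US", "uk": "GB"}
ALLOWED_COUNTRIES = {"US", "CA", "GB"}  # assumed reference list


def gate(rows):
    """Fix what can be fixed by rule, quarantine the rest, release only clean rows."""
    released, quarantined = [], []
    for row in rows:
        fixed = dict(row)  # copy so the source rows are left untouched
        country = fixed.get("country")
        if isinstance(country, str):
            key = country.strip().lower()
            # apply a known correction, otherwise just normalize the casing
            fixed["country"] = COUNTRY_FIXES.get(key, country.strip().upper())
        if fixed.get("country") in ALLOWED_COUNTRIES and fixed.get("email"):
            released.append(fixed)     # safe for downstream use
        else:
            quarantined.append(fixed)  # held back for a data steward to review
    return released, quarantined


incoming = [
    {"id": 1, "email": "ann@example.com", "country": "usa"},
    {"id": 2, "email": None, "country": "US"},
    {"id": 3, "email": "cy@example.com", "country": "Atlantis"},
]
released, quarantined = gate(incoming)
print([r["id"] for r in released], [r["id"] for r in quarantined])
```

The design choice worth noting is that nothing is silently dropped: rows that cannot be fixed by rule are set aside rather than deleted, so someone can still follow up on why they failed.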

Alation and the Open Data Quality Initiative (ODQI) are often discussed as best of breed. I would share that it is something much more. I had a former employee at Honeywell who liked to talk about 'Horses for Courses', and that phrase rings true for the data quality/data observability space right now. That is, use the right horse to run the right race. No one does it all. Firms need to find the right vendor that meets their needs, understands what they need, and executes to success. Those who may sell well don't necessarily have the wheels to get it done. Data quality is a very powerful set of capabilities: having great data you can trust, sharing which data you have issues with, and having dynamic capabilities to drive out trusted data is what makes data shine.

I am really excited about data, what data quality can do, and what is next with AI. But the bottom line is that AI won't fix all your data problems, so be smart and get it done. This is why I work at a place that has become, in my eyes, “The Data Governance Company”.

Great article Jim Barker - happy we (Anomalo) are partnered with such a great team at Alation. #odqi

Happy anniversary to #ODQI! Great article Jim Barker

Gene Arnold

Product Architect - I'm privileged to be able to listen to our customers and help build the tools they need!

1 year ago

Great post Jim and thank you for your commitment to DQ! (and the link to that book!)

Gabriele Christensen

Senior Data Governance Analyst at NRG

1 year ago

Very insightful - and with great context. Thank you for sharing your thoughts!

Gordon Hamilton

Data Quality Evangelist for 20 years, steadily improving my ability to communicate the importance of DQ for Cost Reduction & Data Monetization.

1 year ago

Inspirational Jim Barker! Tom Redman, Larry P. English, Danette McGilvray, and Tom Gebow lit the Data Quality path for all of us to follow. "A data quality hero is something to be" to paraphrase John Lennon. We are looking forward to your presentation to DAMA-Vancouver BC Chapter this Friday on "Data Intelligence Delivers Enterprise Data Value". I expect you will get a few questions on DQ.
