Bridge to nowhere: Build or buy a Data Quality solution
Several months back, I wrote a blog about considering the implications of a cloud provider data quality (DQ) solution. This blog received great feedback, but it presumed that organizations were seeking to buy data quality solutions and not just build their own. Some readers asked, “what about just building my own solution? I have developers and a budget.” Some had even seen vendor DQ solutions and said, “well, I can build that myself.”
Reflecting on that feedback, I would like to share some of the implications of building your own DQ solution vs. a Commercial-off-the-Shelf (COTS) solution. I hope to explore both options in detail.
To refresh our last discussion, we focused on a couple of key themes. Those themes considered tech resource quality and organizational complexity. Underlying those themes, the blog quantified the fastest “DQ time-to-market” as your strategy when making “buy” decisions. Our research proved that you should consider a DQ offering from a data intelligence company instead of a point DQ offering.
I believe the same concept could be applied to “build vs. buy” as well. As a leader who cares about Data Quality and Observability, you want to prioritize DQ time-to-market when considering your technology, along with how it will scale to keep the risk associated with bad data as low as possible.
1. Technical resource quality – Can you afford to build and maintain?
Most companies today are tech companies. This is seen in words like Fintech, Medtech, and even ‘true tech.’ Further, companies are now starting to call themselves data companies. These trends correspond to the proven return on investing in technology and monetizing data. But while companies seek to innovate within the scope of their industry by leveraging their data, not every investment will yield the same returns.
Product organizations would find great value in digitally transforming a key aspect of their front office, tying the data to a centralized store, and leveraging analytics to beat the competition. But while they build on those features, they would not likely “reinvent the wheel” with applications like Outlook, JIRA, or Salesforce. Industries like Financial Services, Healthcare, or even Energy would either leverage those applications or look to their own industry competitors.
The main driver is the sunk cost. An organization can build an email system, but it would have to weigh all the additional resources necessary to re-create something that Outlook or Gmail already accomplishes. As history shows, some companies tried this decades ago, but the industry-driven application prevailed.
Consider that analogy alongside the build-DQ financial implications chart below. The chart shows some real additional costs for building a home-grown DQ solution. Each resource represents something different from core organizational data functions like pipeline, ETL, or analytics. In my personal experience as a DQ product owner, I have seen data product builds take years – not months. Could you justify the cost shown? If not, would you end up hiring less-than-adequate resources and failing to deliver even ‘working software’?
If DQ proves that difficult to build, imagine the time it takes to maintain. Almost immediately, any organization that transitions from build to run has to make tough resource decisions. In many cases, the team that builds such a product finds itself questioning its role in run mode. What allocation goes to new product features vs. maintenance? Do our best resources leave?
While this product team struggles to become a data operations team, two factors remain constant. The organization will change (see the next section), and the competition will increase. Yes, you may build the perfect product today, but can you keep up with the COTS of tomorrow? Especially when those offerings are solving unforeseen challenges your organization has not yet faced? Imagine if your CIO suddenly announces that a new cloud migration requires new features and you’re already behind schedule.
Would every new refactor, feature, or upgrade cost as much as the previous projections? Maybe! At the very least, consider the below graphic in comparison to a SaaS/Cloud DQ offering. Upgrade installations alone can cost you ~$70K annually. Those resources and money far exceed the typical vendor “hosting” cost of a COTS DQ product.
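To make the comparison concrete, the arithmetic can be sketched as a simple cumulative-cost model. This is only an illustrative sketch: the `CostModel` class and `breakeven_year` helper are hypothetical names, and every dollar figure below is a placeholder assumption except the ~$70K annual upgrade cost mentioned above. Swap in your own numbers.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Hypothetical cost model: one-time cost plus recurring yearly costs."""
    upfront: float                # one-time build or onboarding cost
    annual_run: float             # recurring yearly cost (staff, hosting, licenses)
    annual_upgrade: float = 0.0   # recurring yearly upgrade/installation cost

    def cumulative(self, years: int) -> list[float]:
        """Cumulative cost at the end of each year, from year 1 to `years`."""
        yearly = self.annual_run + self.annual_upgrade
        return [self.upfront + yearly * y for y in range(1, years + 1)]


def breakeven_year(build: CostModel, buy: CostModel, horizon: int = 10):
    """First year in which cumulative build cost is at or below buy cost, else None."""
    pairs = zip(build.cumulative(horizon), buy.cumulative(horizon))
    for year, (build_cost, buy_cost) in enumerate(pairs, start=1):
        if build_cost <= buy_cost:
            return year
    return None


# Placeholder assumptions, NOT figures from this article, except annual_upgrade.
build = CostModel(upfront=1_000_000, annual_run=500_000, annual_upgrade=70_000)
buy = CostModel(upfront=100_000, annual_run=300_000)

for year, (b, s) in enumerate(zip(build.cumulative(3), buy.cumulative(3)), start=1):
    print(f"Year {year}: build ${b:,.0f} vs. buy ${s:,.0f}")
print("Break-even year for build:", breakeven_year(build, buy))
```

Under these placeholder inputs the build option never catches up within the horizon, but the point of the sketch is the shape of the calculation, not the specific result. Your own staffing, licensing, and upgrade figures decide the outcome.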
2. Organizational complexity – Can you really out-technology your organization’s people and process?
A critical reader could cite exceptions to the above data, recognizing that a good anecdote challenges facts. The anecdote is often a great way to qualitatively disrupt quantitative analysis – typically prefaced with a “we’re different…” As a former DQ product owner, I welcome that challenge and probably acknowledge its truth, but I would posit that the sum of those anecdotes does not outweigh the greater cost-benefit analysis.
Perhaps we could consider some of the challenges relative to organizational complexity.
Given the above, one could think of many challenges that would put an asterisk on the build-vs-buy data. While these notions may hold some truth, the sum of many anecdotes and challenges does not outweigh the history of COTS prevailing – “economies of scale.” I would then leave you with one anecdote to challenge the previous.
We’re not different. Something broke – Yes, something will break in data quality. As you wait to make your decision or take the time to build your own DQ, things will keep breaking. I’ve seen DQ incidents tip into the 8 figures. So, while you weigh your decision, time is of the essence to deliver DQ. Do you really have time to wait for a team to build DQ when buying DQ has a proven faster time-to-market?
The data suggests one should strongly consider buying a DQ solution built by a DQ-focused company before building their own. This should not be an existential question for development teams, as there is plenty of customization data engineers can engage in to make healthy data pipelines. If you’re looking for a DQ solution that serves Engineers, Operations, and Business, check out Collibra Data Quality and Observability.
For more please check out this article on https://www.collibra.com/us/en/blog/bridge-to-nowhere-build-or-buy-a-data-quality-solution