Data is dead: why the data-driven enterprise is DOA
Have you seen the acres of glossy blurb on what the data-driven enterprise will look like in 2025, or what the Big Data Trends are in 2023? I'm done with it. It's a whirlwind of wishful thinking. Those of us who have our hands in the dirt (and our feet on the ground) know what I mean. Yes, data is growing and sure, we all have fantastic technology and boy, the possibilities are endless. But that doesn't mean it's going to work, does it? Fact. Reality.
"By 2025, smart workflows and seamless interactions between humans and machines will probably be as standard as the corporate balance sheet, and most employees will use data to optimize nearly every aspect of their work."
Sure. Sweet dreams. I'm not usually so blunt, but now I’m hurling the rock straight into the pond (should it be lake?)... just watch those CAPITAL LETTERS. This is why I think the data-driven enterprise is just more pie in the sky, at least if we carry on letting happen what we’re letting happen.
<rant>
1. Organisations’ data are a pile of shit
Virtually without exception, EVERY organisation is struggling with a huge data debt (ref: Ronald Damhof). This is in and of itself a huge problem, meaning that the preconditions simply do not exist to innovate or to do smart things with data.
In all the stories about AI, ML, data analytics or data science, there is one glaring omission: there is no recognition of businesses' actual situation regarding their data and its organisation. The assumption is that all the data is available, equipped with the right definitions, of the right quality level and neatly organised on Ikea shelves, just waiting to be efficiently used and analysed.
Wake up and smell the coffee. Talk to a random data scientist in your organisation, and they will tell you that they spend 80% of their time on data preparation. But hey, they’re working Agile and are just finishing their cleansing script for the next sprint review!
Not only is this a huge preconditional problem, but there is also a lack of priority, people, and resources to solve this structurally. Oh, I get that this kind of work is not sexy and it’s expensive; it also takes place under the hood, so it seems to the wider organisation like nothing’s happening. So we don’t do it, and we keep bouncing from solution to solution (“We are going to build a data lakehouse”), solving a problem or organisational issue in the short term while simultaneously, and ironically, adding to the total data debt in the long term.
"But it works, right?" I hear you cry. Yes, it works, but the data processing is built MacGyver-style with 30 meters of duct tape, is not scalable, is all in the heads of two engineers in some far-flung country, and the verification module runs locally on someone's laptop. In what way is this Agile? "But the code is nicely versioned in Git," an engineer assured me last week.
[…quietly leaves the room…]
How well it all works becomes clear when a change needs to be made or all the related data from one particular client needs to be pulled up. A question about the origin of a certain piece of data or where the data of a specific person is located often reveals imperfections in data management. In what way is this flexible? And NO: it’s NOT because of the technology you have in house - see paragraph 3.
IT’S... ONE... HELL... OF... A... MESS
2. Higher education produces the wrong professionals.
Let the comments come - there are sure to be a lot of people who have plenty to say about this.
In recent years, colleges and universities have massively invested in data training programs. Bachelors, Masters, and post-graduate courses have sprung up like mushrooms. But let's be precise: all this stuff is about data ANALYTICS and data SCIENCE. The primary focus of these courses is the USE of data. NOT how the data originates or should be managed. That's a huge difference.
Take a random curriculum and the majority of the courses relate to programming (Python, R, etc.), analytical techniques (ML, calculus, linear algebra, statistics), and sometimes ethics and privacy. If you’re lucky, half a day on data management & data governance.
But these are NOT the knowledge and skills needed to clean up the data debt and prevent it from happening in the future. Information and data modeling, systems theory, systems engineering, organization theory & design, design science, quality management, (predicate) logic & set theory, semantics & semiotics, time in databases etc. Crucial knowledge and skills, parts of which are not (let alone in combination) taught anymore in higher education. I could easily write up a whole curriculum for a Bachelor's and Master's program.
So we have a double problem: a huge data debt AND a declining inflow of new qualified people who can solve the escalating problems.
Hate to spoil the party but actually we have a TRIPLE problem. Because future managers and leaders are NOT taught what is needed EITHER. Look at the curriculum of, for example, MBA programs: Finance (loads of those), Legal, HR, Strategy, Entrepreneurship and, oh wait: Digital (meaning: tech). So it’s great we now have ‘tech savvy’ leaders and shit and they invent all new kinds of business models but those require data too, right?
The question whether an organisation should be data-driven (whatever the hell that means) is completely irrelevant. The world is digitizing and datafying at a staggering pace. Whether you want to be an analytics competitor or do smart things with data; the data is there, in your organisation. And it needs to be taken care of. Just like you need people and financial resources to run an organization, you also need data. For people (HR) and financial resources (Finance), this is completely institutionalised. Everyone understands that you need these business functions because otherwise you can't keep your organisation running: it is the cost of doing business. Nobody will ask for a business case for the Finance or HR function.
It's no different for data. Organisations can no longer afford not to manage it. In other words, setting up and managing data should become an organizational capability, just like Finance and HR. THAT is what they should be teaching the manager of the future at Harvard.
3. Technology is the problem, not the solution
Everybody - suppliers the loudest - shouts in unison that technology is not the solution and yet technology is precisely where almost every organisation looks first: "Is there a tool on the market?". Or the biggest knockdown argument of all: "What does Microsoft have?" Because the organisation has a Microsoft-unless policy. HELP!!! For this reason, organisations sometimes waste valuable time finding out that Azure Purview is really a *** product but has a ‘promising roadmap’ and the features needed are ‘planned for the spring release’. So we're still struggling.
"<Vendor X> is excellent at touting a roadmap for their products which never seem to materialise... at least in the timeframe they promise." — LinkedIn Group discussion
Bringing more or different technology into a data landscape that is "one big goat bucket" (dixit Roy Maassen) is asking for trouble. Technology is bricks but what you need is a new building or renovation. And therefore, skilled and careful design and architecture are necessary. That in turn requires in-depth knowledge of the data, how data flows and how data is used by applications. What is often lacking is the calm and the willingness to make this sufficiently transparent.
"But it takes so long..." Indeed, that's because it's ALREADY a mess and organisations just don't (want to) swallow the bitter pill. And so we just keep adding more and more technology. From first-hand experience, I know examples where literally tens of millions of euros have been wasted to set up a data lake ("because the data warehouse had really become legacy") only to find out that nothing has been solved. Hey, but finally we’ve got ourselves a 'modern data stack'.
Do not expect the industry to come up with a solution. They only benefit from selling new licenses or products, not from cleaning up your data shit (incoming angry vendors in 3, 2, 1 ...). Technology is needed to support work that is truly necessary, such as modeling or dealing with time and historical corrections (see point 2 above). However, the technology on offer is completely out of proportion to that need. The number of different database platforms (ever heard of Yugabyte?) is off the scale but there is no decent modeling software available in the cloud.
Sure, technology solves a lot of problems. Years ago, you had to buy memory cards and wait 3 months to put them into the server; now you just click in the cloud to add 16 GB of memory.
Unfortunately, the downside of rapid technological innovation is that many (proven) design and architecture patterns are ruthlessly discarded. Or more precisely, architecture and design activities are thrown away. Because of the emphasis Agile & DevOps-practices place on developing code instead of paper (meaning: design and architecture), the Holy Grail in realising solutions is to code till you die. The time organisations want or need to take to think about WHAT they want, WHY and HOW it should work, is shifting more and more to (even more) programming time. There is a fundamental flaw in this line of thinking. Implementing a corporate data foundation is infrastructural by nature. Replacing one technology with another to wear your keyboard down even faster is not gonna do it.
4. You're not Spotify, Booking or Adyen
Organisations often hold up younger, more apparently innovative organisations as shining examples of how they want to do things themselves. “If you don't change, you'll be out of business in 5 years” echoes endlessly in management ears. Thus grand visions emerge and huge change programs are initiated. But what often remains understated is the enormous change effort these ambitions imply and what they REALLY require of an organisation.
I have had the privilege of witnessing a relatively young (approximately 20 years old) organisation (between 500-1,000 employees) striving to become data-driven for years. In my opinion, they are doing it by the book, with all due attention to organisational change (ownership by management, training and education, cultural change, new functions, reorganisations, etc.). It...is...FUCKING...hard.
Becoming a data-driven enterprise requires an organization to dump its current business model, cannibalise existing revenue streams, radically change working methods or renew its entire workforce. OK, I may be exaggerating a bit, but this is what real change means. Doing it a little bit - or even worse: simultaneously - is not going to cut it.
Whether it is necessary for an organisation to become the new Spotify is first and foremost a strategic question, not a data question (see also point 5). But assuming this consideration has been made, facing the Monster of Change is unavoidable. Only then does the desired transition have any chance of success. Hiring 10 data scientists, setting up a Data Lab and purchasing Databricks won't make any difference (unless that was exactly your goal...).
5. Data is a means, not an end
We’re nearing a situation – or perhaps we’re already in it – where the use of data has become a goal in itself, rather than a means. The whole idea of doing smart things with data implies that data is seen as a hammer and every problem or opportunity as a nail. This greatly diverts attention from ongoing real world problems and societal issues.
Recently, someone from a large municipality spoke about the problem of homeless people and vagrants in the city. "Wouldn't it be nice if we had data on all of this?" In terms of statistics, maybe yes, but not - as it was intended - to tackle this issue operationally. Please! A neighbourhood police officer or enforcement officer who knows every street corner is surely much more effective. And a much quicker and easier source of information.
Not every organisation needs to become a data-driven organization, regardless of what is meant by that heady term. The importance of data in an organisation varies greatly. There is a whole spectrum between organisations where the primary product or service is data itself and those where data is only supportive.
Data is present in virtually any organisation. Managing this data is necessary in every case. But the fact that this data exists does not automatically mean the whole business model should be overturned to become data-driven. It reminds me a bit of the 2000s when organisations became nervous if they didn't have a data warehouse. "We're also going to build a data warehouse! Why? Because everybody else is!"
To truly assess how an organisation can use data as a means, it is necessary to have insight into the company's own operations, strategy and related objectives. This seems blindingly obvious, but I have experienced several times that this is not the case. And even if that insight is there, it must also be acted upon. This represents a big blind spot for many organisations. To be able to meaningfully assess the real contribution of data, insight is needed into how processes are currently running, why the organisation works the way it does and how certain arrangements have come into being. Only from those insights can the contribution of data be substantiated. The harsh reality is that organisations do not know or do not want to see the answer to these almost existential questions. Recognising this exposes a sensitive vulnerability.
So the elephant in the room is ignored and attention is focused on the Data Shop where everyone can now easily find their data...
It's about time organisations get back to reality as the next trend is just around the corner, and the next.... (ChatGPT anyone?). What is our mission? What are our core values? What is our value proposition or task? Who are our stakeholders? Only when these questions can be answered specifically and unambiguously may it be useful to take a look at that data stuff.
Ok, now what?
Nice story, Van Aerle. Now what? Good question. I definitely need another blog (and more) for that. But it would already be a step towards reality if there were more acknowledgement and recognition of the situation in which the profession of data management and the market find themselves now. Less focus on technology, a sense of reality, strategic thinking and recognition that data management is just work that requires a structural approach. If we can make progress on this, then the data-driven enterprise might really come.
</rant>