The fallacy of the generic AI startup
In his 2016 book "The Inevitable", Kevin Kelly, editor of Wired, wrote: "The business plans of the next 10,000 startups are easy to forecast: Take X and add AI." Five years later, we are surrounded by AI startups of different shapes and forms. Many of them fall into the category of AI for X, where X can be a specific business vertical or a functionality common to many verticals. They might very well become successful, but the topic of this article is the other AI startups, the generic business-to-business (B2B) AI startups, and why I believe few of them will succeed.
The AI value proposition
In order for a company to sell its products or services, it must have an appealing value proposition. Customers must be able to see how they can buy from the supplier, how they can integrate what they bought into their business, and how they can get a return on investment. Return on investment can materialise as increased revenue from new or better products. It can also materialise as decreased costs to produce the current line of products. In these days of AI hype, it is easy to imagine how AI technology can create new types of products or attractive new functionality for existing products. It is also easy to see how AI can cut costs, since AI technology can perform functions that we previously thought would require human intelligence. The cost of human labour is a large share of product development and manufacturing, so there is great potential to save money or free up human resources for more valuable work.
That brings us to the missing link: the ability to integrate a generic AI product into the business flow. Why would this be so difficult? After all, we have heard of so many companies making heavy use of AI in their products. Yet although many companies experiment with AI technology, sustainable return on investment from AI is still limited to a handful of highly technical companies. This high-tech elite is very data mature, which is the key to AI success. Almost all AI today is built on machine learning technology, and machine learning feeds on data. An organisation's ability to work efficiently with data is a prerequisite for integrating an AI product, but organisational ability cannot be packaged and sold.
Value from data
It has now become widely accepted that there is unrealised business potential in data. There are three main categories of data value extraction, illustrated in the sketch after the list:
- Analytics - collecting and refining data to improve human decision making.
- Data-fed functionality - collecting and refining data by traditional processing logic to feed functionality that improves products, e.g. search functionality, reporting, or information propagation to other user-facing systems.
- Machine learning (ML) - processing logic that has been trained from examples to perform tasks or make decisions where traditional logic is inadequate. When machine learning is sufficiently impressive to come close to human performance, we call it AI.
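To make the distinction concrete, below is a toy sketch in Python of how the same refined event data can feed all three categories; the dataset and column names are hypothetical.

```python
import pandas as pd

# Toy refined event data; columns and values are hypothetical.
events = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "action": ["view", "buy", "view", "view"],
    "price": [0.0, 25.0, 0.0, 0.0],
})

# 1. Analytics: an aggregate to support human decision making, e.g. on a dashboard.
revenue = events.loc[events["action"] == "buy", "price"].sum()

# 2. Data-fed functionality: a denormalised feed for a user-facing system,
#    e.g. documents to push to a search index.
search_docs = events[["user_id", "action"]].to_dict(orient="records")

# 3. Machine learning: a labelled training set derived from the same events.
training_set = events.assign(label=(events["action"] == "buy").astype(int))
```

The point of the sketch is that the expensive part, collecting and refining the events, is shared; only the final consumer differs.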
In data mature companies, these three activities are all developed on a data platform, an internal shared technical infrastructure to facilitate data processing. A data platform can have many capabilities, but almost every data platform includes a data lake for storing and sharing data, and data pipelines for scalable data processing.
When using a data platform to extract data value, the data goes through a refinement lifecycle:
1. Event generation - events of business interest are recorded in the source system where they occur.
2. Data collection - events or snapshots of business state are transported from source systems to the data platform, to be stored in the lake.
3. Cleaning - depending on the use case, the collected events may need to be sorted on creation time, cleaned, repaired, normalised, stripped of personal data, combined with other events, decorated with complementary information, etc.
4. Domain-specific refinement - low-level events are combined into business-level events, such as browsing sessions, shopping funnels, signup sequences, etc.
5. Feature extraction - business logic is applied to form business events of particular interest, e.g. abandoned shopping carts.
6. Application-specific processing - business logic specific to a particular application or functionality is applied.
7. Use-case-specific processing - in the final processing step, data is prepared and denormalised for a particular purpose.
8. Serving - prepared data is transported out of the data platform to systems suited for serving the results to internal or external users.
9. Feedback measurement and iteration - performance and quality metrics are collected throughout the data lifecycle, i.e. on ingested data, during processing, and in the data product user interface. These metrics are processed in complementary quality data pipelines and presented as guidance on how to evolve data-fed and machine learning functionality.
For machine learning product functionality, steps 6 and 7 involve model training. Model inference happens in step 7 or 8. Step 5, feature extraction, is often associated with machine learning, but it is applicable to any data refinement. Extracted features are tied to the business domain and reused between ML and non-ML processing.
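As a minimal illustration of where the ML-specific part sits, here is a toy sketch of steps 3-7 in Python; the function names and data are hypothetical, not a real pipeline framework.

```python
def clean(raw_events):                        # step 3: repair and filter
    return [e for e in raw_events if e.get("user_id") is not None]

def refine(events):                           # step 4: low-level events -> sessions
    sessions = {}
    for e in events:
        sessions.setdefault(e["user_id"], []).append(e["action"])
    return sessions

def extract_features(sessions):               # step 5: shared by ML and non-ML pipelines
    return {uid: {"n_views": actions.count("view"), "bought": "buy" in actions}
            for uid, actions in sessions.items()}

def score_churn(features):                    # steps 6-7: the only ML-specific stage;
    # a trained model would be applied here - this hand-written rule is a stand-in
    return {uid: f["n_views"] > 2 and not f["bought"] for uid, f in features.items()}

raw = [{"user_id": 1, "action": "view"}, {"user_id": 1, "action": "buy"},
       {"user_id": 2, "action": "view"}, {"user_id": None, "action": "view"}]
scores = score_churn(extract_features(refine(clean(raw))))
```

Replacing score_churn with an aggregation would turn the same chain into an analytics or data-fed pipeline, which is why extracted features can be reused across ML and non-ML processing.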
Although steps 6 and 7/8 might be complex, the effort required for these steps is small in comparison to the whole chain, even for machine learning products. The relationship between machine learning efforts and the surrounding activities is described in detail in Google's academic paper "Hidden Technical Debt in Machine Learning Systems" from 2015. The image below is a reproduction of a key illustration in that paper.
The gist of this image is that while machine learning is complex, the efforts spent in the surrounding systems dwarf the ML-specific efforts. For data pipelines that serve analytics or data-fed functionality, the efforts are distributed in a similar manner, but without the small, black ML box. Machine learning is in practice complex to develop and operate, and non-ML data efforts tend to yield higher return on investment. Therefore, in most data mature companies, a minority of data pipelines serve ML functionality, and a data product portfolio typically has the proportions of the illustration below.
A couple of years ago, IKEA provided a good example of how much customer value potential from data is left on the table, and of how much can be realised by simple means, without AI. Most data value comes from straightforward automation and from analytics, rather than from AI. This might eventually change, but outside the technical elite, that change lies far in the future.
The generic AI product value proposition
A generic AI product can help a customer primarily with the effort in the small, black ML box. In order to get value from a generic AI product, the other pieces of the puzzle have to come together through other means. One founder of a generic AI startup told me that he hoped that those could be automated, even though the box sizes illustrate what Google was unable to automate. An employee of another startup stated that the customer is expected to provide them.
So, for a generic AI product to provide value, customers must be capable of developing and operating steps 1, 2, 3, 4, 5, 8, and 9 above. They should, however, not themselves be able to solve steps 6 and 7 efficiently for AI functionality, and they should be willing to bring in a piece of new startup technology. Yet for other data functionality, they do need to be able to solve steps 6 and 7, or they will miss most data value opportunities. These requirements are so specific that such unicorn customers are very rare.
If a generic AI product addresses a non-technical customer audience, those customers are not capable of solving the other boxes. Some generic AI products instead target technical customers, but technical companies are able to solve steps 6 and 7 using major cloud provider products and open source components. With such competition, it is difficult to establish a unique selling point, and to charge enough money to sustain a business.
This is why I believe that most generic AI startups will fail to create return on investment for their customers. Some will manage to make a successful acquisition exit, but not live up to the expected customer impact. If we as entrepreneurs with suitable skills cannot have impact by creating a generic AI startup, what could we do instead?
Valuable AI product propositions
There are many companies out there with great unrealised potential in their data, but who cannot or should not recruit and build internal capability to extract the value. There are also skilled entrepreneurs and engineers who are veterans of data mature companies and are capable of extracting data value, but do not want to work for the companies that hold the raw, valuable data. What could a B2B AI startup do to be valuable to its customers?
In order to answer that question, it is important to understand that very few companies are capable of handling the full data value chain with an efficiency anywhere near the level of the technical elite. The width of this gap is underestimated on both sides of it. I have spent years crossing the chasm back and forth between different companies. In Scandinavia, almost all companies still have years of work ahead of them before they reach the data maturity level where Spotify was when I joined the company in 2013. The book Accelerate by Forsgren et al. illustrates a similar efficiency gap in the realm of DevOps, with measured quantitative differences of ~100x between the elite and the trailing companies. In my experience, the span of DataOps differences is similar.
Considering this chasm and looking at the full data value chain of steps 1-9, the answer stands out: B2B data and AI startups can contribute by taking responsibility for the whole data value chain.
Specialising in a business vertical or a functional niche is one plausible strategy for taking responsibility for the full value chain. As a niche provider, you can make assumptions about the context. For example, an AI product for e-commerce can integrate with Shopify, Salesforce, Google Analytics, or other data sources likely to be used by its target audience. I expect that many specialised B2B full value chain AI startups will be successful.
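For illustration, a sketch of what baking such context assumptions into an ingestion layer could look like; the connector classes and stub fetchers below are hypothetical placeholders, not real Shopify or Google Analytics client APIs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Connector:
    """Hypothetical ingestion connector for a known e-commerce data source."""
    name: str
    fetch: Callable[[], List[dict]]  # pulls raw events from the source API

# Because the target audience is known, the supported sources can be
# enumerated up front instead of being left as an integration exercise
# for the customer. The fetchers here are stubs.
CONNECTORS: Dict[str, Connector] = {
    "shopify": Connector("shopify", fetch=lambda: []),
    "google_analytics": Connector("google_analytics", fetch=lambda: []),
}

def ingest(source_name: str) -> List[dict]:
    return CONNECTORS[source_name].fetch()
```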
A generic full value chain proposition is more difficult to deliver. AI/ML value generation is entwined with data value generation. An ML-only solution leaves the customer with a more pressing need for a generic data solution, which would have significant functional overlap with the ML-only solution. For technical customers, there are comprehensive data platforms from the cloud providers and from the startups of the early big data era, e.g. Cloudera and Databricks. Competing directly with these is possible, but requires great momentum and a critical mass of skills that are in low supply and high demand. The ML-focused end-to-end platforms that do exist, e.g. Kubeflow and MLflow, are parts of larger platform ecosystems.
It is possible for a startup to help technical customers by supplying an isolated component in the data value chain. There are many areas where innovation would be welcome: workflow orchestration, monitoring, quality control, governance, etc. These components are typically not ML-specific, however. Today, such startups often brand themselves as AI-something, but they solve generic data pipeline management problems, equally applicable to reporting and to deep learning.
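As an example of how little ML there is in such a component, here is a minimal data quality gate in plain Python (names are hypothetical); nothing in it cares whether the downstream consumer is a report or a deep learning model.

```python
def check_dataset(records, required_fields, min_rows):
    """Minimal quality gate: verify volume and completeness before
    letting downstream consumers, reports and ML training alike, read it."""
    if len(records) < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {len(records)}")
    for i, record in enumerate(records):
        missing = [f for f in required_fields if record.get(f) is None]
        if missing:
            raise ValueError(f"row {i} is missing fields: {missing}")
    return records

# The same gate can guard a reporting pipeline and an ML training pipeline.
rows = check_dataset([{"user_id": 1, "price": 25.0}],
                     required_fields=["user_id", "price"], min_rows=1)
```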
Buying data and AI solutions
Let us shift the perspective and put ourselves in the shoes of a non-technical or semi-technical company that would like to tap into the value of data and AI. What are our options?
- We can stick with the vendors and the workflows that we already have, and take what they offer that matches the way we are used to working. This will anchor us among the trailing companies, one or two decades behind the elite. Many trailing companies will survive, but they run the risk of disruption by data-driven companies.
- We can choose one or more vendors that create data or AI value for our business vertical or in niches that match our business. This is likely to be effective where there is a good match. The data will be in silos, however. Combining data across the silos will be difficult, and there is no easy solution for use cases outside them.
- We can build a comprehensive data platform and data flows, buying components from cloud vendors or data vendors. But the components are building blocks, and they require significant technical competence to connect, customise, and operate in order to deliver valuable data refinement, as the sketch below hints.
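To hint at what that glue looks like, here is a toy Apache Airflow DAG, one popular open source orchestrator, with placeholder shell commands standing in for vendor components. Even at this trivial scale, the wiring, scheduling, and monitoring are the customer's code to write and operate.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Toy DAG: each command is a placeholder for a vendor component
# (ingestion service, processing engine, serving export) that the
# platform team must wire together, schedule, and monitor.
with DAG(
    dag_id="daily_data_refinement",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(task_id="ingest", bash_command="echo 'pull from sources'")
    refine = BashOperator(task_id="refine", bash_command="echo 'run processing job'")
    serve = BashOperator(task_id="serve", bash_command="echo 'export to serving store'")

    ingest >> refine >> serve
```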
The technical elite companies have all chosen the last strategy, the data platform. It enables data democratisation and maximises data innovation ability. For companies that have the competence, it is the natural choice. It is a high-risk undertaking, however, and the list of failed enterprise Hadoop and data lake projects is long. While technical competence can be recruited, adopting new patterns of team organisation and collaboration is often an insurmountable challenge. Engaging consultants to build the platform is unfortunately also a risk-prone undertaking. Data refinement can only be learnt at product companies, which have real, live production data. With time, big data competence will migrate through osmosis, but in Europe, there has not yet been sufficient movement of people from the technical elite companies to consultancies to satisfy the need.
For companies outside the technical elite, there is no clear or easy path. I have seen and helped many companies make progress, but the large gap in efficiency and innovation capability between the data leaders and the bulk of companies is not closing.
We need new collaboration models
What could we do then? I am convinced that we need to find new ways for the companies with data potential and the niche startups with the competence to release that potential to collaborate. I founded Scling to explore a better collaboration paradigm. Scling provides data value extraction as a service, where we work in collaboration with customers to generate business value from their data. We provide a technical data platform, but also the workflows and data product development tactics that we know from experience to work.
Scling manifests one new collaboration model, but as a community, we need to explore other ways to slice the problem as well, beyond the traditional consultancy, platform, or product models. I would like to mention DataKitchen as another example of a company trying out a novel collaboration model, similar to Scling's, yet different. The common factors are a strong focus on DataOps and a clear value proposition. I believe that a clear and realistic value proposition is exactly where the generic AI startups fall short: they expect too much from their customers. I hope to see more startups with new, experimental, value-oriented collaboration models in the future.