Are You Ready for Data Marketplaces?
Here's an interesting question. What percentage of the data you consume at work every day originates outside of your organisation? The chances are it's a significant amount, comprising a combination of your customers' data, vendor data, data from your environment, and of course some of your original data as well.
I'm sure by now you've had your fill of hearing about data - big data, data lakes, puddles, oceans, and all forms of analytics and insights - from historical to real-time to predictive and prescriptive. But here's the kicker - it is highly probable that all of that focus is on enterprise data. We're here to talk about something else today: how data moves between organisations.
There are three typical features of how businesses exchange data. Think about the data exchange between a supermarket and a major consumer goods brand, or even an automotive OEM and a tyre manufacturer. First, there's usually an existing commercial contract underpinning the relationship. The data-sharing supports an existing exchange of goods and services. Second, this is usually a bilateral arrangement - both the contract and the data exchange. And third, the data is clearly known and defined: for example, product or service specifications, stock position, invoices, orders, rate cards, and so on. But what if none of these three conditions applied?
A perfectly good scenario to consider is scope 3 emissions reporting. For scope 3, every organisation has to be able to report on its upstream and downstream activities, which means its vendors and clients. Just think about this for a second: GSK has 36,000 suppliers according to their annual report. Sainsbury's has over two thousand suppliers. All of these suppliers have to provide their emissions data to customers such as GSK and Sainsbury's. In fact, quite a few suppliers may be common to GSK and Sainsbury's - technology providers, telecom providers, and many others. So we have a many-to-many scenario, rather than a bilateral model. The next feature is that no agreement to share this data currently exists, so the rules or constraints governing the sharing or consuming of this data haven't been defined yet. And finally, we don't even know at present which data will be relevant in future, or whether new regulations, technologies, or data sets will emerge and change the landscape. You can see why the traditional approach of defined, contracted, and bilateral data exchange doesn't work here.
What these kinds of environments need are data marketplaces. A marketplace has the ability to support discovery, description, negotiation, contract, and transfer of the commodity - in this case, data. The marketplace supports the scenario where you don't necessarily know exactly what you're looking for. To take a trivial example, imagine that you're cooking dinner for friends and you have a menu in mind. You could just go to a website and order the specific items you want. Or maybe you haven't made up your mind about whether to use fruit in the dessert, so you wander to the market, discover that fresh blueberries are available, and it strikes you that a blueberry pie might be a good addition to the menu. The marketplace model similarly allows you to discover datasets that others have made available, which you weren't necessarily looking for but which can be valuable. As data proliferates in almost every business, the external value of that data will become much more visible in a marketplace model. Here's a good example:
Henning, a fantasy football buff from Norway, spent a lot of time constructing a list of 'insiders' for each club. These were all people from each club who were playing the fantasy football game, and they included players and non-playing staff. What Henning had figured out was that if a player was injured, the first people to react by dropping him from their fantasy football team would be the insiders from his club. And so Henning was able to disclose with high accuracy which players were injured long before clubs announced their squad for the weekend. This was causing football clubs a lot of problems - i.e. their closely guarded squad news for actual games was being leaked, even though Henning was just doing it for the fantasy league. This is an excellent example of the externalities of data (and also, of course, of insider trading!). But these data externalities exist everywhere. Your business energy consumption may well be a surrogate for your financial performance. There may well be hundreds of data sets out there which might be extremely useful for your business - but you don't know that they exist, what they might cost, or how to access them. What you need is a catalog - which is really a version of a market.
From a user's point of view, there are some more perspectives to think about. Do you need to query the data set for a specific point? Or do you need the whole data set? For example, you might want to check the weather on a Sunday morning before leaving home, if you just want to know what to wear. On the other hand, you might be designing smart weatherproof clothing, so you want the entire data set of the past 5 years of weather data. In the first instance, you can just query the data set. In the second, you need access to the whole set. Also, is the data highly variable (e.g. stock position, weather in London), semi-permanent (e.g. train timetables), or reasonably permanent (say, the data from the periodic table)?
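To make these distinctions concrete, here is a minimal sketch of what a catalog entry might record, written in Python. Everything in it is an assumption made for illustration: the names Listing, AccessMode, Volatility and discover are hypothetical, as are the sample data sets, and none of it describes how any actual marketplace product works.

```python
from dataclasses import dataclass
from enum import Enum


class AccessMode(Enum):
    POINT_QUERY = "point query"   # ask for one value, e.g. today's weather
    BULK = "whole data set"       # take everything, e.g. 5 years of weather


class Volatility(Enum):
    HIGHLY_VARIABLE = "highly variable"  # e.g. stock position, live weather
    SEMI_PERMANENT = "semi-permanent"    # e.g. train timetables
    PERMANENT = "reasonably permanent"   # e.g. the periodic table


@dataclass(frozen=True)
class Listing:
    """One hypothetical catalog entry describing a data set on offer."""
    name: str
    description: str
    volatility: Volatility
    access_modes: frozenset


def discover(catalog, keyword, mode):
    """Names of listings that mention the keyword and offer the access mode."""
    keyword = keyword.lower()
    return [
        item.name
        for item in catalog
        if keyword in item.description.lower() and mode in item.access_modes
    ]


# Illustrative sample entries only; these data sets are made up.
CATALOG = [
    Listing("london-weather-live", "Current weather observations for London",
            Volatility.HIGHLY_VARIABLE, frozenset({AccessMode.POINT_QUERY})),
    Listing("london-weather-archive", "Five years of historical weather for London",
            Volatility.PERMANENT, frozenset({AccessMode.BULK})),
    Listing("train-timetables", "Published train timetables",
            Volatility.SEMI_PERMANENT,
            frozenset({AccessMode.POINT_QUERY, AccessMode.BULK})),
]

# The Sunday-morning check needs a point query; the clothing designer needs bulk.
print(discover(CATALOG, "weather", AccessMode.POINT_QUERY))
print(discover(CATALOG, "weather", AccessMode.BULK))
```

The point of the sketch is simply that the same search term leads to different data sets depending on whether you want a single answer or the whole history, so a catalog has to describe how data can be accessed and how quickly it changes, not just what it contains.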
We spoke about scope 3 emissions earlier - what are the other areas where you could see a data marketplace? Actually, we think there are any number of domains. The automotive industry is a particularly interesting area, with manufacturers, service providers, secondhand sellers, insurers, and civic authorities all participants in the marketplace. What about city data marketplaces, where all urban data can be pooled and collected? This is already somewhat enabled through the London Datastore. And what about macroeconomics? A while ago, I read an article (which I just can't find any more!) about a firm that was tracking the US economy through the sale of bottles. Why bottles? Well, these were bottles supplied for drug testing, typically used by firms when they put new hires through a test. So the bottle sales became a useful surrogate for hiring volumes, which was a lead indicator for the economy. And it's this characteristic of data - to transcend industries and meaning - that makes marketplaces especially useful. Other potential areas include healthcare (NHS just launched a Data Register), natural resources, and research - especially in computational biology.
My colleagues at TCS research have actually created a data marketplace product (DeXAM), which allows the setting up of a data exchange with discovery, description, pricing, and transactions. We just finished a proof of concept for this with a government agency in the UK, in a very interesting area around domestic pet safety. And I have no doubt that as ecosystems proliferate, the need for such marketplaces as foundational elements of ecosystems will surge.
Cast your mind over the past 24 months. We have just been through an unprecedented spike in research around Covid-19, epidemiology, RNA, and vaccination. Masses of new data have been created and so much is still left to learn. Scientists are still grappling with what questions to ask. Imagine in this environment a marketplace for Covid-19 related data, so that a research team from a pharmaceutical company that has gathered data about the efficacy of vaccines on under-18s can share this data, and a government agency looking to create a policy for school opening can access this information to validate its own modelling on the impact of keeping schools open (or closed). In last week's newsletter, we talked about Nils Bohlin's seatbelt invention and the decision to give the design away so everybody could be safe. The same can apply to the data being collected all over the world. Making it available for the common good (or even at a price) will greatly speed up the rate at which collaboration can take place across the international ecosystem of healthcare and public agencies. And this can only happen through a marketplace model for data.
And maybe the ability to commercialise datasets will pave the way for us to also start monetising our personal data and create a more equitable distribution of the value of data which currently accrues only to advertisers and data gathering intermediaries.
Adapted from the IEX newsletter, which goes out every Wednesday.