Why the Data Marketplaces of the Future Will Sell Insights, Not Data
As director of MIT’s Institute for Data, Systems, and Society (IDSS), Munther Dahleh oversees multiple research projects that use AI and data science to tackle challenges in energy, finance, health care, and urban environments. He views data as an important asset across many domains, including business, but argues that we’re lacking sound ways to verify data’s value.
Dahleh, who is also the William A. Coolidge Professor of Electrical Engineering and Computer Science at MIT, says the technology for a better data marketplace is already under development. IDSS is creating a cloud-based “Data Lab” that will store data sets securely and offer data processing services to individuals who want to make use of those data sets. This same Data Lab architecture can be used for future commercial data marketplaces, Dahleh says.
MIT Sloan Management Review spoke with Dahleh about why current data marketplaces fall short, how a more efficient one would work, why businesses may overestimate the value of their big-data caches, and how the financial value of specific business insights will determine what a collection of data is really worth. MIT SMR’s Elizabeth Heichler conducted the interview, and what follows is an edited and condensed version of their conversation.
MIT Sloan Management Review: You’ve said that the current market for data is inadequate — that pricing is illogical, and there’s no verification that particular data has value. What do you consider the hallmarks of an efficient market for data?
Dahleh: A market has two sides — buyers and sellers. Prices are decided based on demand and supply, an equilibrium of some sort. If the market is designed well, then you trust the pricing of the market.
Today, there’s no data market that operates like the stock market, or the online ads market, based on auction strategies. And for this to happen, people have to buy into the market, and share their data, and trust that the market maker will actually compensate them appropriately for that data.
How would a business that needs data for decision-making use the data marketplace you envision?
Dahleh: Here’s an example: You might come into the market because you’re a retailer and you’re trying to predict your inventory. You describe the prediction you need, choose the prediction algorithms you want to use, and place a bid. The market evaluates your request and informs you what relevant data is available. For the amount you bid, it tells you what data you have access to and provides you with the prediction you requested.
So, if I am that retailer, I am buying the inventory prediction, not getting the actual data used for that prediction?
Dahleh: Correct. Buyers get the benefit of the data — they don’t get the data. If you want to use that data for other analytics, you go into the market again to bid for that.
How does selling insights such as predictions, rather than selling the data, result in a fairer valuation for data?
Dahleh: Right now, companies that are getting into the business of trading data are buying and selling data sets at a fixed price. Buyers don’t know how much financial value the data is going to provide, or if it will provide value at all. If you don’t know what value a data set will provide, I can’t tell you how much you should pay for it.
But we can price the value of a prediction. Getting back to my retail example, if you’re trying to predict your inventory, every percentage-point improvement in prediction accuracy translates to a dollar value for you. If I tell you, “This is how many jeans I predict you’re going to sell,” you know exactly how that translates into your bottom line. You know exactly how much an accurate prediction is worth to you, and that determines what you’re willing to bid for it. When you make a fair bid that reflects the value of the data, the market can price the data appropriately.
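To make the arithmetic behind that bid concrete, here is a minimal sketch. It is an editorial illustration, not part of the interview. It assumes, purely for the example, that each percentage point of forecast-error reduction is worth a fixed dollar amount to the retailer; the function name, parameters, and figures are all hypothetical.

```python
def max_rational_bid(baseline_error_pct: float,
                     improved_error_pct: float,
                     dollars_per_point: float) -> float:
    """Upper bound on a sensible bid: the dollar value of the error reduction.

    Assumption (not from the interview): value is linear in percentage
    points of forecast error removed.
    """
    points_gained = baseline_error_pct - improved_error_pct
    # A prediction that is no better than the baseline is worth nothing.
    return max(points_gained, 0.0) * dollars_per_point


# Hypothetical retailer: forecast error drops from 12% to 8%, and each
# point of error is assumed to cost $25,000 in excess or missing stock.
bid_ceiling = max_rational_bid(12.0, 8.0, 25_000.0)
print(bid_ceiling)  # 100000.0
```

Under these assumptions the retailer should bid no more than $100,000 for the prediction. A real valuation would depend on margins, stock-out costs, and how value scales with accuracy.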
What other advantages would a business gain by buying a prediction on this data marketplace rather than outright buying and owning a potentially useful data set?
Dahleh: The advantage is accuracy. A marketplace centered around the value of data incentivizes people to collect better and more useful data. The market will pay more to those whose data contributed more to the prediction task, so I’ll get the most out of selling data by contributing the best data sets. That does incentivize people to think about offering useful and important data for the appropriate prediction task.
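One way to picture “paying more to those whose data contributed more” is a leave-one-out split of the buyer’s payment. This is an editorial sketch. The interview does not specify the marketplace’s actual allocation rule, and leave-one-out is used here only as a simple stand-in. The seller names, accuracy figures, and function name are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List


def split_payment(sellers: List[str],
                  accuracy: Callable[[FrozenSet[str]], float],
                  payment: float) -> Dict[str, float]:
    """Divide `payment` in proportion to each seller's leave-one-out gain."""
    everyone = frozenset(sellers)
    full_score = accuracy(everyone)
    # Marginal contribution: how much accuracy is lost without this seller.
    marginal = {
        s: max(full_score - accuracy(everyone - {s}), 0) for s in sellers
    }
    total = sum(marginal.values())
    if total == 0:
        # No seller's data changed the prediction, so nobody is paid.
        return {s: 0.0 for s in sellers}
    return {s: payment * m / total for s, m in marginal.items()}


# Hypothetical prediction accuracy (in percent) for the seller subsets
# that a leave-one-out calculation needs.
TOY_ACCURACY = {
    frozenset({"A", "B", "C"}): 90,
    frozenset({"B", "C"}): 70,  # without A, accuracy drops 20 points
    frozenset({"A", "C"}): 85,  # without B, accuracy drops 5 points
    frozenset({"A", "B"}): 90,  # without C, nothing changes
}

payouts = split_payment(["A", "B", "C"], lambda s: TOY_ACCURACY[s], 1000.0)
print(payouts)  # {'A': 800.0, 'B': 200.0, 'C': 0.0}
```

In this toy case seller C, whose data adds nothing to the prediction, earns nothing, which is the incentive Dahleh describes: only data that improves the buyer’s prediction is rewarded.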
Today, a lot of companies are storing and saving tons and tons of data, and they hope that this data is going to translate into their bottom line. I think it is going to be a surprise to a lot of people that much of that data is just useless. It becomes stale too quickly to be effective.
How are you working toward realizing a commercial data marketplace?
Dahleh: We have developed a theoretical and computational framework for designing an online data marketplace. We are in the process of prototyping a test bed with a specific application, and plan to open it up so businesses can try it out and see whether it works. We’re considering various options for the application, and we’re collaborating with our industrial partners to set this up. The markets for data will be specific to domains, such as retail, transportation, or advertising.
We’re talking to companies about this to get their buy-in. You know, it took Google many years to establish the online ad market. This is not too much of a paradigm shift — I do think that we are moving toward adoption of data marketplaces.