We have disturbingly little idea how many of the algorithms that affect our lives actually work. We consume their output, knowing little about the ingredients and recipe. And as analytics affects more and more of our lives and organizations, we need more transparency. But this transparency may be a bitter pill for businesses to swallow.
In 1906, Upton Sinclair’s The Jungle described the oppressed life of immigrant workers, specifically those in the meatpacking industry in Chicago. Sinclair’s intent in portraying the working conditions of a powerless class may have been to inspire political change. However, the graphic depictions of unsanitary food preparation helped bring transparency to manufacturing processes through the story’s nauseating clarity. The book heavily influenced the creation of regulatory oversight through organizations that eventually became the U.S. Food and Drug Administration.
We might be similarly horrified if we knew what evils lurked in the hearts of business algorithms in use today.
Some examples are lesser evils. Google search is widely used, but details about the order (and inclusion) of pages in its results aren’t public. Credit scores directly affect our finances, but the specific algorithms used to calculate them are secret. And the use of analytics to create algorithms is spreading rapidly to judicial processes, advertising, hiring, and many other daily decisions.
But these are the oxymoronic obvious unknowns. There may be greater evils lurking beneath the surface. The internal operations of businesses have always been a bit murky to consumers. There are algorithms in use within organizations that we as consumers don’t know that we don’t know about — preferential treatments, pricing differences, service prioritization, routing sequences, internal ratings, and so on. There is little opportunity even to know these algorithms exist, much less the analytical results on which they are based.
It actually makes sense that we lack good ways to see how analytical results are produced. Companies want to protect their intellectual property — this is their secret sauce. Whatever advantage companies get from data does not come without effort. Given the considerable investment underlying that effort, companies would certainly be reluctant to give away their hard-earned insights embedded in algorithms. Why would they even consider it?
The difficulty, as in The Jungle, is that others consume what is produced.
With food, we would likely be reluctant to go back a century to the time of The Jungle, the era before food labeling and inspection processes were required. The bane of every school cafeteria is the dreaded “mystery meat.” We don’t like mystery in what we consume.
With analytics, we are in a Jungle scenario. Businesses create analytical results that affect our lives, but we don’t know much about the ingredients or recipe. What data is used? From where? How are the models created? What affects the resulting decision?
Some aspects of the mystery of what we are consuming fall squarely on our own plate; a lack of knowledge can stem from a lack of effort to understand analytics. While democratization of data is appealing, the new data republic is a meritocracy. Other aspects of the mystery, however, are currently unknowable. Businesses have little incentive to share information about the algorithms they use. As a result, they will not provide details without a change in incentives or awareness.
I’m certainly not advocating for new agencies to regulate and inspect all algorithms. Institutionalization brings with it outside influence, power struggles, and lobbying. There are risks in both over- and under-regulation. The point is that we’ve been in similar situations before and can learn from them. When it came to food and medicine, significant regulation — and the FDA — came about as a result of the lack of self-regulation by the companies producing either product. For algorithms, better self-regulation and transparency may preempt the same sort of government regulation that evolved in the food industry. At a minimum, from a perspective of self-interest, the long memory associated with cheap data storage indicates that business secrets won’t last forever anyway.
This is especially true with the rise of deep learning and artificial intelligence techniques that can be opaque even to their developers. A system that passes a Turing test, by definition, hides the details of how it works from those it interacts with. The lack of information about the analytical results is getting worse, not better.
Considerable effort goes into improving data quality. “Garbage in, garbage out” is frequently repeated. But while data may be dirty, algorithms are dirtier. With more transparency into the algorithms in use, we can have informed discussions about what may or may not be fair. We can collectively improve them. We can avoid the ones we are allergic to and patronize the businesses that are transparent about what their algorithms do and how they work.
A side effect of the insight into food processes was the collapse of the market for lemons; consumers wouldn’t purchase suspect ingredients or elixirs with dubious claims. Similarly, we’ll likely find that some businesses are covered-wagon sideshows selling snake oil, and we can knowledgeably avoid the results of their unfair or sneaky algorithms.
MIT Sloan Management Review