Want the Best Results From AI? Ask a Human
Companies of all kinds are adopting artificial intelligence (AI) and machine-learning systems at an accelerated pace. International Data Corp. (IDC) projects that shipments of AI software will grow by 50% per year and will reach $57.6 billion in 2021, up from $12 billion in 2017 and just $8 billion in 2016. AI is being applied to a range of tasks, including rating mortgage applications, spotting signs of trouble on power lines, and helping drivers navigate using location data from smartphones.
But companies are learning the hard way that developing and deploying AI and machine-learning systems is not like implementing a standard software program. What makes these programs so powerful, their ability to “learn” on their own, also makes them unpredictable and eminently capable of errors that can harm the business.
AI’s Challenge: It’s Susceptible to Learned Bias
We frequently hear stories of AI gone awry. For instance, lenders are grappling with AI systems that unintentionally “learn” to deny credit to residents of certain zip codes, which violates banking regulations against “redlining.” In Florida, a program used by a county’s criminal justice system flagged African Americans who were arrested as more likely to commit another crime than whites, even though the rate of reoffending is the same for both groups. Or consider an online translation program that, when asked to translate the phrase “She is a doctor, and he is a nanny” into Turkish and then translate it back to English, spits out: “He is a doctor, and she is a nanny.”
These bias-induced situations can have serious business consequences. When AI was being used in back-office applications, the chance of bias creeping in was limited, and so was the potential damage. Now AI is being used extensively both in management decision support and customer-facing applications. Companies risk damaging people’s reputations and lives, making strategic wrong turns, offending customers, and losing sales. And the cost of AI mistakes — whether they come from bias or flat-out error based on unreliable data or faulty algorithms — is rising.
The lesson here is that AI systems, for all their amazing powers, still need continuous human intervention to stay out of trouble and do their best work. Indeed, companies are finding that they get the most out of investments in AI and other automation tools when they think in terms of humans and machines working together, rather than dividing work between humans and machines and letting the machines operate independently.
When conventional software is installed, procedures and rules are set in stone by human developers. By contrast, an AI system develops its own rules from patterns in the data it is crunching. And, as some companies have learned through real-life situations, AI systems can jump to the wrong conclusions.
Three Guiding Principles for Successfully Adopting AI
Therefore, before diving into AI systems, companies should consider three principles that can greatly improve the chances for a successful outcome:
Principle 1: Humans and machines are in this together. Nowhere is human-machine collaboration more relevant than in installing and maintaining AI systems. Human assistance is needed to teach and monitor AI systems properly and keep them from drifting into dangerous territory over time. This is not a job for IT departments alone — it requires both technical expertise and business understanding.
Training and monitoring the ongoing performance of an AI system requires that employees who are experts in software collaborate with colleagues who rely on AI systems to do their work. As users of AI output, these colleagues are in a position to spot changes in how the program is performing and can act on any issues that arise. Likewise, while a road-mapping application might use AI to plot efficient driving routes, a human driver can override the system’s choices based on knowledge of rush hour patterns or road construction.
Principle 2: Teach with (a lot of) data. AI systems learn by finding patterns in training data through various algorithms. Typically, this is done with historical data and involves experimenting with different models. The trained models are statistically evaluated, and the best-performing model is selected to be deployed into production.
This means that AI has a lot to learn. For example, a business often needs to evaluate how the brand is doing on quality and service based on unstructured data like comments on Twitter, news stories, Facebook posts, online reviews, and the like. The model must be trained with real-time data, and programmers (or ordinary employees who have learned how to train AI systems) must teach it rules that the program would not pick up on its own. In particular, they would have to teach the system how to understand the true meaning and validity of consumer comments.
Machines, for instance, don’t understand sarcasm (although Israeli scientists say they have developed a program to identify sarcastic comments). Another challenge in accurately parsing user-generated content is that the same word must be interpreted differently depending on the context: “hot” would have a positive connotation in the context of food but could have a negative connotation in the context of how comfortable the restaurant was.
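The “hot” example can be made concrete with a small sketch. It assumes a hand-written lexicon keyed by word and topic, and it assumes the topic of each comment ("food" or "ambience") is already known; both the lexicon entries and the function names are hypothetical, chosen only to illustrate the kind of rule a human would have to teach the system.

```python
# Hypothetical lexicon: the same word scores differently depending on the topic.
ASPECT_LEXICON = {
    ("hot", "food"): 1,       # hot food is usually praise
    ("hot", "ambience"): -1,  # a hot dining room is usually a complaint
    ("cold", "food"): -1,
}

# Words whose sentiment does not depend on the topic (also hypothetical).
DEFAULT_LEXICON = {"great": 1, "terrible": -1}


def score_word(word: str, aspect: str) -> int:
    """Return +1, -1, or 0 for a word, preferring the topic-specific entry."""
    word = word.lower()
    if (word, aspect) in ASPECT_LEXICON:
        return ASPECT_LEXICON[(word, aspect)]
    return DEFAULT_LEXICON.get(word, 0)


def score_comment(comment: str, aspect: str) -> int:
    """Sum the word scores of a comment for a given topic."""
    words = comment.replace(",", " ").replace(".", " ").split()
    return sum(score_word(w, aspect) for w in words)


food_score = score_comment("The soup was hot", "food")
room_score = score_comment("The dining room was hot", "ambience")
```

Here the same word pushes the food comment toward positive and the ambience comment toward negative. A production system would more likely learn such distinctions from labeled examples than from a fixed table, but the table shows what the human trainer is supplying.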
Training can be labor-intensive up front, but with a well-structured methodology for developing unbiased training data, training time of the AI system can be reduced by 50%, according to Accenture’s internal research.
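The train, evaluate, and select loop described under Principle 2 can be sketched in a few lines. This is a minimal illustration, assuming a toy data set of (score, outcome) pairs and two deliberately simple candidate models; the data, the model names, and the helper functions are all hypothetical and stand in for whatever models a real team would experiment with.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Dict, List, Tuple

Example = Tuple[float, int]    # (feature value, label 0 or 1)
Model = Callable[[float], int]


def fit_majority(train: List[Example]) -> Model:
    """Baseline: always predict the most common label in the training data."""
    majority_label = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: majority_label


def fit_midpoint_threshold(train: List[Example]) -> Model:
    """Predict 1 above the midpoint between the two class means.

    Assumes label-1 examples tend to have higher feature values.
    """
    mean_neg = mean([x for x, label in train if label == 0])
    mean_pos = mean([x for x, label in train if label == 1])
    threshold = (mean_neg + mean_pos) / 2
    return lambda x: 1 if x >= threshold else 0


def accuracy(model: Model, data: List[Example]) -> float:
    """Fraction of examples the model labels correctly."""
    return sum(1 for x, y in data if model(x) == y) / len(data)


def select_best(candidates: Dict[str, Model], holdout: List[Example]) -> Tuple[str, float]:
    """Score every candidate on held-out data and return the top performer."""
    scores = {name: accuracy(model, holdout) for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical historical data, split into training and held-out halves.
historical = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 0), (0.55, 1),
              (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1), (0.95, 1)]
train, holdout = historical[::2], historical[1::2]

candidates = {
    "majority_baseline": fit_majority(train),
    "midpoint_threshold": fit_midpoint_threshold(train),
}
best_name, best_score = select_best(candidates, holdout)
```

Scoring on data the models did not see during training is the design choice that matters here: a model that merely memorizes historical data would otherwise look better than it is.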
Principle 3: Continually test the results. With AI programs, testing is not only critical prior to release, but also becomes an ongoing routine. Managers need to be confident that the system will deliver accurate results from a variety of data.
Traditional software testing is deterministic: you need to test only a finite number of scenarios. Once the program has been tested for all possible scenarios, it is guaranteed to work. But with AI and machine learning, you can’t predict every scenario. You must continually monitor and test the system to catch data biases as well as biases that develop in the algorithms that the programs use to make judgments.
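One way to picture this kind of ongoing monitoring is a rolling check on recent results. The sketch below is an assumption-laden illustration, not a prescribed method: the `AccuracyMonitor` class, its window size, and its accuracy floor are hypothetical, and it presumes the true outcome of each prediction eventually becomes known so the two can be compared.

```python
from collections import deque
from typing import Optional


class AccuracyMonitor:
    """Track accuracy over the most recent predictions (hypothetical helper)."""

    def __init__(self, window_size: int, min_accuracy: float) -> None:
        # deque with maxlen drops the oldest outcome once the window is full
        self.outcomes = deque(maxlen=window_size)
        self.min_accuracy = min_accuracy

    def record(self, predicted: int, actual: int) -> None:
        """Store whether one prediction matched the real outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> Optional[float]:
        """Accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self) -> bool:
        """Flag the system for human review once a full window falls below the floor."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.min_accuracy


monitor = AccuracyMonitor(window_size=4, min_accuracy=0.75)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_review()   # all four recent predictions were right

for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)
degraded = monitor.needs_review()  # two of the last four were wrong
```

The flag only signals that a person should look; deciding whether the cause is shifting data, a biased model, or noise remains a human judgment.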
You can test for data bias by using more than one set of data. For example, a loan-application system based on historical data will only perpetuate the biases inherent in that data, which will likely show that members of certain groups in the population have not qualified for loans. To correct for this bias, the system must be tested and retrained with additional data (Accenture has created an automated system for creating alternative scenarios to correct such biases). You can test for algorithmic bias the other way around, by running the same set of data through different algorithms. This is one way to make sure, for instance, that the algorithm that monitors consumer sentiment about your brand is working properly.
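Both checks, comparing outcomes across groups in the data and running the same data through different algorithms, can be sketched briefly. Everything in the example is hypothetical: the group labels, the loan records, the two toy sentiment rules, and the function names are assumptions made for illustration, and the sketch is not a description of any vendor's tooling.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def approval_rate_by_group(records: List[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of approved applications for each group in the data."""
    totals: Dict[str, int] = defaultdict(int)
    approved: Dict[str, int] = defaultdict(int)
    for group, was_approved in records:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disagreement_rate(model_a: Callable[[str], int],
                      model_b: Callable[[str], int],
                      inputs: List[str]) -> float:
    """Fraction of inputs on which two algorithms give different answers."""
    differing = sum(1 for text in inputs if model_a(text) != model_b(text))
    return differing / len(inputs)


# Data check: hypothetical historical loan decisions as (group, approved).
loans = [("A", True), ("A", True), ("A", True), ("A", False),
         ("B", True), ("B", False), ("B", False), ("B", False)]
rates = approval_rate_by_group(loans)


# Algorithm check: two toy sentiment rules applied to the same comments.
def keyword_rule(text: str) -> int:
    """1 if the comment contains 'good', else 0."""
    return 1 if "good" in text else 0


def negation_aware_rule(text: str) -> int:
    """Like keyword_rule, but treats 'not' as cancelling the praise."""
    return 1 if "good" in text and "not" not in text else 0


comments = ["good service", "not good", "slow", "good food"]
disagreement = disagreement_rate(keyword_rule, negation_aware_rule, comments)
```

A large gap between group approval rates, or a high disagreement rate between algorithms, does not prove bias on its own. It marks the cases a human reviewer should examine before the system is retrained.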
The New Normal: Teaching, Testing, and Working With End Users
Effectively deploying AI requires a new conception of how software is developed, installed, and maintained. Teaching, testing, and working with end users of AI output must become a way of life, enabling AI systems to continually operate more responsibly, accurately, and transparently — and allowing businesses to create collaborative and powerful new members of the workforce.