Since this week’s theme so far is data, let’s keep it going with a profile on Petasense, a startup that offers predictive analytics to industrial clients. Petasense was formed in 2014 with a plan to stop downtime at factories by improving plant owners’ ability to understand when their machines would fail. It built a Wi-Fi-connected vibration sensor that collects data from each machine and sends it up to the cloud for analysis.
The resulting analysis gets sent back to plant operators in the form of a health score for each machine. What Petasense's founders discovered was that avoiding downtime wasn't why companies were interested in the service. Instead, they wanted to use it to skip scheduled maintenance on equipment that didn't actually need it. Plant operators can now set a customized maintenance schedule for each machine, avoiding the downtime and cost that come with servicing a machine before it needs it.
What Petasense is doing isn’t new. GE has been touting its ability to take in data to predict failures for the last five or six years. Startups such as Augury also offer similar services, albeit by analyzing the sounds that machines make as opposed to their direct vibration. Really, the sense is that anyone with a fancy algorithm and access to data can come up with some way to predict the health of a given machine.
But Abhinav Khushraj, one of Petasense’s cofounders, begs to differ. He says that Petasense is different because fancy algorithms are one thing, but access to data is the essential thing. Petasense built its own vibration sensor so it could get clean data to populate its analytics efforts. Controlling the sensor gives Petasense the competitive edge, says Khushraj.
I want to believe this. I can see the value in having clean data and the ability to understand the specifics of the hardware collecting that data. However, I also know that new ways of getting data come along all the time with different incentives to use them. Petasense does make it incredibly easy to buy and deploy its vibration sensor, which goes a long way to assuaging my doubts about its customers finding a new source of vibration data.
The sensor costs between $400 and $600 and gets glued onto the equipment with industrial epoxy. Its battery lasts two years, and the sensor transmits data every three hours. If deployment is as simple as having someone walk around sticking a sensor onto every piece of equipment, that's not a difficult ask. It does assume the device can easily be put on a corporate network, though; because it uses Wi-Fi, things could get tricky.
Once the sensor is transmitting data, companies pay about $10 per month, per device, for the analytics. The whole service replaces what was typically one person who came around to collect vibration data from gear every month or so, plus the specialist that person sent the data to, who would use the readings to see if there was a problem.
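To make those economics concrete, here is a back-of-the-envelope sketch. The sensor price and subscription fee come from the figures above; the cost and frequency of a manual technician reading are hypothetical placeholders, not numbers from Petasense.

```python
# Back-of-the-envelope cost comparison for one monitored machine.
# Sensor price and subscription come from the article; the cost of a
# manual vibration-reading visit is a hypothetical assumption.
SENSOR_COST = 500          # one-time hardware cost, midpoint of $400-$600
SUBSCRIPTION = 10 * 12     # $10/month analytics fee, per year
MANUAL_VISIT = 75          # assumed cost of one technician reading
VISITS_PER_YEAR = 12       # readings roughly every month, per the article

first_year_sensor = SENSOR_COST + SUBSCRIPTION      # hardware + one year of analytics
first_year_manual = MANUAL_VISIT * VISITS_PER_YEAR  # a year of monthly visits

print(first_year_sensor)  # 620
print(first_year_manual)  # 900
```

Under these assumed numbers the sensor pays for itself in the first year, and subsequent years compare $120 of analytics against the full cost of manual visits.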
Obviously the sensor replaces those two people, but it also collects a lot more information than was previously possible, which presumably leads to better results. Petasense has customers in the utilities industry and customers who use it to monitor HVAC equipment in buildings.
We’re an AI company, so people always ask about our algorithms. If we could get a dollar for every time we’re asked which flavor of machine learning we use (convolutional neural nets, K-means, or whatever), we would never need another dollar of VC investment ever again.
But the truth is that algorithms are not the most important thing for building AI solutions — data is. Algorithms aren’t even #2. People in the trenches of machine learning know that once you have the data, it’s really all about “features.”
In machine learning parlance, features are the specific variables that are used as input to an algorithm. Features can be selections of raw values from input data, or can be values derived from that data. With the right features, almost any machine learning algorithm will find what you’re looking for. Without good features, none will. And that’s especially true for real world problems where data comes with lots of inherent noise and variation.
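As a minimal sketch of what derived features can look like in practice, the snippet below computes a few common summary values from one window of raw signal. The feature names and choices are illustrative only, not Reality AI's actual feature set.

```python
import numpy as np

def features(window, fs):
    """Derive a few single-number features from one window of raw signal.
    Each summarizes the window - the kind of derived value that becomes
    an input to a machine learning algorithm."""
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    return {
        "rms": float(np.sqrt(np.mean(window ** 2))),       # overall energy
        "peak_to_peak": float(window.max() - window.min()),
        "dominant_freq": float(freqs[spectrum.argmax()]),  # strongest tone
    }

fs = 1000                                # sample rate in Hz
t = np.arange(0, 1, 1 / fs)
sig = np.sin(2 * np.pi * 60 * t)         # a clean 60 Hz vibration tone
f = features(sig, fs)
print(f["dominant_freq"])                # 60.0
```

A classifier trained on three such numbers per window can already separate many signal classes that would be hopeless to distinguish from raw samples.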
My colleague Jeff (the other Reality AI co-founder) likes to use this example: Suppose I’m trying to detect when my wife comes home. I’ll take a sensor, point it at the doorway and collect data. To use machine learning on that data, I’ll need to identify a set of features that help distinguish my wife from anything else that the sensor might see. What would be the best feature to use? One that indicates, “There she is!” It would be perfect — one bit with complete predictive power. The machine learning task would be rendered trivial.
If only we could figure out how to compute better features directly from the underlying data… Deep Learning accomplishes this trick with layers of convolutional neural nets, but that carries a great deal of computational overhead. There are other ways.
At Reality AI, where our tools create classifiers and detectors based on high-sample-rate signal inputs (accelerometry, vibration, sound, electrical signals, etc.) that often have high levels of noise and natural variation, we focus on discovering features that deliver the greatest predictive power with the lowest computational overhead. Our tools follow a mathematical process for discovering optimized features from the data before worrying about the particulars of algorithms that will make decisions with those features. The closer our tools get to perfect features, the better the end results become. We need less data, use less training time, are more accurate, and require less processing power. It’s a very powerful method.
For an example, let’s look at feature selection in high-sample-rate (50 Hz on up) IoT signal data, like vibration or sound. In the signal processing world, the engineer’s go-to for feature selection is usually frequency analysis. The usual approach to machine learning on this kind of data would be to take a signal input, run a Fast Fourier Transform (FFT) on it, and consider the peaks in those frequency coefficients as inputs for a neural network or some other algorithm.
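That conventional FFT-peak pipeline fits in a few lines of NumPy. The signal and parameters below are invented for illustration.

```python
import numpy as np

def fft_peak_features(window, fs, k=5):
    """Conventional pipeline: FFT the window, then keep the k largest
    spectral peaks as (frequency, magnitude) rows to feed a downstream
    neural network or other classifier."""
    spectrum = np.abs(np.fft.rfft(window))
    freqs = np.fft.rfftfreq(len(window), d=1.0 / fs)
    top = np.argsort(spectrum)[-k:][::-1]   # indices of the k biggest bins
    return np.column_stack((freqs[top], spectrum[top]))

fs = 2000
t = np.arange(0, 1, 1 / fs)
sig = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
peaks = fft_peak_features(sig, fs, k=2)
# strongest peak at 50 Hz, the weaker tone at 120 Hz
```

Note that each window collapses to a handful of (frequency, magnitude) pairs; anything about *when* within the window a tone occurred is discarded, which is exactly the time-blurring problem discussed next.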
Why this approach? Probably because it’s convenient, since all the tools these engineers use support it. Probably because they understand it, since everyone learns the FFT in engineering school. And probably because it’s easy to explain, since the results are easily relatable back to the underlying physics. But the FFT rarely provides an optimal feature set, and it often blurs important time information that could be extremely useful for classification or detection in the underlying signals.
Take for example this early test comparing our optimized features to the FFT on a moderately complex, noisy group of signals. In the first graph below we show a time-frequency plot of FFT results on this particular signal input (this type of plot is called a spectrogram). The vertical axis is frequency, and the horizontal axis is time, over which the FFT is repeatedly computed for a specified window on the streaming signal. The colors are a heat-map, with the warmer colors indicating more energy in that particular frequency range.
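A spectrogram like the one in that first graph can be computed with SciPy. The synthetic hum-plus-chirp signal below is only a rough stand-in for the test signal described here, not the actual data.

```python
import numpy as np
from scipy.signal import spectrogram

fs = 8000
t = np.arange(0, 2, 1 / fs)
# a low background hum plus a rising chirp, loosely like the described signal
hum = 0.3 * np.sin(2 * np.pi * 100 * t)
chirp = np.sin(2 * np.pi * (200 + 400 * t) * t)   # frequency sweeps upward
sig = hum + chirp + 0.1 * np.random.randn(len(t)) # add some noise

# freqs: one row per frequency bin; times: one column per FFT window;
# Sxx holds the energy at each (frequency, time) cell - the heat-map values
freqs, times, Sxx = spectrogram(sig, fs=fs, nperseg=256)
print(Sxx.shape)   # (frequency bins, time windows)
```

Plotting `Sxx` with frequency on the vertical axis and time on the horizontal axis reproduces the kind of time-frequency heat map described above.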
Compare that chart to one showing optimized features for this particular classification problem generated using our methods. On this plot you can see what is happening with much greater resolution, and the facts become much easier to visualize. Looking at this chart it’s crystal clear that the underlying signal consists of a multi-tone low background hum accompanied by a series of escalating chirps, with a couple of other transient things going on. The information is de-blurred, noise is suppressed, and you don’t need to be a signal processing engineer to understand that the detection problem has just been made a whole lot easier.
There’s another key benefit to optimizing features from the get-go – the resulting classifier will be significantly more computationally efficient. Why is that important? It may not be if you have unlimited, free computing power at your disposal. But if you are looking to minimize processing charges, or are trying to embed your solution on the cheapest possible hardware target, it is critical. For embedded solutions, memory and clock cycles are likely to be your most precious resources, and spending time to get the features right is your best way to conserve them.
Deep Learning and Feature Discovery
At Reality AI, we have our own methods for discovering optimized features in signal data, but ours are not the only way.
As mentioned above, Deep Learning (DL) also discovers features, though they are rarely optimized. Still, DL approaches have been very successful with certain kinds of problems using signal data, including object recognition in images and speech recognition in sound. It can be a highly effective approach for a wide range of problems, but DL requires a great deal of training data, is not very computationally efficient, and can be difficult for a non-expert to use. Classifier accuracy often depends sensitively on a large number of configuration parameters, leading many of those who work with DL to focus heavily on tweaking previously used networks rather than on finding the best features for each new problem. Learning happens “automatically”, so why worry about it?
My co-founder Jeff (the mathematician) explains that DL is basically “a generalized non-linear function mapping – cool mathematics, but with a ridiculously slow convergence rate compared to almost any other method.” Our approach, on the other hand, is tuned to signals but delivers much faster convergence with less data. On applications for which Reality AI is a good fit, this kind of approach will be orders of magnitude more efficient than DL.
The very public successes of Deep Learning in products like Apple’s Siri, the Amazon Echo, and the image tagging features available on Google and Facebook have led the community to over-focus a little on the algorithm side of things. There has been a tremendous amount of exciting innovation in ML algorithms in and around Deep Learning. But let’s not forget the fundamentals. It’s really all about the features.
Even when faced with evidence that an algorithm will deliver better results than human judgment, we consistently choose to follow our own minds.
MIT Sloan Management Review editor in chief Paul Michelman sat down with Berkeley Dietvorst, assistant professor of marketing at the University of Chicago Booth School of Business, to discuss a phenomenon Dietvorst has studied in great detail. (See “Related Research.”) What follows is an edited and condensed version of their conversation.
MIT Sloan Management Review: What prompted you to investigate people’s acceptance or lack thereof of algorithms in decision-making?
Dietvorst: When I was a Ph.D. student, some of my favorite papers were old works by [the late psychology scholar and behavioral decision research expert] Robyn Dawes showing that algorithms outperform human experts at making certain types of predictions. The algorithms that Dawes was using were very simple and oftentimes not even calibrated properly.
A lot of others followed up Dawes’s work and showed that algorithms beat humans in many domains — in fact, in most of the domains that have been tested. There’s all this empirical work showing algorithms are the best alternative, but people still aren’t using them.
So we have this disconnect between what the evidence says people should do and what people are doing, and no one was researching why.
What’s an example of these simple algorithms that were already proving to be superior?
Dietvorst: One of the areas was predicting student performance during an admission review. Dawes built a simple model: Take four or five variables — GPA, test scores, etc. — assign them equal weight, average them on a numerical scale, and use that result as your prediction of how students will rank against each other in actual performance. That model — which doesn’t even try to determine the relative value of the different variables — significantly outperforms admissions experts in predicting a student’s performance.
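A Dawes-style "improper linear model" of this kind is simple enough to sketch directly. The applicant variables and values below are hypothetical, not from Dawes's actual data.

```python
import numpy as np

def dawes_score(applicants):
    """Dawes-style improper linear model: z-score each variable so they
    share a scale, weight them all equally, and average. The result ranks
    applicants against each other; no fitted weights are involved."""
    X = np.asarray(applicants, dtype=float)    # rows: applicants, cols: variables
    z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each column
    return z.mean(axis=1)                      # equal-weight average

# hypothetical applicants: [GPA, test score, essay rating]
apps = [[3.9, 1450, 8],
        [3.2, 1300, 6],
        [3.6, 1500, 7]]
scores = dawes_score(apps)
ranking = np.argsort(scores)[::-1]   # best applicant first
print(list(ranking))                 # [0, 2, 1]
```

The whole "model" is one standardization and one average, which is what makes its ability to beat expert judgment so striking.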
What were the experiments you conducted to try to get at the reasons we resist algorithms?
Dietvorst: We ran three sets of experiments.
For the first paper, we ran experiments where the participants’ job was to complete a forecasting task, and they were incentivized to perform well. The better they performed, the more money they would earn in each experiment. There were two stages: first a practice round — for both humans and algorithms — and then a stage where participants were paid based on the quality of their performance.
In the practice round, we manipulated what forecasts participants were exposed to. Some made their own forecasts and saw those of the algorithm. Some made only their own forecasts. Some saw only the algorithm’s results. Some saw neither. So each group had different information about how well each forecasting option had performed during the practice round.
For the second stage, participants could choose to forecast the results themselves or rely on the algorithm. The majority of participants who had not seen the algorithm’s results from the first round chose to use it in the second round. However, those people who had seen the algorithm’s results were significantly less likely to use it, even if it beat their own performance.
Once people had seen the algorithm perform and learned that it was imperfect, that it makes mistakes, they didn’t want to use it. But there wasn’t a similar effect for people’s own forecasts. Once I made a forecast and learned that I was imperfect, I wasn’t less likely to use my own forecast. We saw that effect only for the algorithm.
And for the second experiment?
Dietvorst: In the second paper, we tried to address the problem: How can we get people to use algorithms once they know that they’re imperfect?
We began with the same basic question for participants: human or algorithm? In these experiments, however, there was an additional twist. Some participants were given the choice between using the algorithm as it existed or not at all. Other participants, if they chose to use the algorithm, could make some adjustments to it.
We found that people were substantially more willing to use algorithms when they could tweak them, even if just a tiny amount. People may be unwilling to use imperfect algorithms as they exist — even when the algorithm’s performance has been demonstrated to be superior to their own — but if you give a person any freedom to apply their own judgment through small adjustments, they’re much more willing.
So those are the key findings from the first two papers I wrote with my coauthors Joe Simmons and Cade Massey. Following on those, I have a solo paper where I’m investigating more about why people weren’t willing to use algorithms once they learned that they’re imperfect.
Most people in my experiment used the human forecast by default, which positions the algorithm as an alternative. And the way they make the decision about whether or not to use the algorithm is by asking, “Will this algorithm meet my performance goal?” even if that goal is unrealistic for human forecasts, too. They don’t choose the algorithm if it won’t meet some lofty goal.
What they should more reasonably ask is, “Is this algorithm better than me?” — which it usually is. So people fail to ask the right question and end up holding the two options to different standards.
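The two decision rules Dietvorst contrasts can be captured in a toy sketch; the forecast-error numbers below are hypothetical, chosen only to show how the rules diverge.

```python
def picks_algorithm_goal_rule(algo_error, goal_error):
    """The rule people actually use: adopt the algorithm
    only if it meets my (possibly lofty) performance goal."""
    return algo_error <= goal_error

def picks_algorithm_relative_rule(algo_error, my_error):
    """The more reasonable rule: adopt the algorithm if it beats me."""
    return algo_error < my_error

# hypothetical forecast errors (lower is better)
my_error, algo_error, goal_error = 20.0, 15.0, 10.0

print(picks_algorithm_relative_rule(algo_error, my_error))  # True: it beats me
print(picks_algorithm_goal_rule(algo_error, goal_error))    # False: misses the lofty goal
```

Same algorithm, same errors, opposite decisions: holding the algorithm to the goal rather than to one's own performance is exactly the double standard described above.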
And to what do you attribute that?
Dietvorst: That’s an interesting question. I’m not sure how this decision process came about or why people are making the decision this way. And I’ve found it’s not actually unique to algorithms.
When choosing between two human forecasters, people do the same thing. If you assign one forecaster as their default and ask how well the other forecaster would have to perform for them to switch, people say the other forecaster would have to meet their performance goals, just as with the algorithm.
It seems like people are naturally making what I would call the wrong comparison.
So it’s kind of a switching cost?
Dietvorst: Not necessarily. The way I would think about a switching cost would be I’m used to using human judgment, so an algorithm has to perform X percent better or X points better than me, or a human, for me to switch to it, right?
But that’s not really how it works. People are comparing the alternative to their performance goal, rather than comparing the two options. So, the higher the performance goal I give you, the better you need the algorithm to perform in order to switch to it, even though your own performance is staying constant.
So it doesn’t seem like a switching cost, at least as we tend to think of the term.
What I find so interesting is that it’s not limited to comparing human and algorithmic judgment; it’s my current method versus a new method, irrelevant of whether that new method is human or technology.
Dietvorst: Yes, absolutely. That’s exactly what I’ve been finding.
I think one of the questions that’s going to come up is, “Well, what do I do about this? Is simple recognition of the bias enough to counter it?”
Dietvorst: If I can convince someone that the right question to ask is, “Does this algorithm outperform what you’re currently using?” instead of, “Does this algorithm meet some lofty performance goal?” and that person buys in and says, “Yes, you’re right, I should use algorithms that outperform what I’m currently doing,” then, yes, that would work. I don’t know how easy or hard it would be to get people to buy into that, though.
And in a larger organization, thousands of decisions are being made every day. Without this bias being known, there really isn’t an obvious corrective measure, is there?
Dietvorst: The studies I’ve done suggest a couple of adjustments that could reduce the bias.
People are deciding whether or not to use the algorithm by comparing it to the performance goal that they have. If you incentivize people to attempt to deliver performance much better than an algorithm has shown it’s capable of, it’s not so surprising that they ditch the algorithm to chase down that incentive with human judgment — even if it’s unrealistic they will achieve it.
If you lower their performance goal, the algorithm will be compared more favorably and people may be more likely to use it.
So the problem exists in situations where the goal itself is unreasonable.
Dietvorst: Yes, if you have some forecasting goal that is very hard to achieve and an algorithm hasn’t achieved it in the past, then you could see how it would make sense, in a certain way, for people not to use the algorithm. They’re pretty sure it’s not going to achieve the goal. So they use human judgment and end up performing even worse than the algorithm.
Presumably, we’re in an age now where the quality of algorithms is increasing — perhaps dramatically. I’m wondering whether this phenomenon will make our biases more or less pronounced. On the one hand, you could see the quality of algorithms catching up to people’s reference points. On the other, the reference point may continue to move as fast as, if not faster than, the algorithms improve.
Dietvorst: I agree: That could go either way. But I would like to push back a little bit on this idea that algorithms are really great. The literature shows that on average, when predicting human behavior, algorithms are about 10% to 15% better than humans. But humans are very bad at it. Algorithms are significantly better but nowhere near perfection. In many domains, I don’t see any way that they’re going to get close to perfection very soon.
There is a lot of uncertainty in the world that can’t be resolved or reduced — that is unknowable. When you roll a die, for example, you don’t know what number is going to come up until it happens. A lot of that type of aleatory uncertainty determines outcomes in the real world. Algorithms can’t eliminate that.
Suppose Google Maps is telling you the fastest route to a new place. It can’t predict if there’s going to be a giant accident right in front of you when you’re halfway there. And so, as long as there’s random error and there’s aleatory uncertainty that factors into a lot of these outcomes — which it does to a larger extent than people recognize — algorithms aren’t going to be perfect, and they aren’t really even going to be close to perfect. They’ll just be better than humans.
So what’s next? Is this an ongoing field of study for you?
Dietvorst: Absolutely. There’s a lot more to understand about how people think algorithms operate; what they think are the differences between algorithms and humans; and how that affects their use of algorithms. There’s still really interesting research to be done.
Every day, we hear about smart machines with new capabilities: computers that can outplay chess masters or are capable of processing natural language to answer increasingly complex questions; new cars that alert us when the driver in front of us hits the brakes, when we drift out of our designated lanes, or when a pedestrian suddenly steps off the curb. But how soon will it be before smart machines perform complex, multifaceted services such as looking out for our health?
In a recent article in The New Yorker, “A.I. Versus M.D.,” Siddhartha Mukherjee, a hematologist and oncologist at Columbia University Medical Center, describes the increasingly nuanced role computers are playing in cancer screening. Twenty years ago, Mukherjee notes, computers were used by diagnosticians to help identify suspicious patterns or waveforms and, later, to confirm a hypothesis. However, he writes, while the rate of biopsies increased, detections didn’t, and there was a jump in false positives.
More recent intelligent systems have used a computing strategy modeled after the brain, known as a “neural network,” which can “learn” how to diagnose illnesses. Mukherjee describes a 2015 study by Sebastian Thrun of Stanford University in which a smart machine was asked to classify 14,000 images that dermatologists had found to have abnormalities (either benign or cancerous). The system correctly diagnosed the problems 72% of the time, compared with 66% for two board-certified dermatologists. Then, in a related study, 21 dermatologists were asked to review a set of about 2,000 images for skin cancers. In all but a few cases, the machine did better at spotting cases of melanoma than the doctors did; what’s more, for reasons that aren’t clear, it learned to differentiate moles from cancers.
Applying similar capabilities to detect other illnesses early and accurately may not be far away. By monitoring a person’s speech patterns with a cellphone, for example, it may be possible to detect early signs of Alzheimer’s disease. Steering wheels with sensors to detect hesitations and tremors might identify potential cases of Parkinson’s disease. Similarly, researchers say, algorithms tracking patients’ heartbeats may identify cardiac issues before they show up in other ways. Patients concerned about skin lesions will be able to send images from their iPhones to robots, which over time will become more and more skilled at diagnosis.
So, what will this mean for specialists such as dermatologists or radiologists? At a recent conference on AI and machine learning at MIT’s Initiative on the Digital Economy in Cambridge, Massachusetts, 69% of attendees said they expected most medical images to be interpreted “primarily by machines” by 2020, with more than 95% expecting this to occur by 2030. Some 14% of the attendees said they expected most surgeries to be performed by machines by 2020, with 54% saying it would happen by 2030. (See “Expectations for Smart Machines in Medicine.”)
Mukherjee doesn’t think skilled medical specialists are at risk of being replaced by smart machines. For one thing, doctors (at least those with a good “bedside manner”) can provide a degree of explanation and interaction that algorithms will never be capable of offering. In the near term, at least, machines are likely to augment human capabilities. Yet the increasing capacity of learning machines poses a number of questions that apply not only to medicine but other fields as well. “As machines learn more and more,” Mukherjee wonders, “will humans learn less and less?” And who will train the technicians?
I write a lot about optimizing manufacturing, supply chains, and even employees using sensors and data analytics. While we are nowhere near an ultra-efficient society where every business process or even road trip is optimized, that is the ultimate goal.
As companies start to invest in technology to streamline their operations, I’m starting to question what the societal implications will be. So far, people are mostly thinking about this only as robots taking our jobs. But here’s the dirty secret of optimization: you optimize for just one specific goal.
For many businesses that goal is profit. This will eventually lead to the stark realization that optimizing for profit through automation puts the lie to any other corporate mission statement about serving customers, protecting the environment, or whatever else doesn’t directly affect the bottom line.
As we move into a digital world where decisions are increasingly made by machines, it becomes clear that we have to understand exactly what the purpose of every business process is. It also means that we have to find some way of factoring into our algorithms other values and elements such as work-life balance or protecting the earth.
Humans can hold two different and competing thoughts in their heads. For example, the highest goal of a business may be to make money, with a secondary goal of serving its customers. But computers, as we program them to optimize for specific outcomes, can’t handle that dichotomy. As every shred of waste is squeezed from a system through careful analysis of data and automation of operations, the ability to let two diverging goals both hold true will end.
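One way to factor other values into an optimization, as a sketch only, is to make the competing objectives explicit weights in a single score. The policies and numbers below are invented for illustration.

```python
def schedule_value(profit, worker_stability, env_cost,
                   w_stability=0.0, w_env=0.0):
    """Score a candidate business decision. With the default zero
    weights this optimizes for profit alone - the failure mode described
    above. Nonzero weights make the other values explicit trade-offs."""
    return profit + w_stability * worker_stability - w_env * env_cost

# two hypothetical scheduling policies
on_demand = dict(profit=100.0, worker_stability=2.0, env_cost=5.0)
fixed_shift = dict(profit=90.0, worker_stability=9.0, env_cost=5.0)

# a profit-only objective prefers on-demand scheduling...
assert schedule_value(**on_demand) > schedule_value(**fixed_shift)
# ...but putting any real weight on worker stability flips the choice
assert schedule_value(**fixed_shift, w_stability=3.0) > \
       schedule_value(**on_demand, w_stability=3.0)
```

The point is not the specific weights but that they must exist at all: an optimizer with no term for a value will sacrifice that value every time.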
I’m not saying that making money is evil, just that when we optimize that over protecting the environment or treating workers well, it can lead to problems. Right now, the humans in charge, be it executives or managers, step in to ensure at least some externalities matter.
Yet, I’m still stunned by the lengths that some companies will go to in order to push profits over people. A recent example of this can be found in the algorithmically precise software that scheduled workers at places like Ann Taylor and Victoria’s Secret on an as-needed basis.
While this practice cuts the waste of having an employee standing around doing nothing during a lull, it wreaks havoc on employees’ lives, making it impossible to schedule childcare or take a second job.
It doesn’t have to be scheduling software. It could be algorithmically determined quotas in a warehouse job that penalize employees for an off day. Examples are all around us, generally coming from companies trying to reduce their operational costs or those that aren’t shy about alienating workers.
In the future, absent such consideration, more companies will face this dilemma. At that point, they will have to make a concerted effort to factor human and other concerns into their algorithms, which may shave a bit off their profits. The alternative is to have regulations in place that put a cost on the things that matter to us as a society.