Research Reveals Possibly Fatal Consequences of Algorithmic Bias
Self-driving cars are supposed to make driving safer, but they may endanger the lives of certain groups. New Georgia Tech research suggests that pedestrians with darker skin may be more likely to get hit by self-driving cars than those with lighter skin.
The researchers tested machine learning (ML) object detection models to measure how well they detect people with different skin tones. Their results revealed the models were nearly 5 percent less likely to detect darker-skinned pedestrians.
This predictive imbalance remained regardless of how the researchers accounted for variables in the training dataset, such as time of day, partially blocked views of pedestrians, and pixel size of the person.
“Companies don’t want the public to know about any issues of inaccuracy, so consumers need to learn to ask a lot of questions,” said Jamie Morgenstern, School of Computer Science (SCS) assistant professor and the study’s lead author.
The prediction system is only one possible source of the inequity. The training data is another. The researchers used one of the most comprehensive publicly available self-driving car training datasets and wanted to determine if it represented all skin tones evenly. They classified the images using Fitzpatrick skin typing, a scale that classifies skin by how it responds to UV exposure, and found the dataset has roughly 3.5 times as many examples of people with lighter skin as of people with darker skin.
This imbalance might introduce problems because of how ML models are trained. A model learns by minimizing a loss function, which measures the gap between its predicted values and the actual values. The goal is to get the loss as small as possible, indicating the model fits the data well. Because the loss is typically averaged over all training examples, this approach is more accurate for larger subsets of the data, but it can minimize the value of smaller groups. In effect, the 3.5-to-1 imbalance made the results even more accurate for lighter-skinned pedestrians.
Despite the bias, Morgenstern remains optimistic. The team was able to correct for the inequity by reweighting the model's training so that smaller groups count for more.
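To make the idea concrete, here is a minimal sketch in Python of how an averaged loss can be dominated by the larger group and how reweighting changes that. It is an illustration only, not the researchers' code: the per-example loss values, the group labels, and the inverse-group-size weighting scheme are all assumptions chosen to mirror the 3.5-to-1 ratio described above.

```python
# Illustrative sketch only. The loss values and the weighting scheme are
# assumptions for demonstration, not figures or methods taken from the study.

def mean_loss(losses):
    """Plain average: every example counts equally, so large groups dominate."""
    return sum(losses) / len(losses)


def group_balanced_loss(losses, groups):
    """Weight each example by 1 / (number of groups * size of its group),
    so that every group contributes the same total weight to the loss."""
    counts = {}
    for g in groups:
        counts[g] = counts.get(g, 0) + 1
    n_groups = len(counts)
    weights = [1.0 / (n_groups * counts[g]) for g in groups]
    return sum(w * l for w, l in zip(weights, losses))


# Hypothetical data: 7 lighter-skinned and 2 darker-skinned examples (3.5 to 1).
# The model is assumed to do worse (higher loss) on the smaller group.
groups = ["lighter"] * 7 + ["darker"] * 2
losses = [0.2] * 7 + [0.6] * 2

unweighted = mean_loss(losses)                 # pulled toward the larger group
balanced = group_balanced_loss(losses, groups)  # each group counts for half

print(f"unweighted loss: {unweighted:.3f}")
print(f"balanced loss:   {balanced:.3f}")
```

With these made-up numbers, the plain average sits close to the larger group's loss, while the balanced version gives the smaller group's errors the same overall weight. A model trained to reduce the balanced loss therefore has more incentive to improve on the smaller group.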
The findings, published earlier this month, have attracted media coverage and some criticism. Much of the criticism stems from the fact that Morgenstern and her fellow researchers, School of Interactive Computing Assistant Professor Judy Hoffman and machine learning Ph.D. student Benjamin Wilson, were not able to investigate the ML models and training data actually used by the self-driving car industry, because those are not publicly available.
A bigger problem
This is not the first study to find that ML systems have varying predictive accuracy across demographic groups. Other researchers have found examples in the financial sector. Yet in many of these scenarios, developers won’t take responsibility, according to Morgenstern.
“Developers blame any biased outcomes of their system on biased historical trends, such as the fact that more loans were applied for and issued in whiter neighborhoods, or biased training data,” she said. “For example, if the training labels used for creditworthiness instead reflect only the decisions of lenders who are now known to have had higher predictive accuracy on white applicants.”
With self-driving cars, however, a system developer would have a harder time blaming object detection system bias on historical trends or behavior of certain demographic groups. This was what appealed to Morgenstern about this research.
“There is no capacity for arguing that historical behavior of some group should affect the trade-offs made by self-driving cars,” Morgenstern said. “No one deserves to be hit by a car.”