Research Reveals Possibly Fatal Consequences of Algorithmic Bias
Self-driving cars are supposed to make driving safer, but they may endanger the lives of certain groups. New Georgia Tech research suggests that pedestrians with darker skin may be more likely to get hit by self-driving cars than those with lighter skin.
The researchers tested machine learning (ML) object detection models to measure how well they could detect people with different skin tones. Their results revealed that the models were nearly 5 percent less likely to detect darker-skinned pedestrians.
This predictive imbalance remained regardless of how researchers accounted for variables in the training data set, such as time of day, partially blocked views of pedestrians, and pixel size of the person.
“Companies don’t want the public to know about any issues of inaccuracy, so consumers need to learn to ask a lot of questions,” said Jamie Morgenstern, School of Computer Science (SCS) assistant professor and the study’s lead author.
The prediction system is only one possible source of the inequity. The training data is another. The researchers used one of the most comprehensive publicly available self-driving car training datasets and wanted to determine if it represented all skin tones evenly. They classified the images using Fitzpatrick skin typing, a scale to predict UV sensitivity, and found the dataset has roughly 3.5 times as many examples of people with lighter skin.
This discrepancy might introduce problems because of how a loss function works. A loss function measures how far a model's predicted values are from the actual values, and a model learns by adjusting itself to make that number as small as possible. A small loss indicates the model fits the data well. Because the loss is typically totaled or averaged over all training examples, this approach favors accuracy on the larger subsets of the data and can undervalue smaller groups. In effect, the roughly 3.5-to-1 imbalance pushed the model to be even more accurate for lighter-skinned pedestrians.
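To make the effect of the imbalance concrete, here is a minimal sketch of how a summed loss is split between two groups of different sizes. The group sizes (35 and 10, mirroring the roughly 3.5-to-1 ratio), the per-example loss values, and the function name are illustrative assumptions, not figures or code from the study.

```python
def group_share_of_loss(group_losses):
    """Return each group's fraction of the total summed loss."""
    total = sum(sum(losses) for losses in group_losses.values())
    return {group: sum(losses) / total for group, losses in group_losses.items()}


# Hypothetical data: every example starts with the same loss of 1.0,
# but the lighter-skinned group has 3.5 times as many examples.
group_losses = {
    "lighter": [1.0] * 35,
    "darker": [1.0] * 10,
}

shares = group_share_of_loss(group_losses)
for group, share in shares.items():
    print(f"{group}: {share:.1%} of total loss")
```

Under these assumptions, the larger group accounts for about 78 percent of the quantity the model is trying to shrink, so improvements on that group reduce the total far more than equal improvements on the smaller group.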
Despite the bias, Morgenstern remains optimistic. The team was able to correct for the inequity by reweighting the model to give more emphasis to smaller groups.
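The article does not say which weighting scheme the team used. As one common possibility, the sketch below applies inverse-frequency weights, so that each group contributes equally to the average loss regardless of its size. The group labels, loss values, and function names are hypothetical and chosen only for illustration.

```python
from collections import Counter


def inverse_frequency_weights(groups):
    """Weight each example by total / (number_of_groups * size_of_its_group)."""
    counts = Counter(groups)
    total = len(groups)
    n_groups = len(counts)
    return [total / (n_groups * counts[g]) for g in groups]


def weighted_mean_loss(losses, weights):
    """Average of per-example losses, with each example scaled by its weight."""
    return sum(l * w for l, w in zip(losses, weights)) / sum(weights)


# Hypothetical data: 35 lighter-skinned examples with low loss (0.1)
# and 10 darker-skinned examples with higher loss (0.5).
groups = ["lighter"] * 35 + ["darker"] * 10
losses = [0.1] * 35 + [0.5] * 10

uniform = weighted_mean_loss(losses, [1.0] * len(losses))
balanced = weighted_mean_loss(losses, inverse_frequency_weights(groups))
print(f"unweighted mean loss: {uniform:.3f}")
print(f"reweighted mean loss: {balanced:.3f}")
```

With uniform weights, the poor performance on the smaller group is diluted by the larger group's low loss. With inverse-frequency weights, the reweighted figure equals the plain average of the two groups' mean losses, so a model trained to reduce it has more incentive to improve on the smaller group.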
The findings, published earlier this month, have attracted media coverage and some criticism. Much of the criticism stems from the fact that Morgenstern and her fellow researchers, School of Interactive Computing Assistant Professor Judy Hoffman and machine learning Ph.D. student Benjamin Wilson, were not able to investigate the ML models and training data actually used by the self-driving car industry because they are not publicly available.
A bigger problem
This is not the first study of ML systems having varying predictive accuracy on different demographics. Other researchers have found examples in the financial sector. Yet in many of these scenarios, developers won’t take responsibility, according to Morgenstern.
“Developers blame any biased outcomes of their system on biased historical trends, such as the fact that more loans were applied for and issued in whiter neighborhoods, or biased training data,” she said. “For example, if the training labels used for creditworthiness instead reflect only the decisions of lenders who are now known to have had higher predictive accuracy on white applicants.”
With self-driving cars, however, a system developer would have a harder time blaming object detection system bias on historical trends or behavior of certain demographic groups. This was what appealed to Morgenstern about this research.
“There is no capacity for arguing that historical behavior of some group should affect the trade-offs made by self-driving cars,” Morgenstern said. “No one deserves to be hit by a car.”