Artificial intelligence (AI) has the potential to play a role in predictive medicine, from prevention and diagnosis to treatment. Machine learning models have proved useful in diagnosing certain types of leukemia, and deep learning in detecting diabetic retinopathy. However, contrary to the expectation that AI would remove human bias, evidence has shown that it can amplify bias, and hence unfairness, against specific subpopulations.
The problem arises because AI programs learn from data: they learn differently depending on the datasets physicians or researchers use to train them.
A study published in Science Advances (open access) this week investigates bias in AI models used to predict the cognitive, behavioral, and psychiatric patterns that may characterize a disorder. Jingwei Li and collaborators examined whether white Americans and African Americans enjoyed similar predictive performance when the AI models were trained on state-of-the-art large-scale datasets containing neuroimaging and behavioral data.
Jingwei Li and collaborators note that major lines of research in neuroscience use datasets that mix multiple ethnicities/races—usually dominated by participants with “European ancestry and/or white ethnic/racial background”—while paying little attention to racial fairness.
This novel study used two large independent resting-state functional magnetic resonance imaging (fMRI) datasets drawn from the U.S. population. The researchers argue that restricting the analysis to U.S. participants minimizes cultural or socioeconomic differences, such as food preferences or school and health care systems.
How does functional magnetic resonance imaging work?
Functional magnetic resonance imaging (fMRI) is used in cognitive neuroscience, clinical psychiatry, and presurgical planning. It is popular because it is a non-invasive technique that provides high-resolution images with good contrast between different tissues. fMRI is designed to detect time-varying changes in brain metabolism. However, it does not detect neural activity directly; instead, it is sensitive to the blood-flow alterations that follow changes in neuronal activity.
So, how does it work? When a molecule is placed in a strong magnetic field, the nuclei of some of its atoms start to behave like tiny magnets, and different molecules respond to the field differently. In particular, an MRI scanner can detect whether the hemoglobin in your blood is rich in oxygen or not, because oxygen-rich and oxygen-poor hemoglobin have different magnetic properties. This blood-oxygenation contrast is what fMRI measures as a proxy for neural activity.
But while the physics is clear and the technique is reliable, fMRI does not provide a diagnosis by itself; it is still the physician who interprets the results and issues the diagnosis. In the neuroscience community, the expectation is that AI will help physicians detect cognitive and psychometric patterns more precisely, hopefully facilitating better diagnosis and treatment of mental disorders.
Unfairness in prediction of mental disorders
Jingwei Li and collaborators used the Human Connectome Project dataset, containing 948 individuals between 22 and 37 years old, and the Adolescent Brain Cognitive Development dataset, containing 5,351 individuals between 9 and 11 years old. Individuals in both datasets are from the U.S. and span several ethnic/racial groups, though both are heavily dominated by white Americans. The proportions are shown in figure 1.
They investigated predictions of several cognitive and behavioral measures, from visual episodic memory and sustained attention to grip strength, anger, and perceived rejection.
When the machine learning models were trained on the full datasets, or on samples dominated by white Americans, prediction errors were higher for African Americans than for white Americans. In contrast, the same models trained only on African Americans improved their performance for this population, although it remained lower than the performance of models trained only on white Americans.
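The study’s core comparison—train a model on a majority-dominated sample, then measure prediction error separately for each group—can be illustrated with a toy sketch. Everything below is illustrative and assumed, not the authors’ actual pipeline: the closed-form ridge model, the synthetic “connectivity” features, and the group sizes are all stand-ins chosen only to show why a pooled fit can favor the majority group.

```python
import numpy as np

rng = np.random.default_rng(0)

def ridge_fit(X, y, alpha=1.0):
    # Closed-form ridge regression: w = (X^T X + alpha*I)^-1 X^T y
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def group_error(w, X, y):
    # Mean absolute prediction error for one group
    return float(np.mean(np.abs(X @ w - y)))

# Synthetic stand-in for functional-connectivity features: a majority
# group (n=800) and a minority group (n=100) whose feature-to-behavior
# mapping differs slightly.
n_feat = 50
w_major = rng.normal(size=n_feat)
w_minor = w_major + rng.normal(scale=0.5, size=n_feat)  # shifted mapping

X_major = rng.normal(size=(800, n_feat))
y_major = X_major @ w_major + rng.normal(scale=0.1, size=800)
X_minor = rng.normal(size=(100, n_feat))
y_minor = X_minor @ w_minor + rng.normal(scale=0.1, size=100)

# Train on the pooled, majority-dominated sample
X_all = np.vstack([X_major, X_minor])
y_all = np.concatenate([y_major, y_minor])
w = ridge_fit(X_all, y_all)

err_major = group_error(w, X_major, y_major)
err_minor = group_error(w, X_minor, y_minor)
print(err_major, err_minor)
```

Because the pooled fit is dominated by the 800 majority-group samples, the learned weights sit much closer to the majority group’s mapping, so the minority group ends up with the larger error—the same qualitative pattern the study reports.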
Their results suggest that neuroimaging and behavioral data must be collected across a greater range of ethnic and racial groups. “Overall, the results point to the need for caution and further research regarding the application of current brain-behavior prediction models in minority populations,” they wrote in the paper.
So yes, AI models may be biased with respect to gender, race, ethnicity, and more. Making AI predictions precise while preventing discrimination against minorities still requires significant work.
Li, J., Bzdok, D., Chen, J., Tam, A., Ooi, L. Q. R., Holmes, A. J., Ge, T., Patil, K. R., Jabbi, M., Eickhoff, S. B., Yeo, B. T. T., & Genon, S. (2022). Cross-ethnicity/race generalization failure of behavioral prediction from resting-state functional connectivity. Science Advances, 8(11). https://doi.org/10.1126/sciadv.abj1812