Findings highlight the need to ensure AI tools used in mental healthcare are fair and equitable


A first-of-its-kind study led by researchers at the Centre for Addiction and Mental Health (CAMH) has found that artificial intelligence (AI) models used to predict aggressive incidents in acute psychiatric care can reinforce and amplify existing social and structural inequities by overestimating the likelihood of aggression among already marginalized groups. The findings, recently published in npj Mental Health Research, underscore the importance of careful evaluation to ensure AI tools don’t perpetuate harm in clinical settings but instead promote more equitable care.
“While the fairness of clinical AI tools has been evaluated in other areas, this study highlights a critical gap in mental healthcare, given that the assessments used to train AI models are often based on subjective observations shaped by underlying social and structural biases,” says Dr. Marta Maslej, Staff Scientist at the Krembil Centre for Neuroinformatics (KCNI) and senior co-author of the study. “If fairness is not built in, the clinical use of AI models can lead to significant distress, loss of trust, and even precipitate aggressive incidents that would otherwise not have occurred. There is a clear need to develop AI applications that centre and promote equity.”
Findings highlight the importance of fairness analysis
Several healthcare systems in the Netherlands, Switzerland, China, the US, and Canada have assessed or are considering the use of AI models to predict aggressive or violent behaviour to enable earlier intervention and targeted de-escalation. However, little research has examined whether these tools perform equitably across patient populations—particularly in psychiatry, where social and structural factors strongly shape care experiences.
To address this gap, the research team trained a machine learning (a form of AI) model on electronic health records from more than 17,000 CAMH inpatients and examined how prediction errors varied across intersecting social and demographic factors, including race, gender, and social context. The model showed clear bias, with higher false positive rates for Black and Middle Eastern individuals, men, patients admitted to emergency care by police, and those in unstable or supportive housing. These findings suggest that the model may disproportionately flag already over-surveilled or structurally disadvantaged groups as high risk, potentially shaping clinical decisions in ways that compound inequities.
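To make the kind of fairness audit described above concrete, the sketch below shows one common way to check for this disparity: computing the false positive rate (the share of patients who were not aggressive but were flagged as high risk) separately for each group. This is an illustrative example with hypothetical group labels and toy data, not the study's actual code or dataset.

```python
def false_positive_rate(y_true, y_pred):
    """FPR = false positives / all actual negatives."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    negatives = sum(1 for t in y_true if t == 0)
    return fp / negatives if negatives else 0.0

def fpr_by_group(records):
    """records: iterable of (group, actual_label, predicted_label).
    Returns a dict mapping each group to its false positive rate."""
    grouped = {}
    for group, actual, predicted in records:
        actuals, predictions = grouped.setdefault(group, ([], []))
        actuals.append(actual)
        predictions.append(predicted)
    return {g: false_positive_rate(a, p) for g, (a, p) in grouped.items()}

# Toy data: (group, was aggression observed?, did the model flag high risk?)
records = [
    ("A", 0, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 1), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),
]
rates = fpr_by_group(records)
# Here group B's non-aggressive patients are flagged twice as often as
# group A's (2/3 vs. 1/3) -- the kind of disparity the study reports.
print(rates)
```

An equitable model would show roughly equal false positive rates across groups; a gap like the one above means one group bears a disproportionate share of incorrect high-risk flags, which is what the CAMH team found for marginalized patients.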
