The risk of racial bias while tracking influenza-related content on social media using machine learning
Author(s) -
Brandon Lwowski,
Anthony Rios
Publication year - 2020
Publication title -
Journal of the American Medical Informatics Association
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.614
H-Index - 150
eISSN - 1527-974X
pISSN - 1067-5027
DOI - 10.1093/jamia/ocaa326
Subject(s) - machine learning, computer science, artificial intelligence, support vector machine, undersampling, extreme learning machine, social media, artificial neural network, task (project management), convolutional neural network, multi-task learning, natural language processing, world wide web, management, economics
OBJECTIVE
Machine learning is used to understand and track influenza-related content on social media. Because these systems are deployed at scale, they have the potential to adversely impact the people they are built to help. In this study, we explore the biases of different machine learning methods on the specific task of detecting influenza-related content, comparing the performance of each model on tweets written in Standard American English (SAE) vs African American English (AAE).
MATERIALS AND METHODS
Two influenza-related datasets are used to train 3 text classification models (support vector machine, convolutional neural network, bidirectional long short-term memory) with different feature sets. The datasets match real-world scenarios in which there is a large imbalance between SAE and AAE examples: the number of AAE examples for each class ranges from 2% to 5% in both datasets. We also evaluate each model's performance on a balanced dataset created via undersampling.
RESULTS
We find that all of the tested machine learning methods are biased on both datasets. The difference in false positive rates between SAE and AAE examples ranges from 0.01 to 0.35, and the difference in false negative rates ranges from 0.01 to 0.23. We also find that the neural network methods generally produce more unfair results than the linear support vector machine on the chosen datasets.
CONCLUSIONS
The models that result in the most unfair predictions may vary from dataset to dataset. Practitioners should be aware of the potential harms of applying machine learning to health-related social media data. At a minimum, we recommend evaluating fairness alongside traditional evaluation metrics.
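The fairness evaluation described in the abstract compares false positive and false negative rates between the SAE and AAE groups. A minimal sketch of that comparison is below; the function and variable names are illustrative, and the toy labels in the usage example are not the paper's data.

```python
# Sketch of a group-fairness check: compute the false positive rate (FPR)
# and false negative rate (FNR) per dialect group, then take the absolute
# gap between the two groups. Assumes binary labels (1 = flu-related).

def error_rates(y_true, y_pred):
    """Return (FPR, FNR) for binary labels and predictions."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

def fairness_gaps(y_true, y_pred, groups):
    """Absolute FPR and FNR gaps between the two groups in `groups`."""
    rates = {}
    for g in set(groups):
        yt = [t for t, gg in zip(y_true, groups) if gg == g]
        yp = [p for p, gg in zip(y_pred, groups) if gg == g]
        rates[g] = error_rates(yt, yp)
    (_, (fpr1, fnr1)), (_, (fpr2, fnr2)) = sorted(rates.items())
    return abs(fpr1 - fpr2), abs(fnr1 - fnr2)

# Toy usage: one false positive in the "AAE" group, none in "SAE".
y_true = [0, 1, 0, 1]
y_pred = [1, 1, 0, 1]
groups = ["AAE", "AAE", "SAE", "SAE"]
fpr_gap, fnr_gap = fairness_gaps(y_true, y_pred, groups)
```

With this toy input the AAE group has FPR 1.0 and the SAE group 0.0, so the FPR gap is 1.0 and the FNR gap is 0.0; the paper's recommendation is to report such gaps alongside standard metrics like accuracy or F1.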