Open Access
Diversity in sociotechnical machine learning systems
Author(s) - Sina Fazelpour, Maria DeArteaga
Publication year - 2022
Publication title - Big Data & Society
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.244
H-Index - 37
ISSN - 2053-9517
DOI - 10.1177/20539517221082027
Subject(s) - diversity (politics), sociotechnical system, context (archaeology), epistemology, computer science, sociocultural evolution, knowledge management, sociology, management science, data science, artificial intelligence, engineering, paleontology, philosophy, anthropology, biology
There has been a surge of recent interest in sociocultural diversity in machine learning research. Currently, however, there is a gap between discussions of measures and benefits of diversity in machine learning, on the one hand, and the broader research on the underlying concepts of diversity and the precise mechanisms of its functional benefits, on the other. This gap is problematic because diversity is not a monolithic concept. Rather, different concepts of diversity are based on distinct rationales that should inform how we measure diversity in a given context. Similarly, the lack of specificity about the precise mechanisms underpinning diversity’s potential benefits can result in uninformative generalities, invalid experimental designs, and illicit interpretations of findings. In this work, we draw on research in philosophy, psychology, and social and organizational sciences to make three contributions: First, we introduce a taxonomy of different diversity concepts from philosophy of science, and explicate the distinct epistemic and political rationales underlying these concepts. Second, we provide an overview of mechanisms by which diversity can benefit group performance. Third, we situate these taxonomies of concepts and mechanisms in the lifecycle of sociotechnical machine learning systems and make a case for their usefulness in fair and accountable machine learning. We do so by illustrating how they clarify the discourse around diversity in the context of machine learning systems, promote the formulation of more precise research questions about diversity’s impact, and provide conceptual tools to further advance research and practice.