Five sources of bias in natural language processing
Author(s) - Dirk Hovy, Shrimai Prabhumoye
Publication year - 2021
Publication title - Language and Linguistics Compass
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.619
H-Index - 44
ISSN - 1749-818X
DOI - 10.1111/lnc3.12432
Subject(s) - computer science, natural language processing, artificial intelligence, gender bias, annotation, linguistics, data science, social psychology
Recently, there has been an increased interest in demographically grounded bias in natural language processing (NLP) applications. Much of the recent work has focused on describing bias and providing an overview of bias in a larger context. Here, we provide a simple, actionable summary of this recent work. We outline five sources where bias can occur in NLP systems: (1) the data, (2) the annotation process, (3) the input representations, (4) the models, and finally (5) the research design (or how we conceptualize our research). We explore each of the bias sources in detail in this article, including examples and links to related work, as well as potential countermeasures.
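To make source (3) concrete, the sketch below probes a pretrained embedding space for gendered associations with occupation words, a standard illustration of bias in input representations. It is not code from the paper; the model name and word lists are illustrative choices, and it assumes gensim is installed (the vectors are downloaded on first use).

# Minimal sketch of representation bias: pretrained word embeddings can
# encode gendered associations for occupation words.
import gensim.downloader as api

# Load 50-dimensional GloVe vectors trained on Wikipedia + Gigaword.
vectors = api.load("glove-wiki-gigaword-50")

# Compare the cosine similarity of each occupation word to gendered pronouns.
# Asymmetries here propagate into any downstream model built on these vectors.
for occupation in ["nurse", "engineer", "doctor"]:
    sim_she = vectors.similarity(occupation, "she")
    sim_he = vectors.similarity(occupation, "he")
    print(f"{occupation}: she={sim_she:.3f}, he={sim_he:.3f}")

If an occupation sits measurably closer to one pronoun than the other, the representation itself carries the bias before any task-specific data or model choice comes into play, which is why the paper treats input representations as a distinct bias source.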
