z-logo
open-access-imgOpen Access
Comparison and benchmark of name-to-gender inference services
Author(s) -
L. Santamaría,
Helena Mihaljević
Publication year - 2018
Publication title -
peerj computer science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.806
H-Index - 24
ISSN - 2376-5992
DOI - 10.7717/peerj-cs.156
Subject(s) - inference , benchmark (surveying) , metric (unit) , computer science , set (abstract data type) , natural language processing , information retrieval , artificial intelligence , data mining , machine learning , data science , operations management , geodesy , economics , programming language , geography
The increased interest in analyzing and explaining gender inequalities in tech, media, and academia highlights the need for accurate inference methods to predict a person’s gender from their name. Several such services exist that provide access to large databases of names, often enriched with information from social media profiles, culture-specific rules, and insights from sociolinguistics. We compare and benchmark five name-to-gender inference services by applying them to the classification of a test data set consisting of 7,076 manually labeled names. The compiled names are analyzed and characterized according to their geographical and cultural origin. We define a series of performance metrics to quantify various types of classification errors, and define a parameter tuning procedure to search for optimal values of the services’ free parameters. Finally, we perform benchmarks of all services under study regarding several scenarios where a particular metric is to be optimized.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom