Comparison and benchmark of name-to-gender inference services

L. Santamaría; Helena Mihaljević


Open Access


Author(s) - L. Santamaría, Helena Mihaljević

Publication year - 2018

Publication title - PeerJ Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.806

H-Index - 24

ISSN - 2376-5992

DOI - 10.7717/peerj-cs.156

Subject(s) - inference, benchmark (surveying), metric (unit), computer science, set (abstract data type), natural language processing, information retrieval, artificial intelligence, data mining, machine learning, data science, operations management, geodesy, economics, programming language, geography

The increased interest in analyzing and explaining gender inequalities in tech, media, and academia highlights the need for accurate inference methods to predict a person’s gender from their name. Several such services exist that provide access to large databases of names, often enriched with information from social media profiles, culture-specific rules, and insights from sociolinguistics. We compare and benchmark five name-to-gender inference services by applying them to the classification of a test data set consisting of 7,076 manually labeled names. The compiled names are analyzed and characterized according to their geographical and cultural origin. We define a series of performance metrics to quantify various types of classification errors, and define a parameter tuning procedure to search for optimal values of the services’ free parameters. Finally, we perform benchmarks of all services under study regarding several scenarios where a particular metric is to be optimized.
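The abstract does not spell out the performance metrics, so the following is only an illustrative sketch of how different types of classification errors could be quantified for a service that returns female, male, or unknown. The function name `error_rates`, the metric names, and the label codes `'f'`, `'m'`, `'u'` are assumptions made for this example, not definitions taken from the paper.

```python
from collections import Counter


def error_rates(true_labels, predicted_labels):
    """Compute illustrative error metrics for name-to-gender predictions.

    true_labels: sequence of 'f' or 'm' (manually assigned gender).
    predicted_labels: sequence of 'f', 'm', or 'u' (service output, 'u' = unknown).
    Metric names are hypothetical and chosen for this sketch only.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one labeled name is required")

    # Confusion-matrix cells keyed by (true, predicted).
    counts = Counter(zip(true_labels, predicted_labels))
    ff, fm, fu = counts[("f", "f")], counts[("f", "m")], counts[("f", "u")]
    mm, mf, mu = counts[("m", "m")], counts[("m", "f")], counts[("m", "u")]

    total = ff + fm + fu + mm + mf + mu
    classified = ff + fm + mm + mf  # names that received a gender

    return {
        # Misclassifications and non-classifications both count as errors.
        "error_with_unknowns": (fm + mf + fu + mu) / total,
        # Errors among names that were actually assigned a gender.
        "error_without_unknowns": (fm + mf) / classified if classified else 0.0,
        # Share of names the service could not classify.
        "unknown_rate": (fu + mu) / total,
        # Positive: more men labeled female than women labeled male.
        "gender_bias": (mf - fm) / classified if classified else 0.0,
    }
```

Separating the unknown rate from the misclassification rate matters when comparing services, because one that declines to answer often can look accurate on the names it does classify. Which metric to optimize depends on the scenario, as the abstract notes.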
