Open Access

Evaluating Human Pairwise Preference Judgments

Author(s) - Mark Dras

Publication year - 2015

Publication title - Computational Linguistics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.314

H-Index - 98

eISSN - 1530-9312

pISSN - 0891-2017

DOI - 10.1162/coli_a_00222

Subject(s) - bespoke, computer science, preference, pairwise comparison, context (archaeology), artificial intelligence, natural language processing, sample (material), machine learning, statistics, mathematics, paleontology, chemistry, chromatography, political science, law, biology

Human evaluation plays an important role in NLP, often in the form of preference judgments. Although there has been some use of classical non-parametric and bespoke approaches to evaluating these sorts of judgments, NLP can draw on an entire body of work on this topic from sensory discrimination testing, where human judgments are central, and which is backed by rigorous statistical theory and freely available software. We investigate one approach, Log-Linear Bradley-Terry models, and apply it to sample NLP data.

9 page(s)
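As a rough illustration of the kind of model the abstract names, the sketch below fits a basic Bradley-Terry model, in which item i is preferred over item j with probability π_i / (π_i + π_j). It is not the paper's method: it uses a simple iterative (minorization-maximization) update rather than a log-linear fit, ignores ties and covariates, and the win counts for three hypothetical systems are invented purely for the example. The function names are likewise assumptions made here.

```python
def fit_bradley_terry(wins, iterations=200, tol=1e-10):
    """Estimate Bradley-Terry strengths from a square matrix of win counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Assumes every item wins at least once and every pair of items is
    connected through comparisons, so the estimates are finite.
    """
    n = len(wins)
    strengths = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Comparisons involving i, weighted by the current strengths.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (strengths[i] + strengths[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        # Strengths are only identified up to scale, so normalise to sum to 1.
        norm = sum(updated)
        updated = [s / norm for s in updated]
        delta = max(abs(a - b) for a, b in zip(updated, strengths))
        strengths = updated
        if delta < tol:
            break
    return strengths


def preference_probability(strengths, i, j):
    """Model probability that item i is preferred over item j."""
    return strengths[i] / (strengths[i] + strengths[j])


# Hypothetical judgments for three systems A, B, C (10 comparisons per pair).
example_wins = [
    [0, 7, 9],  # A preferred over B 7 times, over C 9 times
    [3, 0, 6],  # B preferred over A 3 times, over C 6 times
    [1, 4, 0],  # C preferred over A once, over B 4 times
]
example_strengths = fit_bradley_terry(example_wins)
```

Calling `preference_probability(example_strengths, 0, 2)` then gives the fitted probability that the first system is preferred over the third. The log-linear formulation mentioned in the abstract would typically be fitted with standard statistical software instead of a hand-written loop like this one.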
