deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets

Open Access

Author(s) - Michel Galley, Chris Brockett, Alessandro Sordoni, Yangfeng Ji, Michael Auli, Chris Quirk, Margaret Mitchell, Jianfeng Gao, Bill Dolan

Publication year - 2015

Language(s) - English

Resource type - Conference proceedings

DOI - 10.3115/v1/p15-2073

Subject(s) - metric (unit), discriminative model, volume (thermodynamics), computational linguistics, association (psychology), computer science, cognitive science, artificial intelligence, natural language processing, library science, linguistics, philosophy, engineering, physics, epistemology, psychology, operations management, quantum mechanics

We introduce Discriminative BLEU (∆BLEU), a novel metric for intrinsic evaluation of generated text in tasks that admit a diverse range of possible outputs. Reference strings are scored for quality by human raters on a scale of [−1, +1] to weight multi-reference BLEU. In tasks involving generation of conversational responses, ∆BLEU correlates reasonably with human judgments and outperforms sentence-level and IBM BLEU in terms of both Spearman’s ρ and Kendall’s τ.
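The abstract does not spell out the formula, so the following is only a minimal sketch of how human ratings in [−1, +1] could weight a multi-reference n-gram precision. The function names (`ngrams`, `weighted_precision`) and the exact weighting scheme are assumptions made for illustration: each hypothesis n-gram is credited with the best rating-weighted clipped count among the references that contain it, and the result is normalized by the best achievable weighted count. It covers a single hypothesis and a single n-gram order, and leaves out the brevity penalty and the combination across orders that a full BLEU score needs.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def weighted_precision(hypothesis, references, n):
    """Rating-weighted n-gram precision (illustrative sketch, not the paper's code).

    hypothesis: list of tokens.
    references: list of (tokens, weight) pairs, with weight in [-1, +1]
                standing in for a human quality rating of that reference.
    """
    hyp_counts = ngrams(hypothesis, n)
    ref_counts = [(ngrams(tokens, n), weight) for tokens, weight in references]

    numerator = 0.0
    denominator = 0.0
    for gram, hyp_count in hyp_counts.items():
        # Only references that actually contain this n-gram can credit it.
        matches = [(counts[gram], weight)
                   for counts, weight in ref_counts if gram in counts]
        if matches:
            # Clip by the reference count, then scale by the reference rating;
            # keep the best-scoring reference for this n-gram.
            numerator += max(weight * min(hyp_count, ref_count)
                             for ref_count, weight in matches)
        # Best weighted count this n-gram could have earned.
        denominator += max(weight * hyp_count for _, weight in ref_counts)

    return numerator / denominator if denominator > 0 else 0.0


references = [
    ("i am fine thanks".split(), 1.0),   # well-rated reference
    ("i am bad".split(), -0.5),          # poorly rated reference
]
good = weighted_precision("i am fine".split(), references, 1)
poor = weighted_precision("i am bad".split(), references, 1)
```

In this toy setup, a hypothesis that matches the well-rated reference keeps full credit, while one whose only support for "bad" comes from the negatively rated reference is penalized for that match. This is the behavior that lets the metric discriminate between outputs that plain multi-reference BLEU would score alike.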
