Open Access
A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods
Author(s) -
Daniel Deutsch,
Rotem Dror,
Dan Roth
Publication year - 2021
Publication title - Transactions of the Association for Computational Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00417
Subject(s) - automatic summarization, resampling, bootstrapping (finance), computer science, metric (unit), correlation, data mining, reliability (semiconductor), machine learning, confidence interval, permutation (music), artificial intelligence, set (abstract data type), statistics, econometrics, mathematics, power (physics), operations management, physics, geometry, quantum mechanics, acoustics, economics, programming language
The quality of a summarization evaluation metric is quantified by calculating the correlation between its scores and human annotations across a large number of summaries. Currently, it is unclear how precise these correlation estimates are, or whether a difference between two metrics’ correlations reflects a true difference or is due to mere chance. In this work, we address these two problems by proposing methods for calculating confidence intervals and running hypothesis tests for correlations using two resampling methods, bootstrapping and permutation. After evaluating which of the proposed methods is most appropriate for summarization through two simulation experiments, we analyze the results of applying these methods to several different automatic evaluation metrics across three sets of human annotations. We find that the confidence intervals are rather wide, demonstrating high uncertainty in the reliability of automatic metrics. Further, although many metrics fail to show statistically significant improvements over ROUGE, two recent works, QAEval and BERTScore, do so in some evaluation settings.
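
The two resampling ideas named in the abstract can be illustrated with a minimal sketch. It is a generic percentile bootstrap and a generic paired permutation test over per-summary scores, not the authors' exact procedures, which the abstract does not spell out. The function names (`bootstrap_ci`, `permutation_test`), the choice of Pearson correlation, the standardization step before swapping, and all default parameters are assumptions made for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def bootstrap_ci(metric, human, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for corr(metric, human).

    Assumption: summaries are resampled with replacement as paired
    (metric score, human score) observations.
    """
    rng = np.random.default_rng(seed)
    metric = np.asarray(metric, dtype=float)
    human = np.asarray(human, dtype=float)
    n = len(human)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # sample indices with replacement
        m, h = metric[idx], human[idx]
        if np.std(m) == 0 or np.std(h) == 0:
            continue  # correlation is undefined for a constant resample
        stats.append(pearson(m, h))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


def permutation_test(metric_a, metric_b, human, n_perm=2000, seed=0):
    """One-sided paired permutation test.

    Null hypothesis: metric A does not correlate with the human scores
    better than metric B. Returns an estimated p-value.
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(metric_a, dtype=float)
    b = np.asarray(metric_b, dtype=float)
    human = np.asarray(human, dtype=float)
    # Assumption: standardize both metrics so that swapping their scores
    # is meaningful even when they live on different scales.
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    observed = pearson(a, human) - pearson(b, human)
    count = 0
    for _ in range(n_perm):
        swap = rng.random(len(human)) < 0.5  # swap each pair with prob. 0.5
        perm_a = np.where(swap, b, a)
        perm_b = np.where(swap, a, b)
        if pearson(perm_a, human) - pearson(perm_b, human) >= observed:
            count += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

Calling `bootstrap_ci(metric_scores, human_scores)` returns a lower and upper bound for the correlation, and `permutation_test(metric_a, metric_b, human_scores)` returns a p-value for the claim that metric A correlates better than metric B. A wide interval or a large p-value signals that the data do not separate the two metrics, which is the kind of uncertainty the abstract reports.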
