FeTaQA: Free-form Table Question Answering
Author(s) -
Linyong Nan,
Chia-Chun Hsieh,
Ziming Mao,
Xi Lin,
Neha Verma,
Rui Zhang,
Wojciech Kryściński,
Hailey Schoelkopf,
Riley Kong,
Xiangru Tang,
Mutethia Mutuma,
Ben Rosand,
Isabel Trindade,
Renusree Bandaru,
Jacob Cunningham,
Caiming Xiong,
Dragomir Radev
Publication year - 2022
Publication title -
Transactions of the Association for Computational Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2307-387X
DOI - 10.1162/tacl_a_00446
Subject(s) - computer science, table (database), question answering, pipeline (software), task (project management), information retrieval, benchmark (surveying), parsing, artificial intelligence, natural language processing, data mining, programming language, management, geodesy, economics, geography
Existing table question answering datasets contain abundant factual questions that primarily evaluate a QA system’s comprehension of query and tabular data. However, restricted by their short-form answers, these datasets fail to include question–answer interactions that represent more advanced and naturally occurring information needs: questions that ask for reasoning and integration of information pieces retrieved from a structured knowledge source. To complement the existing datasets and to reveal the challenging nature of the table-based question answering task, we introduce FeTaQA, a new dataset with 10K Wikipedia-based {table, question, free-form answer, supporting table cells} pairs. FeTaQA is collected from noteworthy descriptions of Wikipedia tables that contain information people tend to seek; generation of these descriptions requires advanced processing that humans perform on a daily basis: Understand the question and table, retrieve, integrate, infer, and conduct text planning and surface realization to generate an answer. We provide two benchmark methods for the proposed task: a pipeline method based on semantic parsing-based QA systems and an end-to-end method based on large pretrained text generation models, and show that FeTaQA poses a challenge for both methods.
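To make the {table, question, free-form answer, supporting table cells} structure described in the abstract more concrete, the following is a minimal Python sketch of how one such example might be represented and flattened into a single input string for a text generation model. The class name, field names, separator tokens, and the toy table are all assumptions made for illustration; they are not the dataset's actual schema, the authors' preprocessing, or real FeTaQA data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableQAExample:
    """One illustrative example; field names are hypothetical, not the dataset's schema."""
    table: List[List[str]]                   # row 0 is treated as the header row
    question: str
    answer: str                              # free-form sentence, not a short span
    supporting_cells: List[Tuple[int, int]]  # (row, column) indices into `table`


def linearize(example: TableQAExample) -> str:
    """Flatten the question and table into one string (assumed separators)."""
    rows = [" | ".join(row) for row in example.table]
    return "question: " + example.question + " table: " + " || ".join(rows)


def supporting_values(example: TableQAExample) -> List[str]:
    """Look up the cell contents that the supporting-cell indices point at."""
    return [example.table[r][c] for r, c in example.supporting_cells]


# Toy data invented for this sketch.
example = TableQAExample(
    table=[["Year", "Team", "Goals"], ["2019", "A", "12"], ["2020", "B", "7"]],
    question="How did the player's goal count change between 2019 and 2020?",
    answer="The player scored 12 goals for A in 2019 and 7 goals for B in 2020.",
    supporting_cells=[(1, 2), (2, 2)],
)

print(linearize(example))
print(supporting_values(example))
```

In this sketch the supporting cells are stored as indices rather than copied values, so the same record can serve both a retrieval-style check (which cells were used) and a generation-style target (the free-form answer).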