Do Neural Information Extraction Algorithms Generalize Across Institutions?
Author(s) -
Enrico Santus,
Clara Li,
Adam Yala,
Donald J. Peck,
Rufina Soomro,
Naveen Faridi,
Isra Mamshad,
Rong Tang,
Conor R. Lanahan,
Regina Barzilay,
Kevin S. Hughes
Publication year - 2019
Publication title -
JCO Clinical Cancer Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.188
H-Index - 12
ISSN - 2473-4276
DOI - 10.1200/cci.18.00160
Subject(s) - generalizability theory, computer science, artificial intelligence, machine learning, convolutional neural network, artificial neural network, generalization, data set, set (abstract data type), natural language processing, mathematics, statistics, mathematical analysis, programming language
Natural language processing (NLP) techniques have been adopted to reduce the curation costs of electronic health records. However, studies have questioned whether such techniques can be applied to data from previously unseen institutions. We investigated the performance of a common neural NLP algorithm on data from both known and held-out hospitals (ie, institutions whose data were withheld from the training set and used only for testing). We also explored how diversity in the training data affects the system's generalization ability.