Learning statistical models of phenotypes using noisy labeled training data
Author(s) - Vibhu Agarwal, Tanya Podchiyska, Juan M. Banda, Veena Goel, Tiffany I. Leung, Evan Minty, Timothy E. Sweeney, Elsie Gyang, Nigam H. Shah
Publication year - 2016
Publication title - Journal of the American Medical Informatics Association
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.614
H-Index - 150
eISSN - 1527-974X
pISSN - 1067-5027
DOI - 10.1093/jamia/ocw028
Subject(s) - computer science, scalability, machine learning, artificial intelligence, logistic regression, feature (linguistics), phenotype, feature engineering, implementation, predictive modelling, data mining, deep learning, database, linguistics, philosophy, biochemistry, chemistry, gene, programming language
Traditionally, patient groups with a phenotype are selected through rule-based definitions whose creation and validation are time-consuming. Machine learning approaches to electronic phenotyping are limited by the paucity of labeled training datasets. We demonstrate the feasibility of utilizing semi-automatically labeled training sets to create phenotype models via machine learning, using a comprehensive representation of the patient medical record.