
Improving predictive models for Alzheimer’s disease using GWAS data by incorporating misclassified samples modeling
Author(s) -
Brissa-Lizbeth Romero-Rosales,
Jose-Gerardo Tamez-Peña,
Humberto Nicolini,
Maria-Guadalupe Moreno-Treviño,
Víctor Treviño
Publication year - 2020
Publication title -
PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0232103
Subject(s) - genome wide association study , lasso (programming language) , computer science , identification (biology) , predictive modelling , disease , machine learning , artificial intelligence , genetic association , computational biology , bioinformatics , medicine , biology , genetics , genotype , pathology , single nucleotide polymorphism , botany , world wide web , gene
Late-onset Alzheimer’s Disease (LOAD) is the most common form of dementia in the elderly. Genome-wide association studies (GWAS) for LOAD have opened new avenues to identify genetic causes and to provide diagnostic tools for early detection. Although several predictive models have been proposed using the few detected GWAS markers, there is still a need for improvement and for the identification of potential markers. Polygenic risk scores are commonly used for prediction, but other methods for generating predictive models have been suggested. In this research, we compared three machine learning methods that have been shown to construct powerful predictive models (genetic algorithms, LASSO, and stepwise selection) and propose the inclusion of markers from misclassified samples to improve overall prediction accuracy. Our results show that combining the markers of an initial model with the markers of a model fitted to the misclassified samples improves the area under the receiver operating characteristic curve (AUC) by around 5%, reaching ~0.84, which is highly competitive for a model using only genetic information. The computational strategy used here can help to devise better methods to improve classification models for Alzheimer’s disease (AD). Our results could have a positive impact on the early diagnosis of AD.
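The two-stage idea described in the abstract (fit an initial model, fit a second model to the samples it misclassifies, then pool the markers) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy genotype matrix, the mean-difference marker ranking (a simple stand-in for genetic algorithms, LASSO, or stepwise selection), the midpoint threshold, the choice to skip markers already in the initial model, and all function names. The area under the receiver operating characteristic curve is computed in-sample on eight toy samples, so the numbers say nothing about real performance.

```python
from typing import List, Sequence

# Toy data (hypothetical): rows are samples, columns are allele counts (0, 1, 2).
GENOTYPES = [
    [2, 0, 1], [2, 0, 0], [2, 1, 1], [0, 2, 1],  # cases
    [0, 0, 1], [0, 0, 0], [0, 1, 1], [2, 0, 0],  # controls
]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = case, 0 = control


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: chance that a random case outscores a random control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def marker_effects(genotypes, labels) -> List[float]:
    """Toy effect size per marker: mean allele count in cases minus mean in controls.
    Assumes both classes are present in the samples passed in."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return [
        sum(g[j] for g in cases) / len(cases) - sum(g[j] for g in controls) / len(controls)
        for j in range(len(genotypes[0]))
    ]


def top_markers(effects: Sequence[float], k: int, exclude: Sequence[int] = ()) -> List[int]:
    """Pick the k markers with the largest absolute effect, skipping excluded ones."""
    candidates = [j for j in range(len(effects)) if j not in exclude]
    return sorted(candidates, key=lambda j: -abs(effects[j]))[:k]


def risk_scores(genotypes, markers, effects) -> List[float]:
    """Weighted sum of allele counts over the selected markers (polygenic-score style)."""
    return [sum(effects[j] * g[j] for j in markers) for g in genotypes]


def misclassified(scores, labels) -> List[int]:
    """Indices whose predicted class (score above the midpoint of class means) is wrong."""
    case_mean = sum(s for s, y in zip(scores, labels) if y == 1) / labels.count(1)
    ctrl_mean = sum(s for s, y in zip(scores, labels) if y == 0) / labels.count(0)
    threshold = (case_mean + ctrl_mean) / 2
    return [i for i, (s, y) in enumerate(zip(scores, labels)) if (s > threshold) != (y == 1)]


# Stage 1: initial model on all samples.
effects_all = marker_effects(GENOTYPES, LABELS)
initial_markers = top_markers(effects_all, k=1)
initial_auc = auc(risk_scores(GENOTYPES, initial_markers, effects_all), LABELS)

# Stage 2: rank markers again using only the misclassified samples.
wrong = misclassified(risk_scores(GENOTYPES, initial_markers, effects_all), LABELS)
effects_wrong = marker_effects([GENOTYPES[i] for i in wrong], [LABELS[i] for i in wrong])
extra_markers = top_markers(effects_wrong, k=1, exclude=initial_markers)

# Stage 3: pool both marker sets and rescore with effects estimated on all samples.
combined_markers = initial_markers + extra_markers
combined_auc = auc(risk_scores(GENOTYPES, combined_markers, effects_all), LABELS)

print(f"initial markers {initial_markers}: AUC = {initial_auc}")
print(f"combined markers {combined_markers}: AUC = {combined_auc}")
```

In this toy setting the second stage picks up a marker that is informative only for the samples the first model gets wrong. A real analysis would evaluate the pooled model on held-out data rather than on the training samples.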