Identification of genomic biomarkers with machine learning for early and differential diagnosis of late‐onset Alzheimer’s disease (LOAD)
Author(s) -
Erdoğan Onur,
Esme Mert,
Balci Cafer,
Rafatov Sevda,
Cankurtaran Mustafa,
Yavuz Burcu Balam,
İyigün Cem,
Son Yeşim Aydın
Publication year - 2020
Publication title -
Alzheimer's & Dementia
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.713
H-Index - 118
eISSN - 1552-5279
pISSN - 1552-5260
DOI - 10.1002/alz.042558
Subject(s) - genome wide association study, computational biology, random forest, identification (biology), univariate, gene, biology, genetics, computer science, multivariate statistics, artificial intelligence, machine learning, single nucleotide polymorphism, botany, genotype
Background
The complex genetic etiology of LOAD is still unclear, which hinders its early and/or differential diagnosis. Genome‐Wide Association Studies (GWAS) are designed to explore the statistical associations of variants with disease, but interactions between variants are overlooked by univariate analysis. Machine learning algorithms can capture hidden patterns and so aid the understanding of complex genetic disorders.

Methods
Controlled‐access GWAS datasets provided by the ADNI (210 controls and 344 cases), GenADA (777 controls and 798 cases), and NCRAD (1310 controls and 1289 cases) initiatives were used. GWAS were analysed with PLINK, followed by p‐value filtering for initial dimension reduction. Random Forest (RF) was implemented with 5‐fold cross‐validation (CV) using the ranger package in R. The variants in the LOAD‐RF models were mapped to genes, and consensus genes were selected.

Results
Test performances of the LOAD‐RF models for the ADNI, NCRAD, and GenADA datasets were 72.9%, 68.8%, and 92.4%, respectively. The individual LOAD‐RF models included 390 variants from ADNI, 1740 from NCRAD, and 434 from GenADA. No consensus variants were observed, but when the genomic locations of the variants were mapped, 62 genes common to at least two datasets were identified. Five of the 6 genes common to all three LOAD‐RF models have also been reported as differentially expressed in the AD vs control subclusters of astrocytes (LFC > 0.5, FDR = 0.01) (Ref1). Enrichment analysis based on the consensus genes revealed cell adhesion as the top GO term for biological processes.

Conclusion
A meta‐analysis of GWAS datasets for LOAD was performed, and the variants common to the three predictive models at the gene level were analyzed. In the next phase of this study, the predictive performance of the 6 selected variants will be evaluated on saliva samples collected from Turkish AD patients and controls. The outcomes of this study and the know‐how acquired will allow us to offer a diagnostic tool, based on genotyping of the consensus variants, for the early and differential diagnosis of LOAD.

Reference: (1) Grubman, A., et al. Nat Neurosci 22, 2087–2097 (2019).
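
The Results describe models that share no variants but still converge on common genes once variants are mapped to genomic locations. The following is a minimal Python sketch of that gene-level consensus step, not the authors' pipeline (which used PLINK and the ranger package in R). The function name `consensus_genes`, the variant IDs, the gene names, and the variant-to-gene mapping are all hypothetical placeholders chosen for illustration.

```python
from collections import Counter


def consensus_genes(variant_to_gene, model_variants, min_datasets=2):
    """Return genes hit by model variants in at least `min_datasets` datasets.

    variant_to_gene: dict mapping a variant ID to a gene name (hypothetical
        annotation; unmapped variants are skipped).
    model_variants: dict mapping a dataset name to the variant IDs kept by
        that dataset's model.
    """
    counts = Counter()
    for variants in model_variants.values():
        # Collapse to a set so each gene counts at most once per dataset.
        genes = {variant_to_gene[v] for v in variants if v in variant_to_gene}
        counts.update(genes)
    return {gene for gene, n in counts.items() if n >= min_datasets}


# Toy, made-up inputs: no variant ID appears in more than one dataset.
variant_to_gene = {
    "rs_a1": "GENE_A", "rs_a2": "GENE_A",
    "rs_b1": "GENE_B", "rs_b2": "GENE_B", "rs_b3": "GENE_B",
    "rs_c1": "GENE_C",
}
model_variants = {
    "ADNI": ["rs_a1", "rs_b1"],
    "NCRAD": ["rs_a2", "rs_b2", "rs_unmapped"],
    "GenADA": ["rs_b3", "rs_c1"],
}

# Variant-level consensus: intersection of the three variant sets.
shared_variants = set.intersection(*(set(v) for v in model_variants.values()))

# Gene-level consensus at two thresholds.
in_two_or_more = consensus_genes(variant_to_gene, model_variants, min_datasets=2)
in_all_three = consensus_genes(variant_to_gene, model_variants, min_datasets=3)
```

In this toy input the variant-level intersection is empty, while two genes are shared by at least two datasets and one by all three. That mirrors the shape of the reported result (no consensus variants, 62 genes in at least two datasets, 6 genes in all three), not its data.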
