Identifying key features for dementia diagnosis using machine learning
Author(s) - Guest Felicity, Kuzma Elzbieta, Everson Richard, Llewellyn David J.
Publication year - 2020
Publication title - Alzheimer's & Dementia
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.713
H-Index - 118
eISSN - 1552-5279
pISSN - 1552-5260
DOI - 10.1002/alz.046092
Subject(s) - dementia, frontotemporal dementia, categorical variable, neuropsychology, random forest, missing data, artificial intelligence, medicine, disease, psychology, machine learning, cognition, psychiatry, computer science, pathology
Background
Diagnosing dementia is clinically challenging due to substantial heterogeneity in presenting symptoms and underlying pathologies. However, machine learning provides a powerful approach to uncovering hidden patterns in complex data that may in turn inform clinical practice.

Method
We used patient data from the National Alzheimer’s Coordinating Center Uniform Data Set (versions 1.2 and 2) collected at 35 Alzheimer’s Disease Centers across the United States between September 2005 and February 2015. All 32,573 patients (median age=73 years, 57% female) were assessed according to standardized protocols involving extensive clinical and neuropsychological assessments, which resulted in a pool of 260 continuous, binary, ordinal or categorical features. Patients were randomly split 70:30 into training (n=22,801) and test (n=9,772) samples. All‐cause dementia and dementia subtypes (Alzheimer’s disease, vascular, Lewy body, frontotemporal and ‘other’) were diagnosed by either a single clinician or a consensus panel using established international criteria. Random forests (Extra‐Trees algorithm) were used to differentiate between all‐cause dementia and no dementia, and between different subtypes, by constructing an ensemble of decision trees that makes classifications via voting. We used similarity‐based imputation and the missingness‐incorporated‐in‐attributes approach to deal with missing data whilst preserving relationships between variables.

Result
The all‐cause dementia classifier incorporating all 260 features differentiated between dementia and no dementia with an accuracy of 94.2% in the test sample (sensitivity 92.8%, specificity 95.1%). The most important features related to impaired cognitive domains, particularly judgement, planning or problem solving, and orientation, and to everyday activities such as home and hobbies, community affairs and difficulty with bills. The performance of the original classifier could be matched using just the 42 most important features. Subtype classifiers ranged in accuracy from 80.8% for frontotemporal dementia versus ‘other’ dementia to 99.4% for Alzheimer’s disease versus vascular dementia. The most important features for differential diagnosis included stroke history, visual hallucinations and Parkinson’s disease.

Conclusion
Machine learning can be used to uncover hidden patterns in high‐dimensional clinical data and to identify the most important characteristics for accurately diagnosing dementia.
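
The abstract describes an ensemble of decision trees that classifies by voting and reports accuracy, sensitivity and specificity on the test sample. The sketch below illustrates only those two ideas in plain Python. It is not the authors' implementation: the function names `majority_vote` and `binary_metrics` and the toy votes are hypothetical, and no study data are used.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


# Toy data (hypothetical, not from the study): each row holds the votes of
# five trees for one patient; 1 = dementia, 0 = no dementia.
votes = [
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0],
]
y_true = [1, 0, 1, 0, 0, 1]

y_pred = [majority_vote(v) for v in votes]
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

Using an odd number of trees, as here, avoids tied votes in the binary case. In practice an existing library implementation of the Extra-Trees algorithm would replace the hand-written voting step.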
