Visualization of deep learning relevance maps for AD detection
Author(s) -
Budding Celine,
Eitel Fabian,
Albrecht Jan Philipp,
Ritter Kerstin
Publication year - 2020
Publication title -
Alzheimer's and Dementia
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.713
H-Index - 118
eISSN - 1552-5279
pISSN - 1552-5260
DOI - 10.1002/alz.037352
Subject(s) - convolutional neural network, relevance (law), artificial intelligence, computer science, pattern recognition (psychology), voxel, deep learning, set (abstract data type), test set, data set, neuroimaging, clinical significance, cohort, artificial neural network, alzheimer's disease neuroimaging initiative, machine learning, cognition, medicine, psychology, cognitive impairment, pathology, neuroscience, political science, law, programming language
Background Deep learning methods, in particular convolutional neural networks (CNNs), have been applied very successfully to the diagnosis of AD and MCI based on structural MRI data. However, because training these models requires fitting many parameters, they are often criticized as “black boxes”. Here, we computed heatmaps using layer-wise relevance propagation (LRP) to (1) validate CNN models and (2) explain neural network decisions in single patients. Furthermore, we investigated whether the resulting heatmaps help in identifying subgroups of AD and MCI.
Method We included 2964 T1-weighted MR scans of 732 participants (185 AD, 116 MCI-converters, 211 MCI-non-converters and 210 cognitively normal [CN]) from the ADNI database. The AD/CN cohort was split by participant ID into a training/validation cohort (80 %) and a test cohort (20 %). A CNN model consisting of four 3D convolutional layers was then trained to separate patients with AD from CN participants. The model was tested on the AD/CN test set and applied to MCI-converters / MCI-non-converters. Using LRP, we produced for each MRI scan a heatmap in the input space, indicating the relevance of each voxel for the final classification decision. The heatmaps were clustered using spectral relevance analysis.
Result The CNN model achieved a balanced accuracy of 85.12 % on the AD/CN test set and 67.37 % on the MCI-converter / MCI-non-converter set. We additionally evaluated the CNN model with respect to different subgroups, including gender (88.11 % for females, 82.10 % for males) and age (92.57 % for 60-73 years, 89.06 % for 74-78 years, 74.17 % for 79-90 years). The regions with the highest LRP evidence for AD and MCI included the hippocampus and the parahippocampal gyrus. Spectral clustering revealed higher variability within the AD and MCI subgroups than within CN.
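Two steps of the evaluation, the split by participant ID and the balanced accuracy, can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy participant IDs, the random seed and the unstratified shuffling are all assumptions made for illustration. The abstract only states that the split was 80 % / 20 % and based on participant ID.

```python
import random
from collections import defaultdict


def split_by_participant(scans, test_fraction=0.2, seed=0):
    """Split scans so that all scans of one participant land in the same set.

    scans: list of (participant_id, label) tuples (hypothetical format).
    Returns (train_scans, test_scans).
    """
    participant_ids = sorted({pid for pid, _ in scans})
    rng = random.Random(seed)
    rng.shuffle(participant_ids)
    n_test = max(1, round(test_fraction * len(participant_ids)))
    test_ids = set(participant_ids[:n_test])
    train = [scan for scan in scans if scan[0] not in test_ids]
    test = [scan for scan in scans if scan[0] in test_ids]
    return train, test


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == prediction)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy data (assumption): 10 participants with 3 scans each.
scans = [
    (f"P{i:03d}", "AD" if i % 2 else "CN")
    for i in range(10)
    for _ in range(3)
]
train_scans, test_scans = split_by_participant(scans)
```

Splitting on participants rather than on scans matters here because the data set contains several scans per participant (2964 scans of 732 participants). A scan-level split could place scans of the same person in both training and test sets and inflate the test accuracy.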
Conclusion In this study, we show the potential of a transparent deep learning framework for understanding the underlying computing principles and their relation to clinical variables such as clinical diagnosis (AD vs. MCI vs. CN), gender and age. Future studies are necessary to address the robustness of heatmap methods and their applicability in clinical routine.
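To make the heatmap computation more concrete, the following sketch propagates relevance backwards through a tiny two-layer ReLU network using the LRP epsilon rule. It is only an illustration under stated assumptions: the abstract does not say which LRP rule was used, the study's model was a 3D CNN rather than the dense toy network here, and the weights, inputs and function names are invented for the example. Biases are omitted so that relevance is approximately conserved from layer to layer.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(weights, inputs):
    """weights[k][j] connects input j to output k; biases are omitted."""
    return [sum(w * a for w, a in zip(row, inputs)) for row in weights]


def lrp_epsilon(weights, inputs, relevance_out, eps=1e-9):
    """Redistribute the relevance of a linear layer's outputs onto its inputs.

    Each input j receives the share a_j * w_kj / (z_k + eps * sign(z_k))
    of the relevance assigned to output k.
    """
    z = linear(weights, inputs)
    relevance_in = [0.0] * len(inputs)
    for k, row in enumerate(weights):
        denominator = z[k] + (eps if z[k] >= 0 else -eps)
        for j, w in enumerate(row):
            relevance_in[j] += inputs[j] * w / denominator * relevance_out[k]
    return relevance_in


# Toy network (assumption): 3 inputs -> 2 hidden ReLU units -> 1 output score.
x = [1.0, 2.0, 0.5]
w_hidden = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.2]]
w_output = [[1.0, -0.5]]

hidden = relu(linear(w_hidden, x))
score = linear(w_output, hidden)

# Start from the output score and propagate back to the inputs.
relevance_hidden = lrp_epsilon(w_output, hidden, score)
relevance_input = lrp_epsilon(w_hidden, x, relevance_hidden)
```

Here `relevance_input` plays the role of the heatmap: one value per input element, with the values summing approximately to the output score. In the study, the inputs are the voxels of an MRI scan.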
