
Sparse multiple factor analysis to integrate genetic data, neuroimaging features, and attention‐deficit/hyperactivity disorder domains
Author(s) -
Vilor‐Tejedor Natàlia,
Alemany Silvia,
Cáceres Alejandro,
Bustamante Mariona,
Mortamais Marion,
Pujol Jesús,
Sunyer Jordi,
González Juan R.
Publication year - 2018
Publication title -
International Journal of Methods in Psychiatric Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.275
H-Index - 73
eISSN - 1557-0657
pISSN - 1049-8931
DOI - 10.1002/mpr.1738
Subject(s) - attention deficit hyperactivity disorder, lasso (statistics), neuroimaging, univariate, multivariate statistics, psychology, feature selection, bivariate analysis, population, multivariate analysis, clinical psychology, artificial intelligence, machine learning, psychiatry, computer science, medicine, environmental health, world wide web
Objectives - We proposed a multivariate cross‐sectional framework that combines a variable selection method with multiple factor analysis (MFA) to identify complex, biologically meaningful signals related to attention‐deficit/hyperactivity disorder (ADHD) symptoms and to the hyperactivity and inattention domains.
Methods - The study included 135 children from the general population with genomic and neuroimaging data. ADHD symptoms were assessed using a questionnaire based on ADHD‐DSM‐IV criteria. All analyses used the raw sum scores of the hyperactivity and inattention domains and of total ADHD. The analytical framework comprised two steps. First, a zero‐inflated negative binomial linear model was fitted via penalized maximum likelihood (LASSO‐ZINB). Second, the most predictive features selected by LASSO‐ZINB were used as input for the MFA.
Results - We observed significant relationships of ADHD symptoms and of the hyperactivity and inattention domains with white matter, gray matter regions, and the cerebellum, as well as with loci on chromosome 1.
Conclusions - Multivariate methods can be used to advance the neurobiological characterization of complex diseases: they improve statistical power relative to univariate methods and allow the identification of meaningful biological signals in imaging genetics studies.
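
Illustration - The second step described under Methods (MFA on the features retained by the selection step) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `mfa`, the block names, and the synthetic data are hypothetical, and the LASSO‐ZINB selection is assumed to have already been run. The sketch uses the usual MFA weighting, in which each standardized block is divided by its first singular value before a global principal component analysis, so that no single block dominates the first axis.

```python
import numpy as np

def mfa(blocks, n_components=2):
    """Minimal MFA sketch: `blocks` is a list of (n_samples, p_k) arrays."""
    weighted = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        # Standardize each variable within its block
        # (assumes no constant column, otherwise the std is zero).
        Z = (X - X.mean(axis=0)) / X.std(axis=0)
        # Divide the block by its first singular value so that every
        # block has the same leading inertia in the global analysis.
        s1 = np.linalg.svd(Z, compute_uv=False)[0]
        weighted.append(Z / s1)
    # Global PCA (via SVD) on the concatenated, weighted blocks.
    G = np.hstack(weighted)
    U, s, _ = np.linalg.svd(G, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # individual coordinates
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, explained

# Synthetic stand-ins for the three data blocks (hypothetical, illustration only).
rng = np.random.default_rng(0)
n = 135
genetics = rng.integers(0, 3, size=(n, 6))   # genotypes coded 0/1/2
imaging = rng.normal(size=(n, 4))            # regional brain measures
symptoms = rng.poisson(2, size=(n, 2))       # hyperactivity, inattention scores

scores, explained = mfa([genetics, imaging, symptoms], n_components=2)
```

Here `scores` holds the coordinates of each child on the first two global axes, and `explained` the share of total inertia carried by each axis. In practice the blocks would contain only the variables retained by the selection step.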