Predictive modeling of schizophrenia from genomic data: Comparison of polygenic risk score with kernel support vector machines approach
Author(s) -
Vivian-Griffiths Timothy,
Baker Emily,
Schmidt Karl M.,
Bracher-Smith Matthew,
Walters James,
Artemiou Andreas,
Holmans Peter,
O'Donovan Michael C.,
Owen Michael J.,
Pocklington Andrew,
Escott-Price Valentina
Publication year - 2019
Publication title -
American Journal of Medical Genetics Part B: Neuropsychiatric Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.393
H-Index - 126
eISSN - 1552-485X
pISSN - 1552-4841
DOI - 10.1002/ajmg.b.32705
Subject(s) - support vector machine, kernel method, artificial intelligence, machine learning, polygenic risk score, computer science, pattern recognition, schizophrenia, multivariate statistics, mathematics, single nucleotide polymorphism, biology, genetics, combinatorics, genotype, gene
A major controversy in psychiatric genetics is whether nonadditive genetic interaction effects contribute to the risk of highly polygenic disorders. We applied support vector machines (SVMs), which can build both linear and nonlinear models using kernel methods, to distinguish cases from controls in a large schizophrenia case–control sample of 11,853 subjects (5,554 cases and 6,299 controls), and compared their prediction accuracy with that of the polygenic risk score (PRS) approach. We also investigated whether SVMs are a suitable approach for detecting nonlinear genetic effects, that is, interactions. We found that PRS provided more accurate case/control classification than either linear or nonlinear SVMs, and we give a tentative explanation of why PRS outperforms both multivariate regression and linear kernel SVMs. In addition, we observed that nonlinear kernel SVMs showed higher classification accuracy than linear SVMs when a large number of SNPs were entered into the model. We conclude that SVMs are a potential tool for assessing the presence of interactions, prior to searching for them explicitly.
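To make the PRS side of this comparison concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a PRS is computed as a weighted sum of risk-allele counts and that classification accuracy is summarized by the area under the ROC curve; the abstract specifies neither. The function names, the simulated cohort, and the use of the true simulated effect sizes as weights are all hypothetical choices made for illustration. In practice the weights would come from an independent discovery sample.

```python
import random


def polygenic_risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per SNP) for one subject."""
    return sum(g * w for g, w in zip(allele_counts, weights))


def auc(scores, labels):
    """Chance that a random case outscores a random control; ties count half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def simulate_additive_cohort(n_subjects=400, n_snps=50, seed=0):
    """Toy cohort with purely additive SNP effects (hypothetical, not the study data)."""
    rng = random.Random(seed)
    # Assumed allele frequencies and per-SNP effect sizes (log odds ratio scale).
    freqs = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]
    weights = [rng.gauss(0.0, 0.2) for _ in range(n_snps)]
    genotypes, liabilities = [], []
    for _ in range(n_subjects):
        # Two independent allele draws per SNP give a count of 0, 1 or 2.
        g = [(rng.random() < f) + (rng.random() < f) for f in freqs]
        genotypes.append(g)
        # Liability = additive genetic score plus unit-variance noise.
        liabilities.append(polygenic_risk_score(g, weights) + rng.gauss(0.0, 1.0))
    # Subjects above the median liability are labelled as cases.
    threshold = sorted(liabilities)[n_subjects // 2]
    labels = [1 if x > threshold else 0 for x in liabilities]
    return genotypes, labels, weights
```

Calling `simulate_additive_cohort()`, scoring each subject with `polygenic_risk_score`, and passing the scores and labels to `auc` gives a single accuracy figure for the additive score. A kernel SVM trained on the same genotypes could be evaluated with the same `auc` function, which is the kind of like-for-like comparison the abstract describes.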