Sparse common and distinctive covariates regression
Author(s) - Park Soogeun, Ceulemans Eva, Van Deun Katrijn
Publication year - 2021
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.3270
Subject(s) - covariate, computer science, regression, identification (biology), feature selection, predictive modelling, regression analysis, selection (genetic algorithm), outcome (game theory), machine learning, artificial intelligence, data mining, statistics, econometrics, mathematics, biology, botany, mathematical economics
Large sets of predictors from multiple sources concerning the same observation units and the same criterion are becoming increasingly common in chemometrics. When analyzing such data, chemometricians often have multiple objectives: prediction of the criterion, variable selection, and identification of underlying processes associated with individual predictor sources or with several sources jointly. Existing methods address the first two aims, uncovering the predictive mechanisms and the relevant variables therein, for a single block of predictor variables, but the challenge of uncovering joint and distinctive predictive mechanisms and the relevant variables therein in the multisource setting still needs to be addressed. To this end, we present a multiblock extension of principal covariates regression that aims to find the complex mechanisms in which several sources or a single source may be involved; taken together, these mechanisms predict an outcome of interest. We call this method sparse common and distinctive covariates regression (SCD‐CovR). Through a simulation study, we demonstrate that SCD‐CovR provides competitive solutions when compared with related methods. The method is also illustrated via an application to a publicly available dataset.
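For readers unfamiliar with the base method, the following is a minimal sketch of classical (non-sparse, single-block) principal covariates regression, the technique that SCD‐CovR extends. It is not the authors' implementation and does not include the sparsity penalties or the multiblock structure of SCD‐CovR. The function name `pcovr`, its parameters, and the scaling of the two terms are illustrative assumptions. Here `alpha` weights the reconstruction of the predictors and `1 - alpha` weights the prediction of the criterion; the paper's own convention may differ.

```python
import numpy as np

def pcovr(X, y, n_components, alpha):
    """Classical principal covariates regression via an eigendecomposition.

    Hypothetical helper for illustration only (not from the paper).
    alpha = 1 reduces to principal component scores of X;
    alpha close to 0 emphasizes prediction of y.
    """
    # Center predictors and criterion.
    X = X - X.mean(axis=0)
    y = (y - y.mean()).reshape(-1, 1)

    # Projection of y onto the column space of X (least-squares fit).
    X_pinv = np.linalg.pinv(X)
    y_hat = X @ (X_pinv @ y)

    # Weighted combination of the X cross-product and the fitted-y
    # cross-product, each scaled by its total sum of squares.
    G = (alpha * (X @ X.T) / np.sum(X ** 2)
         + (1.0 - alpha) * (y_hat @ y_hat.T) / np.sum(y ** 2))

    # Orthonormal component scores: leading eigenvectors of G.
    # eigh returns eigenvalues in ascending order, so reverse the columns.
    _, eigvecs = np.linalg.eigh(G)
    T = eigvecs[:, ::-1][:, :n_components]

    W = X_pinv @ T      # component weights, so that X @ W equals T
    P_x = T.T @ X       # loadings that reconstruct X from T
    p_y = T.T @ y       # regression weights that predict y from T
    return T, W, P_x, p_y
```

Calling `pcovr(X, y, n_components=2, alpha=0.5)` on an `n x p` array `X` and a length-`n` vector `y` returns scores `T` (`n x 2`), weights `W` (`p x 2`), loadings `P_x` (`2 x p`), and regression weights `p_y` (`2 x 1`). The sketch assumes that `X` has rank of at least `n_components` after centering.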