Relevant and irrelevant predictors in PLS2
Author(s) - Stocchero Matteo
Publication year - 2020
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.3237
Subject(s) - feature selection, partial least squares regression, latent variable, measure (data warehouse), cluster analysis, stability (learning theory), regression, computer science, mathematics, variable (mathematics), data mining, regression analysis, redundancy (engineering), feature (linguistics), linear regression, artificial intelligence, statistics, machine learning, mathematical analysis, linguistics, philosophy, operating system
Partial least squares (PLS) regression is widely applied to regression problems in which the data contain correlated and redundant predictors. Although many studies on feature selection and variable importance have been published, selecting the subset of relevant features that explain the behaviour of the system under investigation, and the subset of irrelevant predictors that can be ignored, is still an open issue. Here, a new strategy to measure variable importance is introduced, and a wrapper method is proposed for selecting relevant and irrelevant predictors. The variable importance measure is developed by grouping the predictors into classes of equivalent features through clustering in the latent space and by considering the variations in the goodness of the PLS2 model produced by perturbing the block of predictors. The wrapper method implements stability selection using bootstrap and feature selection. The behaviour of the new variable importance score and its use within the wrapper method are discussed by investigating two simulated data sets and one real data set.
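
The general idea of combining a perturbation-based importance score with bootstrap stability selection can be illustrated with a minimal Python sketch. It is not the method of the paper: the latent-space clustering step is omitted, each predictor is perturbed on its own by a random permutation, and the goodness of the PLS2 model is taken to be the in-sample R². The function names (`pls2_fit`, `r2`, `selection_frequency`), the threshold of 0.05 and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def pls2_fit(X, Y, n_components):
    """Fit a PLS2 model with a NIPALS-style deflation (illustrative helper).
    Returns (x_mean, y_mean, B) such that Y is approximated by
    (X - x_mean) @ B + y_mean."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E'F.
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                      # scores
        tt = t @ t
        p = E.T @ t / tt               # X loadings
        q = F.T @ t / tt               # Y loadings
        E = E - np.outer(t, p)         # deflate both blocks
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.solve(P.T @ W, Q.T)   # regression coefficients
    return x_mean, y_mean, B

def r2(X, Y, model):
    """Goodness of fit: overall R^2 across all responses."""
    x_mean, y_mean, B = model
    resid = Y - ((X - x_mean) @ B + y_mean)
    return 1.0 - (resid ** 2).sum() / ((Y - Y.mean(axis=0)) ** 2).sum()

def selection_frequency(X, Y, n_components=2, n_boot=30,
                        threshold=0.05, seed=0):
    """Fraction of bootstrap samples in which permuting predictor j
    lowers R^2 by more than `threshold` (an assumed cut-off)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)        # bootstrap resample
        Xb, Yb = X[idx], Y[idx]
        model = pls2_fit(Xb, Yb, n_components)
        base = r2(Xb, Yb, model)
        for j in range(p):
            Xp = Xb.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # perturb one predictor
            if base - r2(Xp, Yb, model) > threshold:
                counts[j] += 1
    return counts / n_boot

# Simulated example (assumed data): 6 predictors, only the first two
# influence the 2 responses.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))
C = np.array([[1.0, 0.8], [0.3, -1.0]])
Y = X[:, :2] @ C + 0.1 * rng.normal(size=(200, 2))
freq = selection_frequency(X, Y)
print(np.round(freq, 2))
```

Predictors with a selection frequency close to 1 would be candidates for the relevant subset and those close to 0 for the irrelevant subset; where to place the cut between the two is a design choice that this sketch leaves open.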
