Robust and classical PLS regression compared
Author(s) -
Liebmann Bettina,
Filzmoser Peter,
Varmuza Kurt
Publication year - 2010
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1279
Subject(s) - outlier , data set , robust regression , multivariate statistics , regression , computer science , regression analysis , linear regression , set (abstract data type) , test set , statistics , artificial intelligence , data mining , mathematics , pattern recognition (psychology) , machine learning , programming language
Abstract - Classical PLS regression is a well‐established technique in multivariate data analysis. Since classical PLS is known to be severely affected by outliers in the data or by deviations from normality, several PLS regression methods with robust behavior towards data contamination have been proposed. We compare the performance of the classical SIMPLS approach with that of partial robust M regression (PRM). Both methods are applied to three different data sets that include intentionally created outliers. A simulated data set with known true model parameters gives insight into the modeling performance with increasing data contamination. QSPR data are modified with a cluster of outlying observations. A third data set from near infrared (NIR) spectroscopy is likely to include noise and experimental errors already in the original variables, and is further contaminated with outliers. To provide a sound comparison of the considered methods, we apply repeated double cross validation. This validation procedure optimizes the model complexity (number of PLS components) and estimates the models' prediction performance based on test‐set prediction errors. All studied robust regression models outperform the classical PLS models when outlying observations are present in the data. For uncontaminated data, the prediction performances of the classical and the robust models are in the same range. Copyright © 2010 John Wiley & Sons, Ltd.
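The repeated double cross validation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the model is a plain single-response PLS1 with NIPALS-style deflation (neither SIMPLS nor PRM), the number of components is chosen as the one with the smallest inner-loop squared error, and the function names (`pls1_fit`, `pls1_predict`, `repeated_double_cv`, `make_demo_data`), fold counts, and synthetic data are all hypothetical. The inner loop selects the model complexity on calibration data only; the outer loop then collects test-set prediction errors from objects that played no part in that selection.

```python
import math
import random


def pls1_fit(X, y, n_comp):
    """Fit PLS1 (one response) by NIPALS-style deflation (assumed stand-in model)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                 # y residuals
    comps = []
    for _ in range(n_comp):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "comps": comps}


def pls1_predict(model, x, n_comp):
    """Predict one object using only the first n_comp components."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in model["comps"][:n_comp]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


def repeated_double_cv(X, y, max_comp=3, n_rep=5, k_outer=4, k_inner=3, seed=0):
    """Return one test-set RMSE per repetition (hypothetical default settings)."""
    rng = random.Random(seed)
    n = len(y)
    rmse_per_rep = []
    for _ in range(n_rep):
        idx = list(range(n))
        rng.shuffle(idx)  # new random split in every repetition
        sq_err = 0.0
        for o in range(k_outer):
            test = idx[o::k_outer]
            calib = [i for i in idx if i not in test]
            # Inner loop: choose the number of components on calibration data only.
            sse = [0.0] * max_comp
            for s in range(k_inner):
                val = calib[s::k_inner]
                train = [i for i in calib if i not in val]
                model = pls1_fit([X[i] for i in train], [y[i] for i in train], max_comp)
                for i in val:
                    for a in range(1, max_comp + 1):
                        sse[a - 1] += (y[i] - pls1_predict(model, X[i], a)) ** 2
            a_opt = 1 + sse.index(min(sse))  # simplification: plain minimum
            # Outer loop: refit on the whole calibration set, predict the test set.
            model = pls1_fit([X[i] for i in calib], [y[i] for i in calib], a_opt)
            for i in test:
                sq_err += (y[i] - pls1_predict(model, X[i], a_opt)) ** 2
        rmse_per_rep.append(math.sqrt(sq_err / n))
    return rmse_per_rep


def make_demo_data(n=40, seed=1):
    """Synthetic, uncontaminated data for illustration only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(n)]
    y = [2 * row[0] - row[1] + rng.gauss(0, 0.1) for row in X]
    return X, y
```

Calling `repeated_double_cv(*make_demo_data())` yields one error estimate per repetition, and the spread of these values indicates how much the estimate depends on the random split. Comparing a classical and a robust method in this framework would amount to swapping the fit and predict functions while keeping the splits identical.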