Robustness control in bilinear modeling based on maximum correntropy
Author(s) - Fonseca Diaz Valeria, De Ketelaere Bart, Aernouts Ben, Saeys Wouter
Publication year - 2020
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.3215
Subject(s) - robustness (evolution), outlier, partial least squares regression, weighting, computer science, regression, robust regression, mathematics, bilinear interpolation, artificial intelligence, statistics, biochemistry, chemistry, gene, medicine, radiology
We present a bilinear regression model for multivariate calibration based on the maximum correntropy criterion (MCC), whose robustness can be easily controlled. MCC regression methods can be more effective when the assumption of normality does not hold or when data are contaminated with outliers. These methods are competitive when the degree of robustness against outliers needs to be controlled. By controlling the robustness, information from candidate outliers can be partially retained rather than completely included or discarded during calibration. Within the context of bilinear regression models, we propose an MCC approach built on the statistically inspired modification of partial least squares (SIMPLS), named maximum correntropy‐weighted partial least squares (MCW‐PLS). Thanks to the controllable robustness of MCC models, observations are upweighted or downweighted during the calibration process, yielding robust models with soft discrimination of samples. Such weighting is an important advantage, especially when samples are not drawn from a normal distribution. Applications to three real case studies are presented. They reveal three main features of MCW‐PLS: robustness control between SIMPLS and robust SIMPLS (RSIMPLS), improved prediction performance of bilinear calibration models, and the possibility of detecting the most informative samples in a calibration set.
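The idea of soft, controllable downweighting can be illustrated with a minimal sketch. This is not the MCW‐PLS algorithm of the paper: it applies correntropy-style weights to ordinary linear regression rather than to SIMPLS, and it assumes a Gaussian kernel whose width `sigma` acts as the robustness control. The function names, the synthetic data, and the iteratively reweighted fitting scheme are all illustrative assumptions.

```python
import numpy as np


def gaussian_kernel_weights(residuals, sigma):
    """Assumed Gaussian-kernel weights exp(-e^2 / (2 sigma^2)), each in [0, 1]."""
    r = np.asarray(residuals, dtype=float)
    return np.exp(-(r ** 2) / (2.0 * sigma ** 2))


def mcc_linear_fit(x, y, sigma, n_iter=50, tol=1e-10):
    """Hypothetical helper: fit y ~ b0 + x b by iteratively reweighted least squares.

    A small sigma downweights large residuals strongly (robust behaviour);
    a very large sigma gives weights near 1 (ordinary least squares behaviour).
    """
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]  # start from ordinary least squares
    weights = np.ones(len(y))
    for _ in range(n_iter):
        weights = gaussian_kernel_weights(y - A @ coef, sigma)
        sw = np.sqrt(weights)
        # Weighted least squares: scale rows of A and y by sqrt(weight)
        new_coef = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(new_coef - coef)) < tol
        coef = new_coef
        if converged:
            break
    return coef, weights


# Synthetic data (not from the paper): an exact line with one gross outlier.
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0
y[5] += 10.0

coef_small, w_small = mcc_linear_fit(x, y, sigma=0.5)  # strong downweighting
coef_large, w_large = mcc_linear_fit(x, y, sigma=1e6)  # weights stay close to 1

print("sigma=0.5 :", coef_small, "outlier weight:", w_small[5])
print("sigma=1e6 :", coef_large, "outlier weight:", w_large[5])
```

In this sketch, intermediate values of `sigma` give the outlier a weight strictly between 0 and 1, which is the sense in which information from a candidate outlier is partially retained rather than fully included or discarded.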