A PLS kernel algorithm for data sets with many variables and few objects. Part II: Cross‐validation, missing data and examples
Author(s) -
Rännar Stefan,
Geladi Paul,
Lindgren Fredrik,
Wold Svante
Publication year - 1995
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1180090604
Subject(s) - missing data, kernel (algebra), data set, set (abstract data type), cross validation, multivariate statistics, algorithm, calibration, series (stratigraphy), computer science, mathematics, data mining, statistics, combinatorics, programming language, paleontology, biology
This is Part II of a series concerning the PLS kernel algorithm for data sets with many variables and few objects. Here the issues of cross‐validation and missing data are investigated. Both partial and full cross‐validation are evaluated in terms of predictive residuals and speed, and both are illustrated on real examples. Two related approaches to the missing data problem are presented: a full EM algorithm, and a reduced EM algorithm that applies when the number of missing values is small. The two examples are multivariate calibration data sets. The first consists of UV–visible data measured on mixtures of four metal ions; the second consists of FT‐IR measurements on mixtures of four different organic substances.
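As a rough illustration of why a kernel formulation suits cross‐validation when there are many variables and few objects, the sketch below checks one general linear-algebra fact: the object-by-object association matrix of a training subset is a sub-block of the association matrix of the full data. It is not the authors' algorithm. The data are random, the name `training_kernel` is hypothetical, and the identity assumes the columns are not re-centred or re-scaled for each cross‐validation segment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_objects, n_variables = 6, 500          # few objects, many variables (made-up sizes)
X = rng.normal(size=(n_objects, n_variables))

# Association ("kernel") matrix: n_objects x n_objects, whatever n_variables is.
K = X @ X.T


def training_kernel(K, left_out):
    """Return the kernel of the objects that remain after removing `left_out`.

    Hypothetical helper: it only deletes rows and columns of K and never
    touches the wide matrix X.
    """
    keep = np.setdiff1d(np.arange(K.shape[0]), left_out)
    return K[np.ix_(keep, keep)], keep


# Leave-one-out check: the sub-block matches a kernel rebuilt from the raw rows.
max_diff = 0.0
for i in range(n_objects):
    K_train, keep = training_kernel(K, [i])
    K_direct = X[keep] @ X[keep].T
    max_diff = max(max_diff, float(np.abs(K_train - K_direct).max()))

print(K.shape, max_diff)  # max_diff should be at rounding-error level
```

In this sketch each cross‐validation round works only on a small square matrix, so its cost does not grow with the number of variables. If centring or scaling is recomputed for every segment, the sub-block shortcut no longer holds exactly and the kernel has to be corrected or rebuilt.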