An optimization‐based undeflated PLS (OUPLS) method to handle missing data in the training set
Author(s) -
Puwakkatiya-Kankanamage Eranda Harinath,
García-Muñoz Salvador,
Biegler Lorenz T.
Publication year - 2014
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.2618
Subject(s) - partial least squares regression , missing data , latent variable , computer science , principal component analysis , projection (relational algebra) , set (abstract data type) , artificial intelligence , data mining , pattern recognition (psychology) , machine learning , algorithm , programming language
Advances in sensory systems have led to many industrial applications with large amounts of highly correlated data, particularly in chemical and pharmaceutical processes. With these correlated data sets, it becomes important to consider advanced modeling approaches built to deal with correlated inputs in order to understand the underlying sources of variability and how this variability will affect the final quality of the product. In addition to the correlated nature of the data sets, it is also common to find missing elements and noise in these data matrices. Latent variable regression methods such as partial least squares or projection to latent structures (PLS) have gained much attention in industry for their ability to handle ill‐conditioned matrices with missing elements. This feature of the PLS method is accomplished through the nonlinear iterative PLS (NIPALS) algorithm, with a simple modification to consider the missing data. Moreover, in expectation maximization PLS (EM‐PLS), imputed values are provided for missing data elements as initial estimates, conventional PLS is then applied to update these elements, and the process iterates to convergence. This study extends previous work on principal component analysis (PCA), where we introduced nonlinear programming (NLP) as a means to estimate the parameters of the PCA model. Here, we focus on the parameters of a PLS model. As an alternative to modified NIPALS and EM‐PLS, this paper presents an efficient NLP‐based technique to find model parameters for PLS, where the desired properties of the parameters can be explicitly posed as constraints in the optimization problem of the proposed algorithm. We also present a number of simulation studies, where we compare the effectiveness of the proposed algorithm with that of competing algorithms. Copyright © 2014 John Wiley & Sons, Ltd.
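The "simple modification" of NIPALS mentioned in the abstract is commonly described as computing each score and loading by a least-squares regression over the observed entries only. The sketch below illustrates that idea for a single PLS component with a univariate, fully observed response. It is not the OUPLS method proposed in the paper, and it is not taken from the article; the function names `masked_regress` and `pls1_component_missing` are hypothetical, missing values are assumed to be encoded as NaN, and `X` and `y` are assumed to be already centered, with at least one observed entry in every row and column.

```python
import numpy as np


def masked_regress(X, v, axis):
    """Least-squares coefficients of X on v, skipping NaN entries of X.

    axis=0: one coefficient per column of X (v has length n_rows).
    axis=1: one coefficient per row of X (v has length n_cols).
    """
    mask = ~np.isnan(X)
    X0 = np.where(mask, X, 0.0)          # missing entries contribute nothing
    M = mask.astype(float)
    if axis == 0:
        num = X0.T @ v                   # sum over observed rows of x_ij * v_i
        den = M.T @ (v ** 2)             # sum over observed rows of v_i^2
    else:
        num = X0 @ v                     # sum over observed columns of x_ij * v_j
        den = M @ (v ** 2)               # sum over observed columns of v_j^2
    return num / den


def pls1_component_missing(X, y):
    """One PLS component for univariate y, with NaN-aware regressions on X."""
    w = masked_regress(X, y, axis=0)     # weights: regress columns of X on y
    w = w / np.linalg.norm(w)            # normalize the weight vector
    t = masked_regress(X, w, axis=1)     # scores: regress rows of X on w
    q = (y @ t) / (t @ t)                # y-loading (y is fully observed)
    p = masked_regress(X, t, axis=0)     # x-loadings: regress columns of X on t
    return w, t, p, q


# Usage on synthetic rank-one data with two entries removed (illustrative only).
rng = np.random.default_rng(0)
t_true = rng.normal(size=8)
p_true = rng.normal(size=5)
X_full = np.outer(t_true, p_true)
y = 2.0 * t_true
X = X_full.copy()
X[0, 1] = np.nan
X[3, 4] = np.nan
w, t, p, q = pls1_component_missing(X, y)
X_hat = np.outer(t, p)                   # model reconstruction, fills the gaps
```

For noise-free rank-one data such as this, the masked regressions recover the removed entries. With noisy or higher-rank data, the result depends on the pattern of missing entries, which is the situation the abstract's comparison of algorithms addresses.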
