Choosing the number of factors in partial least squares regression: estimating and minimizing the mean squared error of prediction
Author(s) -
Michael C. Denham
Publication year - 2000
Publication title -
Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/1099-128x(200007/08)14:4<351::aid-cem598>3.0.co;2-q
Subject(s) - partial least squares regression , regression analysis , linear regression , mean squared prediction error , cross validation , bootstrapping , variance , statistics , mathematics , computer science , econometrics
We investigate a number of approaches to estimating the mean squared error of prediction (MSEP) in partial least squares (PLS) regression without resorting to external validation. Using two simulation examples based on real data, performances of the methods are evaluated in terms of their accuracy and their usefulness in determining the optimal number of factors to include in the PLS model. We find that for problems with relatively few variables, methods based on ignoring the effect of non‐linearity in PLS regression or using a linear approximation give good estimates of MSEP, with little to choose between them. However, where linear approximation is feasible, we prefer it, since it gives estimates of MSEP which have lower bias and variance than cross‐validation. In situations where there are large numbers of variables, these methods break down. In these circumstances, cross‐validation and bootstrapping methods are better able to capture the changes in MSEP with the number of factors fitted and thus are more useful for identifying the optimal PLS regression model. Copyright © 2000 John Wiley & Sons, Ltd.
