A systematic evaluation of the benefits and hazards of variable selection in latent variable regression. Part II. Practical applications
Author(s) - Baumann K., von Korff M., Albert H.
Publication year - 2002
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.729
Subject(s) - leave-one-out cross-validation, latent variable, feature selection, cross-validation, regression analysis, linear regression, statistics, mathematics, machine learning, artificial intelligence, computer science, quantitative structure–activity relationship, chemometrics
Leave‐multiple‐out cross‐validation (LMO‐CV) is compared to leave‐one‐out cross‐validation (LOO‐CV) as the objective function in variable selection for four real data sets. Two data sets stem from NIR spectroscopy and two from quantitative structure–activity relationships. In all four cases, LMO‐CV outperforms LOO‐CV with respect to prediction quality, model complexity (number of latent variables), and model size (number of variables). The number of objects left out in LMO‐CV has an important effect on the final results: it controls both the number of latent variables in the final model and the prediction quality. The results of variable selection need to be validated carefully with a validation step that is independent of the variable selection, because the internal figures of merit (i.e. anything derived from the objective function value) do not correlate well with the external predictivity of the selected models. This is most obvious for LOO‐CV: without further constraints, LOO‐CV always shows the best internal figures of merit and the worst prediction quality. Copyright © 2002 John Wiley & Sons, Ltd.
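The following is a minimal sketch, not the authors' code, of how LOO‐CV and LMO‐CV can be set up as alternative objective functions for scoring a latent variable (PLS) model. The synthetic data, the scikit-learn utilities, and the specific choices of split fraction and number of repeats are assumptions for illustration only; in LMO‐CV a sizeable fraction of the objects is left out per split, repeated over many random splits.

```python
# Sketch: LOO-CV vs. LMO-CV as objective functions for a PLS model.
# Data and parameter choices are illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))                      # 60 objects, 20 candidate variables
y = X[:, :5] @ rng.normal(size=5) + 0.1 * rng.normal(size=60)

pls = PLSRegression(n_components=3)                # latent variable regression model

# LOO-CV: each split leaves exactly one object out (n splits in total).
mse_loo = -cross_val_score(pls, X, y, cv=LeaveOneOut(),
                           scoring="neg_mean_squared_error").mean()

# LMO-CV: each split leaves multiple objects out (here 30% of the objects),
# repeated over many random splits.
lmo = ShuffleSplit(n_splits=50, test_size=0.3, random_state=0)
mse_lmo = -cross_val_score(pls, X, y, cv=lmo,
                           scoring="neg_mean_squared_error").mean()

print(f"LOO-CV MSE: {mse_loo:.3f}   LMO-CV MSE: {mse_lmo:.3f}")
```

In a variable selection loop, either score would serve as the objective function over candidate variable subsets; the paper's point is that the winning subset must still be validated on data that played no role in the selection.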