Classification of GC‐MS measurements of wines by combining data dimension reduction and variable selection techniques
Author(s) - Ballabio Davide, Skov Thomas, Leardi Riccardo, Bro Rasmus
Publication year - 2008
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.1173
Subject(s) - overfitting, linear discriminant analysis, principal component analysis, feature selection, dimensionality reduction, selection (genetic algorithm), chemometrics, mathematics, dimension (graph theory), partial least squares regression, pattern recognition (psychology), reduction (mathematics), statistics, artificial intelligence, chromatography, computer science, chemistry, artificial neural network, pure mathematics, geometry
Different classification methods (Partial Least Squares Discriminant Analysis, Extended Canonical Variates Analysis and Linear Discriminant Analysis), in combination with variable selection approaches (Forward Selection and Genetic Algorithms), were compared, evaluating their capabilities in the geographical discrimination of wine samples. Sixty‐two samples were analysed by means of dynamic headspace gas chromatography mass spectrometry (HS‐GC‐MS) and the entire chromatographic profile was considered to build the dataset. Since variable selection techniques pose a risk of overfitting when a large number of variables is used, a method for coupling data dimension reduction and variable selection was proposed. This approach compresses windows of the original data by retaining only significant components of local Principal Component Analysis models. The subsequent variable selection is then performed on these locally derived score variables. The results confirmed that the classification models achieved on the reduced data were better than those obtained on the entire chromatographic profile, with the exception of Extended Canonical Variates Analysis, which gave acceptable models in both cases. Copyright © 2008 John Wiley & Sons, Ltd.
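The two-step idea in the abstract (compress windows of the profile with local PCA models, then select among the resulting score variables) can be illustrated with a minimal sketch. This is not the authors' implementation. Several choices are assumptions made only for illustration: the names `windowed_pca_scores`, `loo_accuracy` and `forward_selection` are hypothetical, windows are contiguous and of equal width, "significant components" is approximated by a cumulative explained-variance threshold (`var_fraction`), and a nearest-centroid classifier scored by leave-one-out accuracy stands in for the PLS-DA, ECVA and LDA models compared in the paper.

```python
import numpy as np


def windowed_pca_scores(X, window_size, var_fraction=0.95):
    """Compress contiguous column windows of X with local PCA models.

    Returns the concatenated score matrix and, for each score column,
    the (window_index, component_index) it came from.
    The variance threshold is an assumed significance criterion.
    """
    X = np.asarray(X, dtype=float)
    blocks, origin = [], []
    for w, start in enumerate(range(0, X.shape[1], window_size)):
        window = X[:, start:start + window_size]
        centered = window - window.mean(axis=0)
        # SVD of the mean-centered window gives the local PCA model.
        U, s, _ = np.linalg.svd(centered, full_matrices=False)
        total = np.sum(s ** 2)
        if total == 0.0:
            continue  # constant window: nothing to retain
        explained = np.cumsum(s ** 2) / total
        # Smallest number of components that reaches the variance target.
        k = min(int(np.searchsorted(explained, var_fraction)) + 1, len(s))
        blocks.append(U[:, :k] * s[:k])  # scores T = U * S
        origin.extend((w, c) for c in range(k))
    return np.hstack(blocks), origin


def loo_accuracy(T, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in model)."""
    T = np.asarray(T, dtype=float)
    y = np.asarray(y)
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        classes = np.unique(y[train])
        centroids = np.array([T[train & (y == c)].mean(axis=0) for c in classes])
        dists = np.linalg.norm(centroids - T[i], axis=1)
        hits += int(classes[np.argmin(dists)] == y[i])
    return hits / len(y)


def forward_selection(T, y, max_vars=5):
    """Greedy forward selection: add the score column that most improves accuracy."""
    selected, best = [], 0.0
    while len(selected) < max_vars:
        candidates = [j for j in range(T.shape[1]) if j not in selected]
        if not candidates:
            break
        scores = [loo_accuracy(T[:, selected + [j]], y) for j in candidates]
        top = int(np.argmax(scores))
        if scores[top] <= best:
            break  # no candidate improves the model; stop
        selected.append(candidates[top])
        best = scores[top]
    return selected, best


# Synthetic demonstration data (not the wine dataset): two classes that
# differ only in three adjacent variables inside the second window.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 10)
X = rng.normal(size=(20, 30))
X[y == 1, 12:15] += 5.0

T, origin = windowed_pca_scores(X, window_size=10)
selected, accuracy = forward_selection(T, y)
```

Each entry of `origin` maps a selected score column back to its window, so a selection can still be read in terms of regions of the original profile. Because the same leave-one-out accuracy both drives the selection and reports the result, the returned value is optimistic; an independent test set or an outer validation loop would be needed for a fair estimate.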