Comparison of multivariate linear regression methods in micro‐Raman spectrometric quantitative characterization
Author(s) -
Farkas Attila,
Vajna Balázs,
Sóti Péter L.,
Nagy Zsombor K.,
Pataki Hajnalka,
Van der Gucht Filip,
Marosi György
Publication year - 2015
Publication title -
Journal of Raman Spectroscopy
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.748
H-Index - 110
eISSN - 1097-4555
pISSN - 0377-0486
DOI - 10.1002/jrs.4672
Subject(s) - partial least squares regression, univariate, principal component regression, multivariate statistics, statistics, principal component analysis, linear regression, mathematics, cross validation, regression analysis, bilinear interpolation
Chemical imaging was used in this study as a powerful analytical tool to characterize pharmaceuticals in solid form. Most analyses are evaluated with bilinear modelling, using only the pure component spectra or just the chemical images themselves to estimate the concentrations in each pixel, which is far from true quantitative determination. Our aim was to create more accurate concentration images using regression methods. For the first time in chemical imaging, variable selection with interval partial least squares (interval PLS) and with genetic algorithms (PLS‐GA) was applied to increase the efficiency of the models. These were compared to numerous bilinear modelling and multivariate linear regression methods such as univariate regression, classical least squares (CLS), multivariate curve resolution–alternating least squares (MCR‐ALS), principal component regression (PCR) and partial least squares (PLS). Two‐component spray‐dried pharmaceuticals were used as a model. The paper shows that, in contrast to the usual practice of using either external validation or cross‐validation, both should be performed simultaneously in order to get a clear picture of the prediction errors and to be able to select the appropriate models. Using PLS with variable selection, the root mean square errors were reduced to 3% per pixel by keeping only those peaks that are truly necessary for the estimation of concentrations. It is also shown that interval PLS can point out the best peak for univariate regression, and can thereby be of great help even when regulations allow only univariate models for product quality testing. Variable selection, besides yielding more accurate overall concentrations across a Raman map, also reduces the deviation among pixel concentrations within the images, thereby increasing the sensitivity of homogeneity studies. Copyright © 2015 John Wiley & Sons, Ltd.
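As a rough illustration of the simplest approach named in the abstract, the sketch below estimates per-pixel concentrations by classical least squares (CLS) from pure component spectra and reports a root mean square error. It is not the authors' implementation: the synthetic Gaussian "spectra", the noise level, the array shapes and the function names `cls_concentrations` and `rmse` are all assumptions made for the example.

```python
import numpy as np


def cls_concentrations(pixel_spectra, pure_spectra):
    """Estimate component fractions per pixel by classical least squares.

    pixel_spectra: array of shape (n_pixels, n_channels)
    pure_spectra:  array of shape (n_components, n_channels)
    Returns an array of shape (n_pixels, n_components).
    """
    # Solve pure_spectra.T @ C.T ~= pixel_spectra.T in the least-squares sense.
    coeffs, *_ = np.linalg.lstsq(pure_spectra.T, pixel_spectra.T, rcond=None)
    return coeffs.T


def rmse(predicted, reference):
    """Root mean square error between predicted and reference values."""
    diff = np.asarray(predicted) - np.asarray(reference)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic two-component data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
axis = np.linspace(0.0, 1.0, 200)
pure = np.vstack([
    np.exp(-((axis - 0.3) / 0.03) ** 2),  # assumed peak of component A
    np.exp(-((axis - 0.7) / 0.05) ** 2),  # assumed peak of component B
])

fraction_a = rng.uniform(0.0, 1.0, size=(50, 1))
true_conc = np.hstack([fraction_a, 1.0 - fraction_a])

# Each pixel spectrum is a linear mixture of the pure spectra plus noise.
spectra = true_conc @ pure + rng.normal(0.0, 0.01, size=(50, 200))

estimated = cls_concentrations(spectra, pure)
error = rmse(estimated, true_conc)
print(f"RMSE per pixel: {error:.4f}")
```

On real Raman maps the linear mixing assumption holds only approximately, which is one reason regression models calibrated against reference concentrations (PCR, PLS, and PLS with variable selection) may give lower prediction errors than CLS.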