Centering and scaling in component analysis
Author(s) - Bro Rasmus, Smilde Age K.
Publication year - 2003
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.773
Subject(s) - scaling, projection (relational algebra), multidimensional scaling, focus (optics), bilinear interpolation, computer science, algorithm, preprocessor, component (thermodynamics), mode (computer interface), missing data, mathematics, artificial intelligence, geometry, physics, machine learning, optics, computer vision, thermodynamics, operating system
In this paper the purpose and use of centering and scaling are discussed in depth. The main focus is on two‐way bilinear data analysis, but the results can easily be generalized to multiway data analysis. In fact, one of the aims of this paper is to show that if two‐way centering and scaling are understood, then multiway centering and scaling are quite straightforward. In the literature it is often stated that preprocessing of multiway arrays is difficult, but here it is shown that most of the difficulties do not pertain to three‐ and higher‐way modeling in particular. It is shown that centering is most conveniently seen as a projection step, where the data are projected onto certain well‐defined spaces within a given mode. This view of centering helps to explain why, for example, centering data with missing elements is likely to be suboptimal if there are many missing elements. Building a model for data consists of two parts: postulating a structural model and using a method to estimate the parameters. Centering has to do with the first part: when centering, a model including offsets is postulated. Scaling has to do with the second part: when scaling, another way of fitting the model is employed. It is shown that centering is simply a convenient technique to estimate model parameters for models with certain offsets, but this does not work for all types of offsets. It is also shown that scaling is a way to fit models with a weighted least squares loss function and that sometimes this change in objective function cannot be performed by a simple scaling step. Further practical aspects of and alternatives to centering and scaling are discussed, and examples are used throughout to show that the conclusions in the paper are not only of theoretical interest but can have an impact on practical data analysis. Copyright © 2003 John Wiley & Sons, Ltd.
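
The projection view of centering mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper: the notation P = I - (1/n) 1 1^T, the helper names (`centering_projector`, `matmul`, `center_columns`) and the toy data are assumptions made here for illustration. The sketch only checks the standard linear-algebra fact that subtracting column means from a two-way array gives the same result as premultiplying the array by that projection matrix.

```python
# Illustrative sketch only: helper names and data are hypothetical, not from the paper.

def centering_projector(n):
    """Return P = I - (1/n) * 1 1^T as an n x n nested list."""
    return [[(1.0 if i == j else 0.0) - 1.0 / n for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def center_columns(X):
    """Centre across the first mode by subtracting each column's mean."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]

# Toy two-way array: 3 samples (rows) by 2 variables (columns).
X = [[1.0, 10.0],
     [2.0, 20.0],
     [6.0, 30.0]]

P = centering_projector(len(X))
projected = matmul(P, X)        # centering written as a projection
subtracted = center_columns(X)  # centering written as mean subtraction

# P is idempotent (P P = P), which is what makes it a projection.
PP = matmul(P, P)
```

Here `projected` and `subtracted` agree up to rounding, and each column of the result sums to zero. Because the projection acts on complete columns, the sketch also hints at why centering becomes less well defined when many elements are missing, as the abstract notes.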
