A random number experiment to simulate resample model evaluations
Author(s) - Mager Peter P.
Publication year - 1996
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/(sici)1099-128x(199605)10:3<221::aid-cem412>3.0.co;2-2
Subject(s) - resampling, randomness, mathematics, statistics, statistical hypothesis testing, multivariate statistics, sample size determination, econometrics
Gaussian‐distributed scores within the range −4 to +4 were generated with a random number generator. The random sample was divided sequentially into three subsamples, of equal and of unequal sizes. This partitioning produces different internal correlation‐regression structures depending on the subsample size. The subsamples also show departures from multivariate normality, which is misleading for hypothesis testing. The randomness of raw and vector‐valued observations is unpredictable after subsampling. Furthermore, the chance of obtaining outlying observations must be taken into account. Two sets of variables drawn from the subsamples did not show remarkable relationships. The situation changed after omission of certain variables and subsample members. The formally significant equations are artifacts and show the real danger of selecting regressors from a variable pool of individual subsamples. Consequently, sequential resampling is unsuitable for testing statistical hypotheses. A major open question remains in the daily practice of chemometrics and in QSAR and 3D QSAR designs: how to select randomly the number of groups and their sizes. However, sequential resampling may be useful for diagnostic statistics that check the assumptions of the underlying theory of hypothesis testing.
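
The kind of experiment described in the abstract can be sketched roughly as follows. This is only an illustration, not the author's procedure: the sample size (90), the number of variables (6), the subsample sizes, the seed, and all function names are assumptions, since the abstract gives none of these details. The sketch draws standard normal scores restricted to [−4, +4], cuts the sample sequentially into three blocks, and reports the largest absolute off-diagonal correlation inside each block.

```python
import numpy as np


def simulate_scores(n_obs, n_vars, seed=0):
    """Draw standard normal scores, redrawing any value outside [-4, 4].

    n_obs, n_vars and seed are hypothetical; the abstract does not state them.
    """
    rng = np.random.default_rng(seed)
    scores = rng.standard_normal((n_obs, n_vars))
    outside = np.abs(scores) > 4
    while outside.any():
        # Redraw only the out-of-range entries until all lie in [-4, 4].
        scores[outside] = rng.standard_normal(int(outside.sum()))
        outside = np.abs(scores) > 4
    return scores


def sequential_split(scores, sizes):
    """Cut the sample into consecutive blocks with the given sizes."""
    if sum(sizes) != len(scores):
        raise ValueError("subsample sizes must sum to the sample size")
    cuts = np.cumsum(sizes)[:-1]
    return np.split(scores, cuts)


def max_abs_offdiag_corr(block):
    """Largest absolute pairwise correlation between distinct variables."""
    corr = np.corrcoef(block, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diagonal)))


if __name__ == "__main__":
    scores = simulate_scores(n_obs=90, n_vars=6)
    # Equal and unequal sequential partitions (sizes are assumed values).
    for sizes in [(30, 30, 30), (60, 20, 10)]:
        blocks = sequential_split(scores, sizes)
        summary = [round(max_abs_offdiag_corr(b), 2) for b in blocks]
        print(sizes, summary)
```

Because the variables are generated independently, any sizeable correlation reported for a block arises by chance alone. Small blocks will typically show larger chance correlations than large ones, which is consistent with the abstract's warning about selecting regressors within individual subsamples.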