Goodness‐of‐fit using very small but related samples with application to censored data estimation of PCB contamination
Author(s) - Johnson Richard A., Gan D. Robert, Berthouex P. M.
Publication year - 1995
Publication title - Environmetrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.68
H-Index - 58
eISSN - 1099-095X
pISSN - 1180-4009
DOI - 10.1002/env.3170060403
Subject(s) - statistics , normality , data set , sample (material) , sample size determination , normal distribution , set (abstract data type) , computer science , goodness of fit , econometrics , order statistic , environmental data , data quality , quality (philosophy) , mathematics , data mining , engineering , metric (unit) , philosophy , chemistry , operations management , epistemology , chromatography , programming language , political science , law
Many environmental data sets contain observations that fall below the method or instrument limit of detection; these are often referred to as ‘censored data’. Censored data create great uncertainty in data analysis for environmental quality management. This uncertainty must be quantified in order to determine compliance, or non‐compliance, with relevant environmental quality standards, to estimate the parameters used in assessment models, and to monitor environmental quality. The analysis of data sets that contain censored observations is not new, but most distribution‐free methods require large sample sizes. For a small data set containing censored observations, e.g. n = 3, as is typical for environmental samples, distribution‐free techniques fail to provide reasonable answers because of the tremendous uncertainty in the analysis. This paper demonstrates one approach that can be used to analyse environmental data sets with small sample sizes. If a distribution is known to be normal, or log‐normal, optimal estimates are easily obtained. In the scientific investigation that motivated this study, at most three observations are available. We address two statistical questions. The first concerns normality. If all the populations are normal, even with different means and variances, we can obtain an overall check of fit. To do so, we propose calculating the correlation between the three order statistics and the normal scores. The distribution of the whole set of correlations is then compared with what it would be if all of the populations were normal. The second concerns transformations: we consider transformations of the original observations and illustrate how our goodness‐of‐fit procedure can be used to select a transformation.
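
The correlation check described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes fully observed triples (no censored values) and takes the normal scores to be the exact expected order statistics of three standard normal draws, ±3/(2√π) and 0. The function names, the Monte Carlo reference distribution, and the empirical-CDF gap used for the comparison are all illustrative assumptions; the abstract does not say how the paper carries out the comparison.

```python
import numpy as np

# Expected order statistics of three standard normal draws (assumed scores):
# -3/(2*sqrt(pi)), 0, +3/(2*sqrt(pi)).
NORMAL_SCORES_N3 = np.array([-1.5 / np.sqrt(np.pi), 0.0, 1.5 / np.sqrt(np.pi)])


def normal_score_correlation(sample):
    """Pearson correlation between the ordered triple and the normal scores."""
    x = np.sort(np.asarray(sample, dtype=float))
    if x.shape != (3,):
        raise ValueError("expected exactly three observations")
    if x[0] == x[2]:
        raise ValueError("correlation is undefined when all values are equal")
    return float(np.corrcoef(x, NORMAL_SCORES_N3)[0, 1])


def simulate_null_correlations(n_sets, rng):
    """Correlations for n_sets triples drawn from a standard normal.

    The correlation is unchanged by shifting or rescaling a triple, so a
    standard normal stands in for any normal population.
    """
    draws = rng.standard_normal((n_sets, 3))
    return np.array([normal_score_correlation(row) for row in draws])


def max_ecdf_gap(a, b):
    """Largest vertical gap between the empirical CDFs of a and b."""
    a, b = np.sort(a), np.sort(b)
    pts = np.concatenate([a, b])
    ecdf_a = np.searchsorted(a, pts, side="right") / a.size
    ecdf_b = np.searchsorted(b, pts, side="right") / b.size
    return float(np.max(np.abs(ecdf_a - ecdf_b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical triples, invented for illustration only (not data from the paper).
    triples = [(0.8, 1.1, 2.9), (4.0, 5.2, 5.9), (0.2, 0.3, 1.7), (10.1, 12.0, 13.8)]
    observed = np.array([normal_score_correlation(t) for t in triples])
    null = simulate_null_correlations(5000, rng)
    print("observed correlations:", np.round(observed, 3))
    print("max ECDF gap vs. simulated normal reference:", round(max_ecdf_gap(observed, null), 3))
```

Because the Pearson correlation is invariant to location and scale, triples from normal populations with different means and variances share the same reference distribution, which is what allows their correlations to be pooled into one overall check. For n = 3 the correlation always lies between √3/2 ≈ 0.866 and 1. To compare candidate transformations in this sketch, one would apply each transformation (for example `np.log`) to the triples before calling `normal_score_correlation` and compare the resulting sets of correlations against the same reference.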
