“Battered data”—Some clinical effects of the abuse of multiple regression methods: The NSD
Author(s) - Herbert, Donald E.
Publication year - 1981
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1118/1.595034
Subject(s) - structural equation modeling , set (abstract data type) , mathematics , constraint (computer aided design) , statistics , data set , computer science , geometry , programming language
The NSD equation, $D = 1850 \times T^{0.11} \times N^{0.24}$, is a celebrated transmogrification of Cohen's two well‐known collations of data on response to clinical irradiations (3° erythema and 0.90 ablation of skin cancer), in which the relation of $D$ and $T$ is fixed by the data selected by Cohen, and the additional constraint that $N$ correspond to a schedule of five treatments per week is imposed subsequently by Ellis. The present paper shows that the equation, even if correct, would have little clinical significance because the proportion, $P$, in which the dose, $D$, elicits the 3° erythema is unspecified: $D = D(P) = D(?)$. Since the two Cohen collations each summarize the measurements on a different set of observational units, it is questionable whether the equation can be correct. This paper further shows that the appropriate least squares (LS) estimate of the three‐parameter equation derived for the Cohen data under the Ellis constraint (five treatments per week) is in fact $D(1.0) = 1710 \times T^{0.54} \times N^{-0.26}$. The paper shows that the NSD equation is also incorrect because the ad hoc method by which Ellis estimates the exponents is inconsistent with the constraints imposed by Cohen and Ellis upon the parameters of the multivariate frequency distribution of the data set. The paper shows that the method by which the correct LS estimates of the exponents were obtained from the Cohen–Ellis data is consistent with these constraints and, therefore, that this equation is a correct graduation of any other set of treatment regimens which is also consistent with the Cohen and Ellis constraints. The paper further shows that for such data sets there are, in fact, only two independent continuous variables, either $D$ and $T$ or $D$ and $N$, since the Ellis constraint requires that $N$ and $T$ be collinear. Thus, the best linear graduation has the typical form $D \simeq 1900\,T^{0.32}$.
This is “best” in the usual sense: both prediction and confidence intervals are provided for the estimates of the conditional “tolerance dose” $D$, and these are not inflated by the presence of a collinear variable. This equation is biased, however, by the absence of the collinear variable. The TDF and the CRE concepts are derived from the NSD and, therefore, the deficiencies of the NSD concept discussed here may be expected to encumber these progeny as well. Two characteristic features of the Cohen (Ellis)‐type data impede the construction of useful estimates of the putative separate effects of $N$ and $T$ upon the response of tissues to irradiation: (1) these data do not include specifications of either a tissue defect or its incidence, and (2) the variables $N$ and $T$ are collinear. Appendices I and II describe methods by which the effects of these features may be eliminated (I) or reduced (II).
Key words: NSD, response surface, dose response equations, isoeffect equations, management equations, least squares, constrained least squares, collinearity, factorial experiments, ridge regression, principal components
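The collinearity argument in the abstract can be illustrated with a small numerical sketch. Taking logarithms turns a power law in T and N into a linear model, log D = log k + a log T + b log N. If N is tied to T by a fixed schedule, log N differs from log T only by a constant, so the design matrix loses a rank and only the sum a + b can be estimated. The code below is a minimal illustration under stated assumptions, not the paper's analysis: the schedule values are hypothetical, the Ellis constraint is idealized as exactly N = 5T/7, the doses are synthetic values generated from the two-variable form quoted in the abstract (not Cohen's data), and the helper name `design_matrix` is invented for this example.

```python
import numpy as np


def design_matrix(T, N):
    """Log-linear design matrix with columns: intercept, log T, log N."""
    return np.column_stack([np.ones_like(T), np.log(T), np.log(N)])


# Hypothetical overall treatment times in days (illustrative values only).
T = np.array([7.0, 14.0, 21.0, 28.0, 35.0, 42.0])

# Assumption: five treatments per week, idealized as exactly N = 5T/7.
N = 5.0 * T / 7.0

X = design_matrix(T, N)

# log N = log T + log(5/7), so the third column is a linear combination
# of the first two and the matrix has rank 2 instead of 3.
rank = np.linalg.matrix_rank(X)

# Synthetic doses generated from the two-variable form D = 1900 * T**0.32.
D = 1900.0 * T**0.32

# Least squares fit of log D on the two independent columns only.
coef, *_ = np.linalg.lstsq(X[:, :2], np.log(D), rcond=None)
k_hat = np.exp(coef[0])
slope = coef[1]

print("rank of three-column design:", rank)
print("fitted coefficient k:", k_hat)
print("fitted exponent of T:", slope)
```

Because the synthetic doses are noise-free, the fit simply recovers the generating coefficient and exponent. The point of the sketch is the rank deficiency: under an exact schedule constraint, no fitting procedure can separate an exponent for N from an exponent for T using data of this kind.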