Ratios, regression statistics, and “spurious” correlations
Author(s) - Berges John A.
Publication year - 1997
Publication title - Limnology and Oceanography
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.7
H-Index - 197
eISSN - 1939-5590
pISSN - 0024-3590
DOI - 10.4319/lo.1997.42.5.1006
Subject(s) - queen (butterfly) , spurious relationship , library science , citation , geography , statistics , genealogy , history , biology , mathematics , computer science , zoology , hymenoptera
Graphical representation and curve fitting are often needed to interpret the complex relationships between measured quantities in aquatic systems. Although graphical and statistical computer packages have made it simpler to perform analyses, they cannot overcome the statistical problems of certain data manipulations. For example, one means of expressing relationships between two variables, A and B, is to plot the ratio AB⁻¹ against B. Inspection of recent issues of Limnology and Oceanography reveals the use of this plot in a range of applications, where it has been used to demonstrate that the percentage particulate organic carbon in Amazonian rivers decreases as total suspended solids increase (Hedges et al. 1994), that epilimnetic total N : total P decreases as total P increases in lakes worldwide (Kopacek et al. 1993), that the germanium : silicate ratio declines in glacial meltwater streams as silicate increases (Chillrud et al. 1994), that mass-specific growth rates of Cladocera decline with increases in mass (Anderson and Benke 1994), and that weight-specific ammonium release declines with increasing weight in freshwater zooplankton (Haga et al. 1995). No criticism of these studies is implied (in fact, some have been exceptionally careful to avoid the pitfalls that follow); the purpose in citing them is to illustrate the widespread use of this type of plot.
B is involved in both X and Y axes on the plot. Intuitively, if A and B were not at all correlated, we might expect that as B became larger, there would be a tendency for AB⁻¹ to become smaller. In fact, this tendency is far stronger than is generally appreciated. To illustrate, a common pseudo-random number generator (a function in Microsoft Excel 5.0) was used to generate two sets of 500 numbers, evenly distributed over the interval 0-10. Linear correlation of A and B shows no significant relationship (Fig. 1a; r = 0.020). If, however, AB⁻¹ vs. B is plotted, an exponential decline in AB⁻¹ with increasing B is seen (Fig. 1b), which appears linear on a log-log plot (Fig. 1c).
If these curves are fit to the model AB⁻¹ = aB^b, an apparently satisfying fit is obtained (Fig. 1b, c). The model can be fit with a nonlinear algorithm (Marquardt-Levenberg; SigmaPlot for Windows 2.0), giving a = 5.15 and b = -0.990, or a linear regression on the log-transformed data can be performed, resulting in similar parameter estimates and r = 0.67. The nonlinear fitting …
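The random-number experiment described above can be approximated in a few lines of Python. This is a minimal sketch, not the author's procedure: it assumes NumPy's generator in place of the Excel 5.0 function, uses an arbitrary seed, and shows only the linear regression on log-transformed data (the Marquardt-Levenberg nonlinear fit is not reproduced). The function name `simulate_ratio_plot` and the returned keys are hypothetical, chosen for illustration, and the numbers it prints will not match the published values exactly.

```python
import numpy as np


def simulate_ratio_plot(n=500, low=0.0, high=10.0, seed=0):
    """Regress log10(A/B) on log10(B) for independent uniform A and B."""
    # Assumption: NumPy's default generator stands in for the Excel function.
    rng = np.random.default_rng(seed)
    a = rng.uniform(low, high, n)
    b = rng.uniform(low, high, n)

    # Drop any exact zeros so the ratio and the logarithms are defined.
    keep = (a > 0) & (b > 0)
    a, b = a[keep], b[keep]

    # Correlation between the raw variables, which are independent by design.
    r_raw = np.corrcoef(a, b)[0, 1]

    # Least-squares fit of log10(A/B) = log10(a_coef) + b_exp * log10(B).
    x = np.log10(b)
    y = np.log10(a / b)
    b_exp, intercept = np.polyfit(x, y, 1)
    r_log = np.corrcoef(x, y)[0, 1]

    return {
        "r_raw": r_raw,
        "a_coef": 10.0 ** intercept,
        "b_exp": b_exp,
        "r_log": r_log,
    }


if __name__ == "__main__":
    for name, value in simulate_ratio_plot().items():
        print(f"{name}: {value:.3f}")
```

Because log(AB⁻¹) = log A - log B, independent A and B with the same distribution give an expected slope near -1 and an expected log-log correlation of -1/√2 (about -0.71), even though A and B are unrelated. That is in line with the magnitude of the fit reported in the text.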