Detecting characteristic hydrological and biogeochemical signals through nonparametric scatter plot analysis of normalized data
Author(s) - Green Mark B., Finlay Jacques C.
Publication year - 2008
Publication title - Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1029/2007wr006509
Subject(s) - smoothing, watershed, biogeochemical cycle, nonparametric statistics, normalization (sociology), data set, environmental science, computer science, statistics, data mining, mathematics, machine learning, ecology, sociology, anthropology, biology
Analysis of multisite data sets is often limited by the prevalence of site‐specific phenomena or obscured by interactions among many variables. We outline two techniques for extracting characteristic hydrological and biogeochemical signals from large data sets using data normalization and nonparametric scatterplot analyses. Both techniques use data normalization to minimize the site‐specific signal on hydrological or biogeochemical variables, allowing many localities to be analyzed together. Nonetheless, normalized data are often noisy, masking characteristic hydrological and biogeochemical signals. We employed nonparametric scatterplot smoothing and thinning techniques to extract signals from normalized data. To illustrate this approach, we applied these techniques to a data set for stream chemistry and discharge consisting of 57 minimally impacted watersheds from the contiguous United States. Using the entire data set, our analyses showed characteristic seasonal trends of stream discharge (Q) and total nitrogen (TN) concentration. The influence of Q on TN was evaluated with scatterplot thinning. Subsets of the data, sorted by watershed area and mean annual precipitation, were analyzed with smoothing and thinning techniques, demonstrating characteristic dynamics in watershed classes. Overall, these data analysis techniques uncovered some trends that were intuitive and others that were not. The techniques are useful for synthesizing large watershed data sets and identifying general trends of watershed variables at regional scales, which can be used in concert with other literature or data synthesis methods to describe fundamental watershed processes.
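
The general idea of normalizing each site and then smoothing the pooled scatterplot can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say which normalization or smoother was used, so per-site z-scores and a binned median are assumptions made here for illustration. The function names `normalize_by_site` and `binned_median` and the `(site, day_of_year, value)` record layout are likewise hypothetical. Scatterplot thinning is not shown.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def normalize_by_site(records):
    """Return pooled (day_of_year, z_score) pairs from a list of
    (site, day_of_year, value) tuples.

    Assumption: z-scoring within each site stands in for the paper's
    normalization, which the abstract does not specify.
    """
    by_site = defaultdict(list)
    for site, _, value in records:
        by_site[site].append(value)

    # Per-site mean and population standard deviation.
    stats = {site: (mean(vals), pstdev(vals)) for site, vals in by_site.items()}

    pooled = []
    for site, day, value in records:
        mu, sd = stats[site]
        if sd == 0:
            continue  # a constant series carries no within-site signal
        pooled.append((day, (value - mu) / sd))
    return pooled


def binned_median(points, width=30):
    """Smooth pooled (x, y) points by taking the median of y in
    consecutive x-bins of the given width (x is a 1-based day of year).

    Assumption: a binned median is one simple nonparametric smoother,
    used here in place of whatever smoother the paper applied.
    Returns a list of (bin_start_day, median_y), sorted by day.
    """
    bins = defaultdict(list)
    for x, y in points:
        bins[int((x - 1) // width)].append(y)
    return [(b * width + 1, median(ys)) for b, ys in sorted(bins.items())]
```

With two hypothetical sites whose values differ tenfold in magnitude but share the same seasonal shape, the z-scores coincide after normalization, so the pooled median curve reflects the shared seasonal pattern rather than the difference between sites.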