WATERSHED CLASSIFICATION USING CANONICAL CORRESPONDENCE ANALYSIS AND CLUSTERING TECHNIQUES: A CAUTIONARY NOTE
Author(s) - Caratti John F., Nesser John A., Maynard C.
Publication year - 2004
Publication title - JAWRA Journal of the American Water Resources Association
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.957
H-Index - 105
eISSN - 1752-1688
pISSN - 1093-474X
DOI - 10.1111/j.1752-1688.2004.tb01584.x
Subject(s) - multivariate statistics, watershed, hierarchical clustering, cluster analysis, canonical correlation, environmental data, correspondence analysis, multivariate analysis, computer science, data mining, scale (ratio), variable (mathematics), statistics, mathematics, geography, machine learning, cartography, ecology, mathematical analysis, biology
Watershed classification using multivariate techniques requires the incorporation of continuous datasets representing controlling environmental variables. Often, out of convenience and availability rather than importance to the structure of the system being modeled, the environmental data used originate from a variety of sources and scales. To demonstrate the importance of appropriate environmental data selection, classifications of six‐digit hydrologic units (1:24,000) across selected geographic areas within the Interior Columbia River Basin were produced. Canonical correspondence analysis was used to select and test environmental variables important in predicting Rosgen stream types and valley bottom classes. Then, hierarchical agglomerative clustering was used to group (classify) watersheds based on these variables. Statistically significant results were derived from the use of organized classification data with presumed predictive relationships to watershed properties, and a random distribution of environmental variables from the same datasets provided similar results. The results contained herein demonstrate that these analysis techniques do not necessarily select meaningful variables from a broad spectrum of data and that significant results are easily generated from randomly associated data. It is suggested that classifications produced using these multivariate techniques, especially when using multi‐scale data or data of unknown significance, are subject to invalid inferences and should be used with caution.
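
The caution about significant results from randomly associated data can be illustrated with a small simulation. The sketch below is not the authors' procedure: it swaps canonical correspondence analysis for a much simpler Pearson correlation screen, and every name and number in it (`screen_random_variables`, 30 units, 200 candidate variables, the seed) is a hypothetical choice made for illustration. It draws a random response for a set of units, screens many unrelated random variables against it, and counts how many pass a conventional 5% significance threshold.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_random_variables(n_units=30, n_candidates=200, seed=0):
    """Return |r| between one random response and many random candidates.

    Every candidate is pure noise, so any apparent association with the
    response is an artifact of screening many variables at once.
    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    response = [rng.gauss(0, 1) for _ in range(n_units)]
    correlations = []
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_units)]
        correlations.append(abs(pearson_r(candidate, response)))
    return correlations


# Two-tailed 5% critical value of Pearson's r for n = 30 (28 degrees of freedom).
CRITICAL_R = 0.361

correlations = screen_random_variables()
best_r = max(correlations)
n_significant = sum(r > CRITICAL_R for r in correlations)

print(f"strongest |r| among noise variables: {best_r:.3f}")
print(f"noise variables passing the 5% threshold: {n_significant} of {len(correlations)}")
```

With a 5% threshold, roughly one in twenty pure-noise variables is expected to pass by chance, so a selection step that keeps only the "significant" variables can hand a clustering algorithm inputs that carry no real information about the units being classified.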
