Data management of river water quality data: A semi‐automatic procedure for data validation
Author(s) - Clement L., Thas O., Ottoy J. P., Vanrolleghem P. A.
Publication year - 2007
Publication title - Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1029/2006wr005187
Subject(s) - univariate , outlier , multivariate statistics , data mining , anomaly detection , data quality , computer science , statistics , mathematics , artificial intelligence , engineering , metric (unit) , operations management
Monitoring networks typically generate large amounts of data. Before the data can be added to the database, they have to be validated. In this paper, a semi‐automatic procedure is presented to validate river water quality data. On the basis of historical data, additive models are established to predict new observations and to construct prediction intervals (PIs). A new observation is accepted if it lies within the interval. The coverage of the prediction intervals and their power to detect anomalous data are assessed in a simulation study. The method is illustrated with two case studies in which it detected abnormal nitrate concentrations in the water body, provoked by a dry summer that was followed by an extreme winter period. The case studies also show that, like classical multivariate outlier detection tools, the semi‐automatic procedure detects suspicious observations lying at the edges as well as at the center of the univariate distribution of the observations, but without having to impose the linear relationships typically associated with these classical methods.
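
The accept/reject rule described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: a least-squares fit with one annual harmonic stands in for the additive model, the interval uses the normal quantile 1.96 as an approximation to a 95% prediction interval, and the "historical" series is synthetic. All function names (`fit_seasonal_model`, `prediction_interval`, `validate`) are hypothetical.

```python
import numpy as np


def design_matrix(day):
    """Intercept plus one annual harmonic (assumed stand-in for a smooth seasonal term)."""
    w = 2.0 * np.pi * np.asarray(day, dtype=float) / 365.25
    return np.column_stack([np.ones_like(w), np.sin(w), np.cos(w)])


def fit_seasonal_model(day, y):
    """Fit the stand-in model to historical data by ordinary least squares."""
    X = design_matrix(day)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = float(resid @ resid) / (len(y) - X.shape[1])  # residual variance
    xtx_inv = np.linalg.inv(X.T @ X)
    return beta, sigma2, xtx_inv


def prediction_interval(model, day, z=1.96):
    """Approximate 95% prediction interval for a new observation on `day`."""
    beta, sigma2, xtx_inv = model
    x = design_matrix([day])[0]
    pred = float(x @ beta)
    # Variance of a new observation = noise variance + variance of the fitted mean.
    se = float(np.sqrt(sigma2 * (1.0 + x @ xtx_inv @ x)))
    return pred - z * se, pred + z * se


def validate(model, day, value, z=1.96):
    """Accept the new observation only if it lies inside the prediction interval."""
    lo, hi = prediction_interval(model, day, z)
    return bool(lo <= value <= hi)


# Synthetic "historical" series (made up for illustration): weekly samples over
# three years with a seasonal cycle plus Gaussian noise.
rng = np.random.default_rng(0)
days = np.arange(0, 3 * 365, 7)
history = 5.0 + 2.0 * np.sin(2.0 * np.pi * days / 365.25) + rng.normal(0.0, 0.3, days.size)

model = fit_seasonal_model(days, history)
expected = 5.0 + 2.0 * np.sin(2.0 * np.pi * 100 / 365.25)

print(prediction_interval(model, 100))
print(validate(model, 100, expected))  # value at the seasonal mean
print(validate(model, 100, 15.0))      # value far above the seasonal pattern
```

Because the interval follows the seasonal prediction rather than the overall range of the data, a value that is unremarkable in the univariate distribution can still be flagged if it occurs at the wrong time of year, which is the behaviour the abstract attributes to the procedure.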