Open Access
Derivation of effective and efficient data set with subtractive clustering method and genetic algorithm
Author(s) - Chi Dung Doan, S. Y. Liong, Dulakshi S. K. Karunasinghe
Publication year - 2005
Publication title - Journal of Hydroinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.654
H-Index - 50
eISSN - 1465-1734
pISSN - 1464-7141
DOI - 10.2166/hydro.2005.0020
Subject(s) - outlier, data mining, algorithm, cluster analysis, data set, set (abstract data type), computer science, chaotic, artificial neural network, raw data, fuzzy logic, series (stratigraphy), genetic algorithm, time series, artificial intelligence, machine learning, paleontology, biology, programming language
The success of any forecasting model depends heavily on, among other things, reliable historical data. Data are needed to calibrate, fine-tune and verify any simulation model. However, data are very often contaminated with noise of different levels originating from different sources. This study proposes a scheme that extracts the most representative data from a raw data set, using the Subtractive Clustering Method (SCM) and a Micro Genetic Algorithm (mGA). SCM (a) removes outliers and (b) discards unnecessary or superfluous points, while mGA, acting as the search engine, determines the optimal values of SCM's parameter set. The scheme is demonstrated in (1) Bangladesh water level forecasting with Neural Network and Fuzzy Logic models and (2) forecasting of two chaotic river flow series (Wabash River at Mt. Carmel and Mississippi River at Vicksburg) with the phase space prediction method. The scheme significantly reduces the data set, and forecasting models trained on the reduced set yield prediction accuracy equal to or higher than that of models trained with the whole original data set. The resulting fuzzy logic model, for example, has fewer rules, which makes it easier for humans to interpret. In phase space prediction of chaotic time series, which is known to require a long data record, a data reduction of up to 40% has almost no effect on prediction accuracy.
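
The abstract does not spell out how SCM is applied, so the following is only a minimal sketch of textbook subtractive clustering, not the authors' implementation. It assumes inputs scaled to [0, 1]. The function name and the parameters r_a, squash and stop_ratio are illustrative, and the single-ratio stopping rule is a simplification. Each point receives a "potential" that grows with the number of close neighbours. The point with the highest potential becomes a centre, and its influence is then subtracted from the rest, so that isolated points (candidate outliers) and near-duplicates of an existing centre are never selected. Parameters such as r_a are the kind of values a search method like mGA could tune.

```python
import math

def subtractive_clustering(points, r_a=0.5, squash=1.5, stop_ratio=0.15):
    """Return indices of points chosen as cluster centres.

    points     : list of equal-length coordinate tuples, assumed scaled to [0, 1]
    r_a        : neighbourhood radius used when computing potentials
    squash     : r_b = squash * r_a is the radius of the subtraction step
    stop_ratio : stop once the best remaining potential drops below
                 stop_ratio * (potential of the first centre); simplified rule
    """
    if not points:
        return []

    alpha = 4.0 / r_a ** 2
    beta = 4.0 / (squash * r_a) ** 2

    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Potential of each point: large when many neighbours lie within about r_a.
    potential = [
        sum(math.exp(-alpha * sqdist(p, q)) for q in points) for p in points
    ]

    centres = []
    first_peak = max(potential)
    while True:
        k = max(range(len(points)), key=potential.__getitem__)
        peak = potential[k]
        if peak < stop_ratio * first_peak:
            break
        centres.append(k)
        # Remove the new centre's influence so that its neighbours
        # (and the centre itself) are not selected again.
        for i, p in enumerate(points):
            potential[i] -= peak * math.exp(-beta * sqdist(p, points[k]))
    return centres
```

With two dense groups of points and one isolated point, this sketch is expected to return one centre per group and to skip the isolated point, because its potential stays below the stopping threshold.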
