A data‐driven approach to develop physically sound predictors: Application to depth‐averaged velocities on flows through submerged arrays of rigid cylinders
Author(s) - Tinoco R. O., Goldstein E. B., Coco G.
Publication year - 2015
Publication title - Water Resources Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.863
H-Index - 217
eISSN - 1944-7973
pISSN - 0043-1397
DOI - 10.1002/2014wr016380
Subject(s) - range (aeronautics) , process (computing) , computer science , set (abstract data type) , flow (mathematics) , channel (broadcasting) , simple (philosophy) , machine learning , artificial intelligence , data set , genetic programming , data mining , mathematics , engineering , geometry , computer network , philosophy , epistemology , programming language , aerospace engineering , operating system
We use a machine learning approach to seek an accurate, physically sound predictor of the mean velocity in open‐channel flow through submerged arrays of rigid cylinders (model vegetation). A genetic programming (GP) routine is used to find a robust relationship between relevant properties of the model vegetation and flow parameters. We use published data from laboratory experiments covering a broad range of conditions to obtain an equation that matches the accuracy of other predictors from recent literature while having a less complex structure. We also investigate how different criteria for data selection, as well as the size of the data set used to train the algorithm, influence the accuracy of the resulting predictors. Our results show that proper use of machine‐learning techniques not only provides empirical correlations but can also yield physically sound models that represent the physical processes involved. We provide a clear, thorough example of the application of GP, including its advantages and shortcomings, to encourage the use of data‐driven techniques as part of the data analysis process and to address common misconceptions of machine learning as simple correlation techniques or physically senseless statistical analysis.
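
To illustrate what a genetic programming routine for finding a predictor can look like, the sketch below evolves small expression trees against a data set and penalizes formula size so that simpler equations are preferred. It is a minimal, generic example and not the routine used by the authors: the function set, the parsimony weight, the population settings, and the synthetic target `x0 * x1 + x0` are all assumptions chosen for illustration, and the inputs are not the vegetation or flow parameters from the paper.

```python
import math
import random

# Hypothetical function set (an assumption): name -> (arity, implementation).
OPS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "div": (2, lambda a, b: a / b if abs(b) > 1e-9 else 1.0),  # protected division
}

def random_tree(rng, n_vars, depth):
    """Grow a random expression tree; leaves are input variables or constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("var", rng.randrange(n_vars))
        return ("const", rng.uniform(-2.0, 2.0))
    name = rng.choice(sorted(OPS))
    return (name, random_tree(rng, n_vars, depth - 1), random_tree(rng, n_vars, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree on one input row x."""
    if tree[0] == "var":
        return x[tree[1]]
    if tree[0] == "const":
        return tree[1]
    return OPS[tree[0]][1](evaluate(tree[1], x), evaluate(tree[2], x))

def size(tree):
    """Number of nodes, used as a simple measure of formula complexity."""
    return 1 if tree[0] not in OPS else 1 + size(tree[1]) + size(tree[2])

def fitness(tree, X, y, parsimony=0.01):
    """Mean squared error plus a size penalty; lower is better."""
    try:
        mse = sum((evaluate(tree, x) - t) ** 2 for x, t in zip(X, y)) / len(y)
    except OverflowError:
        return math.inf
    return math.inf if math.isnan(mse) else mse + parsimony * size(tree)

def paths(tree, path=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield path
    if tree[0] in OPS:
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Graft a random subtree of b onto a random position in a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def mutate(rng, tree, n_vars):
    """Replace a random subtree with a freshly grown one."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, n_vars, 2))

def evolve(X, y, n_vars, pop_size=200, generations=30, max_size=31, seed=0):
    """Elitist GP loop: keep the best fifth, refill with crossover and mutation."""
    rng = random.Random(seed)
    pop = [random_tree(rng, n_vars, 3) for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=lambda t: fitness(t, X, y))[: pop_size // 5]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.choice(elite), rng.choice(elite)
            child = crossover(rng, a, b) if rng.random() < 0.8 else mutate(rng, a, n_vars)
            if size(child) <= max_size:  # discard oversized (bloated) formulas
                children.append(child)
        pop = elite + children
    return min(pop, key=lambda t: fitness(t, X, y))

if __name__ == "__main__":
    # Synthetic data only (an assumption); not the laboratory data from the paper.
    rng = random.Random(42)
    X = [[rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)] for _ in range(40)]
    y = [x0 * x1 + x0 for x0, x1 in X]
    best = evolve(X, y, n_vars=2)
    print(best, fitness(best, X, y))
```

The size penalty in `fitness` is one simple way to trade accuracy against complexity; in practice, candidate formulas would still need to be checked by hand for dimensional and physical consistency before being treated as physically sound.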