Sewer Deterioration Modeling: The Effect of Training a Random Forest Model on Logically Selected Data-groups

Bolette D. Hansen; S. H. Rasmussen; Thomas B. Moeslund; Mads Uggerby; David G. Jensen

Open Access

Publication year - 2020

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2020.08.031

Subject(s) - sanitary sewer, computer science, set (abstract data type), scope (computer science), training (meteorology), data set, training set, position (finance), artificial intelligence, environmental science, environmental engineering, physics, finance, meteorology, economics, programming language

The breakdown of sewers can cause significant damage to the roads and buildings above them. For this reason, timely maintenance of the sewer system is essential. However, because sewers lie underground, they are very expensive to monitor: their condition is assessed by CCTV inspection. It is therefore important to choose the right sewers for inspection, and several decision-support tools have been developed to help operators select which sewers to inspect. These decision-support tools all contain a model that predicts the condition of the sewers, and several models have recently been proposed to improve prediction performance. The scope of this paper is to investigate the effect of training a Random Forest model on logically selected groups of data, as opposed to training a single joint model on the full data set. The data groups were selected based on expert knowledge. The first groups were defined by sewer material (concrete, plastic, clay, sewers reinforced with lining, and other materials). The concrete data set was then further subdivided by wastewater type (sewage, rain, and combined), whereas the plastic data set was subdivided by road class. The results showed that the model trained on the full data set performed better than the models trained on the logically selected data groups, because it is exposed to the heterogeneity of the whole data set. Furthermore, this answers an important question raised by end users of the deterioration models.
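The comparison described in the abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, a synthetic data set invented here (the features `material`, `age`, and `diameter`, the rule that generates the labels, and the binary "poor condition" target are all hypothetical), and balanced accuracy as the metric, since the abstract does not say which metric the paper uses. It trains one Random Forest on all pipes and one per material group, then scores both on the same held-out pipes.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split


def make_synthetic_pipes(n=1500, seed=0):
    """Build a synthetic stand-in for a sewer inventory (NOT the paper's data)."""
    rng = np.random.default_rng(seed)
    material = rng.integers(0, 3, size=n)      # hypothetical codes: 0=concrete, 1=plastic, 2=clay
    age = rng.uniform(0, 100, size=n)          # years
    diameter = rng.uniform(100, 1000, size=n)  # millimetres
    # Made-up labelling rule: older pipes are more often in poor condition,
    # with an extra shift for one material and some noise.
    score = age / 100 + 0.15 * (material == 0) + rng.normal(0, 0.2, size=n)
    poor = (score > 0.6).astype(int)
    X = np.column_stack([material, age, diameter])
    return X, poor, material


def compare_joint_vs_grouped(X, y, groups, seed=0):
    """Score a joint model and per-group models on the same test pipes."""
    X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
        X, y, groups, test_size=0.3, random_state=seed, stratify=groups
    )

    # Joint model: one forest trained on every training pipe.
    joint = RandomForestClassifier(n_estimators=100, random_state=seed)
    joint.fit(X_tr, y_tr)
    joint_pred = joint.predict(X_te)

    # Grouped models: one forest per group, each predicting only its own test pipes.
    grouped_pred = np.empty_like(y_te)
    for g in np.unique(g_tr):
        model = RandomForestClassifier(n_estimators=100, random_state=seed)
        model.fit(X_tr[g_tr == g], y_tr[g_tr == g])
        grouped_pred[g_te == g] = model.predict(X_te[g_te == g])

    return {
        "joint": balanced_accuracy_score(y_te, joint_pred),
        "grouped": balanced_accuracy_score(y_te, grouped_pred),
    }


if __name__ == "__main__":
    X, y, material = make_synthetic_pipes()
    print(compare_joint_vs_grouped(X, y, material))
```

Because the data are synthetic, the two scores say nothing about real sewers. The sketch only shows how such a comparison can be set up so that both approaches are evaluated on identical test pipes.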
