
A Comparison of Bagging and Boosting on Classification Data: Case Study on Rainfall Data in Sultan Syarif Kasim II Meteorological Station in Pekanbaru
Author(s) - Awais Adnan, A M Yolanda, F Natasya
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2049/1/012053
Subject(s) - boosting (machine learning) , gradient boosting , random forest , wind speed , decision tree , computer science , meteorology , machine learning , statistics , mathematics , artificial intelligence , geography
A common approach to classifying data is to combine a machine learning algorithm with an ensemble method such as bagging or boosting. Earlier studies have shown both methods to be highly accurate. This research compares the performance of bagging and boosting in classifying rainfall data recorded at the Sultan Syarif Kasim II Meteorological Station in Pekanbaru from 1 January 2018 to 31 July 2021. Each observation is classified as rainy or non-rainy. The predictor variables are average temperature, average humidity, sunshine duration, wind direction at maximum speed, and average wind speed. For the comparison, the study fits Stochastic Gradient Boosting with Gradient Boosting Modelling and C5.0 as boosting methods, and Bagged Classification and Regression Tree (CART) and Random Forest as bagging methods. To obtain reliable conclusions, each algorithm is run 30 times with repeated cross-validation. The results show that Stochastic Gradient Boosting with Gradient Boosting Modelling is the best algorithm in terms of average accuracy.
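The evaluation protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and makes several assumptions: the 30 runs are taken to be 10 folds repeated 3 times (the abstract gives only the total), the data are synthetic rather than the Pekanbaru rainfall records, bagging is represented by bootstrap-aggregated decision stumps, and boosting by AdaBoost over stumps rather than the gradient boosting or C5.0 models used in the study. All function names are hypothetical.

```python
import math
import random
from statistics import mean

def repeated_kfold(n, k, repeats, seed=0):
    """Yield (train, test) index lists: k folds, reshuffled on every repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train, test

def fit_stump(X, y, w=None):
    """Fit a one-split tree minimising weighted error; return a predictor."""
    w = w or [1.0] * len(X)
    best = None
    for f in range(len(X[0])):
        for t in sorted(set(row[f] for row in X)):
            for hi in (0, 1):  # class predicted when the feature exceeds t
                err = sum(wi for row, lab, wi in zip(X, y, w)
                          if (hi if row[f] > t else 1 - hi) != lab)
                if best is None or err < best[0]:
                    best = (err, f, t, hi)
    _, f, t, hi = best
    return lambda row: hi if row[f] > t else 1 - hi

def fit_bagged_stumps(X, y, n_models=9, seed=0):
    """Bagging: fit each stump on a bootstrap resample, then majority vote."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        sample = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in sample], [y[i] for i in sample]))
    return lambda row: int(2 * sum(m(row) for m in models) > len(models))

def fit_adaboost_stumps(X, y, rounds=8):
    """Boosting (AdaBoost): reweight misclassified rows after every round."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        miss = [stump(row) != lab for row, lab in zip(X, y)]
        err = sum(wi for wi, m in zip(w, miss) if m)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w = [wi * math.exp(alpha if m else -alpha) for wi, m in zip(w, miss)]
        total = sum(w)
        w = [wi / total for wi in w]
    return lambda row: int(sum(a * (1 if s(row) == 1 else -1)
                               for a, s in ensemble) > 0)

def cv_accuracy(fit, X, y, k=10, repeats=3, seed=0):
    """Return one accuracy per resample (k * repeats values in total)."""
    scores = []
    for train, test in repeated_kfold(len(X), k, repeats, seed):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(mean(predict(X[i]) == y[i] for i in test))
    return scores

# Synthetic stand-in data: 5 numeric predictors, binary label (1 = "rainy").
rng = random.Random(42)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [int(row[0] + 0.5 * row[1] + rng.gauss(0, 0.5) > 0) for row in X]

results = {
    "bagged stumps": cv_accuracy(fit_bagged_stumps, X, y),
    "AdaBoost stumps": cv_accuracy(fit_adaboost_stumps, X, y),
}
for name, scores in results.items():
    print(f"{name}: mean accuracy {mean(scores):.3f} over {len(scores)} runs")
```

Using the same seed for both models means they are scored on identical train/test splits, so differences in average accuracy reflect the learners rather than the partitioning. The figures printed for the synthetic data say nothing about the rainfall results reported in the paper.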