
Combination of ADASYN-N and Random Forest in Predicting of Obesity Status in Indonesia: A Case Study of Indonesian Basic Health Research 2013
Author(s) -
Muhammad Aqsha,
SA Thamrin,
Armin Lawi
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2123/1/012039
Subject(s) - random forest , obesity , indonesian , computer science , medicine , statistics , mathematics , environmental health , machine learning , linguistics , philosophy
Obesity is a pathological condition caused by the accumulation of fat in excess of what is needed for body functions. A person's obesity status is related to a set of risk factors. Various machine learning approaches offer an alternative for predicting obesity status. However, in most cases the available datasets are not sufficiently balanced across classes, and this imbalance can make the prediction results inaccurate. The purpose of this paper is to overcome the class imbalance problem and predict obesity status using the 2013 Indonesian Basic Health Research (RISKESDAS) data. Adaptive Synthetic Nominal (ADASYN-N) is used to balance the obesity status data, and the balanced data is then classified using a machine learning approach, namely Random Forest. The results show that ADASYN-N with a balance level parameter of 1 (β = 100%) after synthetic data generation, combined with a Random Forest of 200 trees and 7 variables as risk factors, gives a good classification of obesity status, as indicated by an AUC value of 84.41%.
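The balancing step can be illustrated with a minimal sketch of ADASYN-style oversampling for nominal features. The abstract does not give the exact ADASYN-N procedure, so several choices here are assumptions: Hamming distance as the nominal distance, proportional sampling of seed rows in place of per-row rounding, and a per-feature majority vote to build each synthetic row. The function name `adasyn_nominal`, its parameters, and the toy records are hypothetical and are not RISKESDAS data.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions at which two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def adasyn_nominal(X, y, minority, beta=1.0, k=5, seed=0):
    """Return synthetic minority rows (ADASYN-style sketch, nominal features)."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    # Total number of synthetic rows; beta = 1 means fully balanced classes.
    total_new = int(round((n_maj - n_min) * beta))
    if total_new <= 0 or n_min == 0:
        return []

    # r_i: share of majority-class rows among the k nearest neighbours of row i.
    ratios = []
    for i in minority_idx:
        others = sorted(
            (j for j in range(len(X)) if j != i),
            key=lambda j: hamming(X[i], X[j]),
        )[:k]
        ratios.append(sum(y[j] != minority for j in others) / len(others))
    total = sum(ratios)
    # Fall back to uniform weights if no minority row has majority neighbours.
    weights = [r / total for r in ratios] if total > 0 else [1 / n_min] * n_min

    # Assumption: draw seed rows in proportion to the weights, so harder
    # (more majority-surrounded) rows are used more often.
    seeds = rng.choices(minority_idx, weights=weights, k=total_new)
    synthetic = []
    for i in seeds:
        neighbours = sorted(
            (j for j in minority_idx if j != i),
            key=lambda j: hamming(X[i], X[j]),
        )[:k]
        pool = [X[i]] + [X[j] for j in neighbours]
        # Each feature takes the most common value among the seed row and
        # its minority neighbours (ties go to the seed row's value).
        synthetic.append(
            tuple(Counter(column).most_common(1)[0][0] for column in zip(*pool))
        )
    return synthetic


# Hypothetical toy records: (sex, residence, activity level); label 1 is the minority.
X = [
    ("male", "urban", "high"), ("female", "urban", "high"),
    ("male", "rural", "low"), ("female", "rural", "low"),
    ("male", "urban", "low"), ("female", "rural", "high"),
    ("male", "rural", "high"), ("female", "urban", "low"),
]
y = [1, 1, 0, 0, 0, 0, 0, 0]
new_rows = adasyn_nominal(X, y, minority=1, beta=1.0, k=3)
print(len(new_rows), "synthetic minority rows generated")
```

With β = 1 the sketch generates as many synthetic rows as the gap between the majority and minority class sizes, which corresponds to the fully balanced setting reported in the abstract. The balanced data would then be passed to a Random Forest classifier, a step not shown here.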