Open Access
Optimizing random forest classifier with Jenesis-index on an imbalanced dataset
Author(s) -
Joylin Zeffora,
Shobarani Shobarani
Publication year - 2022
Publication title -
Indonesian Journal of Electrical Engineering and Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.241
H-Index - 17
eISSN - 2502-4760
pISSN - 2502-4752
DOI - 10.11591/ijeecs.v26.i1.pp505-511
Subject(s) - random forest, gini coefficient, classifier (uml), index (typography), data mining, decision tree, measure (data warehouse), computer science, statistics, ensemble learning, artificial intelligence, feature (linguistics), mathematics, machine learning, inequality, economic inequality, mathematical analysis, world wide web, linguistics, philosophy
Random forest is an ensemble algorithm for machine learning. In decision trees, the splitting criterion is built on the prediction of the nodal points and the formation of rules using the Gini index and information gain. The Gini index is a measure of inequality. It does not take into consideration structural changes in the dataset, and inaccurate data can distort the validity of the Gini coefficient. For data with the same feature but different outcomes, the Gini coefficient remains the same. The proposed attribute selection measure takes into consideration that there may be structural changes in the dataset over time; it adapts to such expected changes and maintains the accuracy of the algorithm, avoiding under-fitting and over-fitting. A dataset on myocardial infarctions was used for the study, and the results were promising.
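
To make the two conventional splitting criteria named above concrete, the following is a minimal sketch of how Gini impurity and information gain might score a candidate split in a decision tree. It does not implement the proposed Jenesis-index, which the abstract does not define. The function names (`gini_impurity`, `entropy`, `split_score`) and the toy labels are hypothetical, chosen only for illustration, and the formulas are the standard CART-style definitions rather than anything taken from the paper.

```python
from collections import Counter
from math import log2


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def split_score(parent, left, right, impurity):
    """Impurity decrease of a binary split, weighting children by size.

    With impurity=entropy this is the information gain; with
    impurity=gini_impurity it is the Gini decrease.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical imbalanced toy node: 8 negatives, 2 positives.
parent = [0] * 8 + [1] * 2
left = [0] * 7          # pure child
right = [0, 1, 1]       # mixed child

gini_decrease = split_score(parent, left, right, gini_impurity)
info_gain = split_score(parent, left, right, entropy)
print(f"Gini decrease: {gini_decrease:.4f}, information gain: {info_gain:.4f}")
```

Both scores depend only on the class counts in each node, which is one way to read the abstract's remark that the Gini measure stays the same when those counts do not change.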