Open Access
Analysis of the Effect of Data Scaling on the Performance of the Machine Learning Algorithm for Plant Identification
Author(s) - Agus Ambarwari, Qadhli Jafar Adrian, Yeni Herdiyeni
Publication year - 2020
Publication title - Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
Language(s) - English
Resource type - Journals
ISSN - 2580-0760
DOI - 10.29207/resti.v4i1.1517
Subject(s) - normalization (sociology), support vector machine, computer science, artificial intelligence, algorithm, database normalization, preprocessor, standardization, machine learning, data pre processing, naive bayes classifier, data mining, pattern recognition (psychology), sociology, anthropology, operating system
Data scaling is an important preprocessing step that affects the performance of machine learning algorithms. This study analyzes the effect of min-max normalization and standardization (zero-mean normalization) on the performance of machine learning algorithms. The study first normalized a dataset of leaf venation features. The normalized datasets were then tested on four machine learning algorithms: KNN, Naïve Bayesian, ANN, and SVM with RBF and linear kernels. The analysis was based on model evaluation using 10-fold cross-validation and on validation using test data. The results show that Naïve Bayesian has the most stable performance under both min-max normalization and standardization. The KNN algorithm is fairly stable compared to SVM and ANN. However, the combination of min-max normalization with SVM using the RBF kernel can provide the best performance. SVM with a linear kernel, on the other hand, performs best when standardization (zero-mean normalization) is applied. For the ANN algorithm, a number of trials are needed to find out which normalization technique suits it best.
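The two scaling techniques compared in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions, x' = (x - min) / (max - min) for min-max normalization and z = (x - mean) / std for standardization. The abstract does not give the authors' implementation, so the function names, the toy feature values, and the choice of the population standard deviation are all assumptions made for illustration. The scaling parameters are learned from the training column only and then reused on the test column, which is one common way to handle a separate test set.

```python
from statistics import fmean, pstdev


def fit_min_max(column):
    """Learn the minimum and maximum of a training feature column."""
    return min(column), max(column)


def apply_min_max(column, lo, hi):
    """Min-max normalization: x' = (x - min) / (max - min)."""
    span = hi - lo
    if span == 0:
        # Constant feature: there is no spread to rescale, so map to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / span for x in column]


def fit_standardize(column):
    """Learn the mean and population standard deviation (an assumption)."""
    return fmean(column), pstdev(column)


def apply_standardize(column, mean, std):
    """Standardization (zero-mean normalization): z = (x - mean) / std."""
    if std == 0:
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Hypothetical values for one feature; not taken from the leaf venation data.
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 10.0]

# Parameters come from the training column only.
lo, hi = fit_min_max(train)
mean, std = fit_standardize(train)

train_minmax = apply_min_max(train, lo, hi)
test_minmax = apply_min_max(test, lo, hi)
train_std = apply_standardize(train, mean, std)
test_std = apply_standardize(test, mean, std)

print("min-max  train:", train_minmax, "test:", test_minmax)
print("z-score  train:", train_std, "test:", test_std)
```

After min-max normalization the training values lie in [0, 1], while a test value above the training maximum maps to a value greater than 1. After standardization the training column has mean 0 and standard deviation 1. Distance-based and kernel-based methods such as KNN and SVM operate directly on these rescaled values, which is one reason the choice of scaling can change their results.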
