
The Adequacy Assessment of Test Sets in Machine Learning using Mutation Testing
Author(s) - Hyung Chul Yoon
Publication year - 2019
Publication title - International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.a1183.109119
Subject(s) - machine learning, computer science, artificial intelligence, set (abstract data type), test (biology), test set, training set, test data, data set, mutation, data mining, paleontology, biology, programming language, biochemistry, chemistry, gene
Accuracy is computed by applying the test dataset to a model that has been trained on the training dataset. The test dataset in machine learning is therefore expected to validate whether a trained model is sufficiently accurate for use. This study examines whether that expectation holds through the research question, “How adequate is the test dataset used in machine learning for validating models?” To answer this question, the study takes the seven most popular datasets registered in the UCI Machine Learning Repository and applies them to six different machine learning models. We conduct an empirical study of how adequate these test sets are for validating machine learning models. The testing adequacy for each model and each dataset is analyzed using mutation analysis.
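The idea behind mutation analysis of a test set can be illustrated with a minimal sketch. The abstract does not state which mutation operators or kill criterion the study uses, so everything below is an assumption made for illustration: the toy threshold classifier, the helper names (`make_threshold_model`, `is_killed`, `mutation_score`), the choice to create mutants by shifting the decision threshold, and the rule that a mutant is "killed" when at least one test input makes it disagree with the original model. The mutation score is the fraction of mutants killed; a test set that kills few mutants cannot distinguish the trained model from slightly faulty variants.

```python
from typing import Callable, Sequence

# Hypothetical toy setup: a model maps one numeric feature to a class label.
Model = Callable[[float], int]


def make_threshold_model(threshold: float) -> Model:
    """Toy classifier: predicts 1 when the feature is at or above the threshold."""
    return lambda x: 1 if x >= threshold else 0


def is_killed(original: Model, mutant: Model, test_inputs: Sequence[float]) -> bool:
    """Assumed kill criterion: some test input makes the mutant disagree with the original."""
    return any(original(x) != mutant(x) for x in test_inputs)


def mutation_score(original: Model, mutants: Sequence[Model],
                   test_inputs: Sequence[float]) -> float:
    """Fraction of mutants that the test set kills (0.0 to 1.0)."""
    killed = sum(is_killed(original, m, test_inputs) for m in mutants)
    return killed / len(mutants)


# Original model and four mutants with shifted decision thresholds (illustrative values).
original = make_threshold_model(0.5)
mutants = [make_threshold_model(t) for t in (0.3, 0.45, 0.55, 0.7)]

# Two candidate test sets: one far from the decision boundary, one probing near it.
sparse_test_set = [0.1, 0.9]
dense_test_set = [0.1, 0.35, 0.48, 0.52, 0.6, 0.9]

sparse_score = mutation_score(original, mutants, sparse_test_set)
dense_score = mutation_score(original, mutants, dense_test_set)

print(f"sparse test set mutation score: {sparse_score:.2f}")
print(f"dense test set mutation score:  {dense_score:.2f}")
```

In this sketch the sparse test set kills none of the mutants and the dense one kills all four, even though the original model classifies both test sets identically well. That gap is the kind of inadequacy a mutation score is meant to expose and that accuracy alone does not.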