
Performance appraisal of validation techniques in R
Author(s) -
M. Iqbal Jeelani Bhat,
Manish Sharma,
Khalid-ul Islam,
Rizwan Yousuf,
Zakir Hussain
Publication year - 2020
Publication title -
International Research Journal of Agricultural Economics and Statistics
Language(s) - English
Resource type - Journals
eISSN - 2231-6434
pISSN - 2229-7278
DOI - 10.15740/has/irjaes/11.2/260-268
Subject(s) - mean squared error, cross validation, model selection, statistics, mathematics, bayesian information criterion, computer science
In this article, various statistical models were fitted to simulated symmetric and asymmetric data. Model fitting was carried out with the help of libraries such as minpack.lm, matrices and nlme in RStudio (version 3.5.1, 2018), and selection criteria such as RMSE, MAE, AIC and BIC were used to assess the fitted models. To evaluate the different validation techniques, the simulated data were divided into training and testing sets, and several functions were developed in R for the purpose of validation. The coefficient summary revealed that all statistical models were statistically significant for both the symmetric and the asymmetric distributions. In the preliminary analysis, TFEM (Type First Exponential Model) was found to be the best linear model across the distributions, with lower values of RMSE, MAE, bias, AIC and BIC. Among the non-linear models, the Haung model was found to be the best across both distributions, as it had lower values of RMSE, MAE and the other criteria. The validation techniques used in the present study were half splitting, LOOCV (leave-one-out cross-validation) and 5-fold cross-validation. Based on the results of the evaluation, 5-fold cross-validation performed better than its counterparts, as it resulted in lower prediction error.
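
The three validation schemes named in the abstract can be illustrated with a minimal sketch. This is not the authors' R code: it is written in Python, uses a plain straight-line least-squares fit as a stand-in for the models in the study, and runs on hypothetical simulated data. The function names (`fit_line`, `k_fold_rmse`, `half_split_rmse`), the sample size, the seed and the noise level are all assumptions made for illustration only.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def rmse(errors):
    """Root mean squared error of a list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def half_split_rmse(xs, ys, rng):
    """Fit on a random half of the data, report RMSE on the other half."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    half = len(idx) // 2
    train, test = idx[:half], idx[half:]
    a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return rmse([ys[i] - (a + b * xs[i]) for i in test])


def k_fold_rmse(xs, ys, k, rng):
    """k-fold cross-validation: each fold is held out once and predicted
    from a model fitted on the remaining folds; errors are pooled."""
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.extend(ys[i] - (a + b * xs[i]) for i in fold)
    return rmse(errors)


# Hypothetical simulated data (assumption): linear signal plus Gaussian noise.
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(60)]
ys = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in xs]

results = {
    "half_split": half_split_rmse(xs, ys, rng),
    "five_fold": k_fold_rmse(xs, ys, 5, rng),
    "loocv": k_fold_rmse(xs, ys, len(xs), rng),  # LOOCV is k-fold with k = n
}
for name, value in results.items():
    print(f"{name}: RMSE = {value:.3f}")
```

In this sketch LOOCV is treated as the special case of k-fold cross-validation in which k equals the number of observations. A single run on one simulated data set says nothing about which scheme is preferable in general; the comparison reported in the abstract rests on the authors' own models and simulations.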