Imputation Methods for Missing Data for a Proposed VASA Dataset
Author(s) -
Asst Professor
Publication year - 2019
Publication title - International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.a5204.119119
Subject(s) - missing data, imputation (statistics), data mining, computer science, mean squared error, data pre-processing, raw data, preprocessor, principal component analysis, statistics, pattern recognition (psychology), artificial intelligence, mathematics, machine learning
Preprocessing is the preparation of raw data before the actual statistical method is applied. Data preprocessing is one of the most vital steps in the data mining process; it deals with the preparation and transformation of the initial dataset. It matters because analyzing data that has not been properly preprocessed can lead to results that are inaccurate and meaningless. Almost every research study has missing data, which must be handled by some method during data analysis. Missing values have to be accounted for in order to provide an efficient and valid analysis. Missing-value imputation is one of the processes in data cleaning. Here, four imputation methods are compared: Mean, Singular Value Decomposition (SVD), K-Nearest Neighbors (KNN), and Bayesian Principal Component Analysis (BPCA). The comparison was performed on the real VASA dataset using the performance evaluation criteria Mean Square Error (MSE) and Root Mean Square Error (RMSE). Of the methods considered, BPCA performed best and deserves further consideration in practice.
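The kind of comparison described in the abstract can be sketched on toy data. The example below is a minimal illustration only, not the authors' procedure: it hides two known entries of a small made-up matrix (not the VASA dataset), fills them with mean imputation and with a simple KNN imputation, and scores both with MSE and RMSE on the hidden entries. The function names, the choice of k = 2, and the distance measure are all assumptions made for this sketch; SVD and BPCA imputation are omitted because they need considerably more machinery.

```python
import math

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        col_means.append(sum(observed) / len(observed))
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def knn_impute(rows, k=2):
    """Replace each None with the average of that column over the k nearest rows.

    Distance (an assumption of this sketch) is the root mean squared
    difference over the columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x - y) ** 2 for x, y in zip(a, b)
                  if x is not None and y is not None]
        return math.sqrt(sum(shared) / len(shared)) if shared else math.inf

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Only rows that have column j observed can donate a value.
            candidates = [(distance(row, other), other[j])
                          for m, other in enumerate(rows)
                          if m != i and other[j] is not None]
            candidates.sort(key=lambda pair: pair[0])
            nearest = [val for _, val in candidates[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def mse(truth, imputed, hidden):
    """Mean squared error over the hidden (row, column) positions only."""
    errors = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in hidden]
    return sum(errors) / len(errors)

def rmse(truth, imputed, hidden):
    """Root mean squared error over the hidden positions."""
    return math.sqrt(mse(truth, imputed, hidden))

# Toy complete data with two well-separated groups of rows (hypothetical values).
complete = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [10.0, 20.0, 30.0],
    [10.1, 20.1, 30.1],
    [9.9, 19.9, 29.9],
]

# Hide two known entries so the imputed values can be scored against the truth.
hidden = [(0, 2), (4, 1)]
observed = [list(r) for r in complete]
for i, j in hidden:
    observed[i][j] = None

mean_filled = mean_impute(observed)
knn_filled = knn_impute(observed, k=2)

mse_mean = mse(complete, mean_filled, hidden)
mse_knn = mse(complete, knn_filled, hidden)
rmse_mean = rmse(complete, mean_filled, hidden)
rmse_knn = rmse(complete, knn_filled, hidden)

print("Mean imputation: MSE =", mse_mean, "RMSE =", rmse_mean)
print("KNN imputation:  MSE =", mse_knn, "RMSE =", rmse_knn)
```

On this toy matrix KNN scores better than the column mean because the rows form two distinct groups; that outcome is a property of the made-up data and says nothing about the ranking reported for the VASA dataset.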
