A New Paradigm for Development of Data Imputation Approach for Missing Value Estimation

G. Madhu, G. Nagachandrika

Open Access

Publication year - 2016

Publication title - International Journal of Electrical and Computer Engineering (IJECE)

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.277

H-Index - 22

ISSN - 2088-8708

DOI - 10.11591/ijece.v6i6.pp3222-3228

Subject(s) - missing data, imputation (statistics), computer science, data mining, centroid, curse of dimensionality, cluster analysis, artificial intelligence, machine learning

Missing data values are a common problem in data analysis and arise in many real-world applications, such as wireless sensor networks, medical applications, and the psychological domain, among others. Learning and prediction in the presence of missing values can be treacherous in machine learning, data mining, and statistical analysis. A missing value can also signify important information about the dataset, so handling missing values is a challenging task in the data mining process. In this paper, we propose a new paradigm for developing a data imputation method that estimates missing values based on centroids and nearest neighbours. First, clusters are identified with the k-means algorithm, and the centroids and nearest-neighbour data records are calculated. Second, the distances from the centroids to the records of both the complete and the incomplete dataset are computed and the nearest data record is estimated, a step that tends to suffer from the curse of dimensionality. Finally, the missing value is imputed from the nearest-neighbour record using a statistical measure, the z-score. The experimental study demonstrates the strength of the proposed paradigm for missing value estimation. Tests were run on different types of datasets to validate the approach and to compare its results with other imputation methods such as KNNI, SVMI, WKNNI, KMI, and FKNNI. The proposed approach is geared towards maximizing the utility of imputation with respect to missing data value estimation.
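
The three steps in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the z-score enters the final step, so the sketch assumes z-scores are used to standardize each feature before distances are measured, and that the missing entries are copied from the nearest complete record in the nearest cluster. The function names `kmeans` and `impute_centroid_nn`, the parameter `k`, and the fixed random seed are all hypothetical choices made for illustration.

```python
import numpy as np


def _assign(Z, centroids):
    """Return the index of the nearest centroid for every row of Z."""
    dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on complete rows; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = _assign(Z, centroids)
        # An empty cluster keeps its previous centroid.
        updated = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, _assign(Z, centroids)


def impute_centroid_nn(X, k=2, seed=0):
    """Fill NaNs from the nearest complete record in the nearest cluster.

    Assumption: distances are measured on z-scored features, using only the
    features that are observed in the incomplete record.
    """
    X = np.asarray(X, dtype=float)
    complete_mask = ~np.isnan(X).any(axis=1)
    complete = X[complete_mask]

    # z-score statistics come from the complete records only.
    mu = complete.mean(axis=0)
    sigma = complete.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    Z = (complete - mu) / sigma

    # Step 1: cluster the complete records and obtain the centroids.
    centroids, labels = kmeans(Z, k, seed=seed)

    out = X.copy()
    for i in np.where(~complete_mask)[0]:
        obs = ~np.isnan(X[i])
        z_row = (X[i, obs] - mu[obs]) / sigma[obs]

        # Step 2: nearest centroid, then nearest complete record in it.
        c = np.linalg.norm(centroids[:, obs] - z_row, axis=1).argmin()
        members = np.where(labels == c)[0]
        if members.size == 0:  # fall back to all complete records
            members = np.arange(len(Z))
        d = np.linalg.norm(Z[members][:, obs] - z_row, axis=1)
        donor = members[d.argmin()]

        # Step 3: copy the donor's values into the missing positions.
        out[i, ~obs] = complete[donor, ~obs]
    return out
```

Restricting the search to one cluster keeps the neighbour lookup small, which is one plausible reading of why the centroids are computed first. Other readings of the z-score step are possible, for example transferring the donor's z-score rather than its raw value.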
