Outlier Reduction using Hybrid Approach in Data Mining | Zendy

Nancy Lekhi | Zendy; Manish Mahajan | Zendy

Open Access

Author(s) - Nancy Lekhi, Manish Mahajan

Publication year - 2015

Publication title - International Journal of Modern Education and Computer Science

Language(s) - English

Resource type - Journals

eISSN - 2075-017X

pISSN - 2075-0161

DOI - 10.5815/ijmecs.2015.05.06

Subject(s) - outlier, computer science, anomaly detection, data mining, cluster analysis, artificial neural network, pattern recognition (psychology), k means clustering, data set, set (abstract data type), reduction (mathematics), artificial intelligence, cluster (spacecraft), mathematics, geometry, programming language

Outlier detection is a very active area of research in data mining, where an outlier is a data point that does not match the rest of the data in a dataset. Existing approaches perform outlier detection only on numeric datasets. Clustering-based methods mainly treat the elements lying outside the clusters as outliers, but some anomalous elements may, for whatever reason, end up inside a cluster, so those elements need attention as well. The proposed method uses a hybrid approach to reduce the number of outliers, on the premise that the number of outliers can only be reduced by improving how clusters are formed. It combines two data mining techniques for cluster formation: weighted k-means and a neural network. Weighted k-means is a clustering technique that can be applied to text and date datasets as well as numeric ones, and it assigns a weight to each element in the dataset. The output of weighted k-means becomes the input to the neural network, which serves as both a classification and a clustering technique. The network is first trained and then used for testing: it checks the clusters formed by weighted k-means to ensure that their elements are grouped correctly. Many outlier detection methods exist in data mining. The proposed method uses Integrating Semantic Knowledge (SOF) for outlier detection. This method detects semantic outliers, where a semantic outlier is a data point that behaves differently from the other data points in the same class or cluster. The main aim of this research is to reduce the number of outliers by improving cluster formation, so that the outlier rate falls, the mean square error decreases, and accuracy improves. Simulation results show that the proposed method significantly reduces the number of outliers.

Index Terms: Data Mining, Clustering, Weighted K-means, Neural Network, Outlier, SOF
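
The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it describes: k-means in which each element carries a weight, and flagging a point that differs from the other points in its cluster. The function names (`weighted_kmeans`, `semantic_outliers`), the toy data, and the majority-class check are all assumptions made for illustration. The majority-class check is a simplified stand-in, not the SOF measure the paper uses, and the neural network verification step is not reproduced here.

```python
from collections import Counter


def weighted_kmeans(points, weights, centroids, iterations=20):
    """Lloyd-style k-means where each point pulls its centroid in proportion to its weight."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid becomes the weighted mean of its members.
        for j in range(len(centroids)):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == j]
            total = sum(w for _, w in members)
            if total > 0:  # keep the old centroid if the cluster is empty
                centroids[j] = [
                    sum(w * p[d] for p, w in members) / total
                    for d in range(len(centroids[j]))
                ]
    return labels, centroids


def semantic_outliers(labels, classes):
    """Indices of points whose class differs from the majority class of their cluster.

    Illustrative assumption only: this is not the paper's SOF computation.
    """
    majority = {}
    for cluster in set(labels):
        in_cluster = [c for l, c in zip(labels, classes) if l == cluster]
        majority[cluster] = Counter(in_cluster).most_common(1)[0][0]
    return [i for i, (l, c) in enumerate(zip(labels, classes)) if c != majority[l]]


# Hypothetical toy data: two well-separated groups, one point with a mismatched class.
points = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
weights = [1.0, 1.0, 2.0, 1.0, 1.0, 2.0]
classes = ["a", "a", "b", "b", "b", "b"]

labels, centroids = weighted_kmeans(points, weights, centroids=[points[0], points[3]])
flagged = semantic_outliers(labels, classes)
```

In this toy run the third point sits inside the first cluster but carries a different class from its neighbours, so it is the one flagged. This mirrors the abstract's concern that an anomalous element can become part of a cluster rather than lying outside it.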
