Comparing K-Value Estimation for Categorical and Numeric Data Clustering

K. Arunprabha, V. Bhuvaneswari

Open Access

Author(s) - K. Arunprabha, V. Bhuvaneswari

Publication year - 2010

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/1565-1875

Subject(s) - categorical variable, computer science, estimation, value (mathematics), statistics, data mining, machine learning, mathematics, management, economics

In data mining, clustering is one of the major tasks. It aims to group data objects into meaningful classes (clusters) such that the similarity of objects within a cluster is maximized and the similarity of objects from different clusters is minimized. When clustering a dataset, the right number k of clusters to use is often not obvious, and choosing k automatically is a hard algorithmic problem. We use an improved algorithm for learning k while clustering categorical data. The clustering algorithm, Gaussian-means (G-means), is applied within the k-means paradigm and works well for categorical features. To apply this algorithm to a categorical dataset, the dataset is first converted into a numeric dataset. In this paper we present novel heuristic techniques for this conversion and compare the results for categorical data with those for numeric data. The G-means algorithm is based on a statistical test for the hypothesis that a subset of data follows a Gaussian distribution. G-means runs k-means with increasing k in a hierarchical fashion until the test accepts the hypothesis that the data assigned to each k-means center are Gaussian. G-means requires only one intuitive parameter, the standard statistical significance level α.
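The split decision at the heart of G-means can be sketched in a few lines of Python. The abstract does not name the statistical test. This sketch assumes the Anderson-Darling normality test, applied to each cluster's points after projecting them onto the line joining two candidate child centers, as in the original G-means formulation by Hamerly and Elkan. The function names, the deterministic toy data, and the 5% critical value of 0.752 (a commonly tabulated value for the corrected statistic when mean and variance are estimated from the sample) are illustrative assumptions and are not taken from the paper.

```python
import math
from statistics import NormalDist, fmean, stdev


def anderson_darling_statistic(values):
    """Corrected Anderson-Darling statistic for normality (mean and variance estimated)."""
    n = len(values)
    if n < 8:
        raise ValueError("need at least 8 points for a meaningful test")
    mu, sigma = fmean(values), stdev(values)
    if sigma == 0:
        raise ValueError("all values are identical")
    std_normal = NormalDist()
    eps = 1e-12  # keep CDF values away from 0 and 1 so log() stays finite
    z = [min(max(std_normal.cdf((v - mu) / sigma), eps), 1 - eps)
         for v in sorted(values)]
    s = sum((2 * i + 1) * (math.log(z[i]) + math.log(1 - z[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    # Small-sample correction for the case where both parameters are estimated.
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)


def project_onto_split(points, center_a, center_b):
    """Project each point onto the vector joining the two candidate child centers."""
    v = [a - b for a, b in zip(center_a, center_b)]
    norm_sq = sum(c * c for c in v)
    return [sum(p_i * v_i for p_i, v_i in zip(p, v)) / norm_sq for p in points]


def should_split(points, center_a, center_b, critical_value=0.752):
    """Return True when the Gaussian hypothesis is rejected, i.e. the cluster should be split.

    critical_value=0.752 is an assumed 5% threshold; a stricter alpha needs a larger value.
    """
    projected = project_onto_split(points, center_a, center_b)
    return anderson_darling_statistic(projected) > critical_value


# Deterministic toy data built from normal quantiles (hypothetical, for illustration only).
quantiles = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
one_blob = [(q, 0.5 * q) for q in quantiles]
two_blobs = ([(q - 5.0, 0.0) for q in quantiles[::2]]
             + [(q + 5.0, 0.0) for q in quantiles[1::2]])

print(should_split(one_blob, (-1.0, 0.0), (1.0, 0.0)))   # False: looks Gaussian, keep one center
print(should_split(two_blobs, (-5.0, 0.0), (5.0, 0.0)))  # True: bimodal, split into two centers
```

In a full G-means loop, the two candidate child centers would come from running k-means with k = 2 on the points of a single cluster, and the procedure would repeat until no cluster is split. For categorical data, the points would first have to be converted to numeric vectors, a step this sketch does not cover.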
