
Clustering of the Multi-Value Documents based on Probabilistic Features Association Mechanism
Author(s) - P. Gopala Krishna, D Lalitha Bhaskari
Publication year - 2019
Publication title - International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.a4538.119119
Subject(s) - cluster analysis , computer science , data mining , probabilistic logic , feature selection , similarity (geometry) , feature (linguistics) , clustering high dimensional data , artificial intelligence , categorization , multivariate statistics , machine learning , philosophy , linguistics , image (mathematics)
Clustering multi-valued data is becoming increasingly difficult in data mining because individual features can take multiple interval values. Identifying a clustering model suited to such disguised multi-valued data in data analysis applications remains an open problem. The difficulty stems mainly from the problem of identifying class information and from the multiple values associated with each individual feature. To address this, the paper proposes a feature selection method based on the probabilistic features association mechanism (PFAM). The work explores unsupervised feature selection by computing a probabilistic association score and reforming the multi-value data for effective clustering in multivariate datasets. By minimizing a reformation clustering error, the approach preserves both the degree of similarity and the categorization information of the original data contents. The proposed approach is evaluated using clustering purity and Normalized Mutual Information on multivariate document datasets. The experimental evaluation shows the improvement achieved by the proposed approach.
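The two evaluation measures named in the abstract, clustering purity and Normalized Mutual Information, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `purity` and `nmi` and the toy labels are hypothetical, and the arithmetic-mean normalization of the two entropies is an assumption, since the abstract does not say which NMI variant was used.

```python
from collections import Counter
from math import log


def purity(true_labels, cluster_ids):
    """Fraction of documents that belong to the majority class of their cluster."""
    per_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(true_labels)


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((k / n) * log(k / n) for k in Counter(labels).values())


def nmi(true_labels, cluster_ids):
    """Mutual information divided by the mean of the two entropies (assumed variant)."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_ids))
    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    mutual_info = sum(
        (k / n) * log(k * n / (class_sizes[label] * cluster_sizes[cluster]))
        for (label, cluster), k in joint.items()
    )
    mean_entropy = (entropy(true_labels) + entropy(cluster_ids)) / 2
    if mean_entropy == 0:
        # Both assignments put everything in one group, so they agree trivially.
        return 1.0
    return mutual_info / mean_entropy


# Toy example: six documents, two true classes, two predicted clusters.
true_labels = ["a", "a", "a", "b", "b", "b"]
cluster_ids = [0, 0, 1, 1, 1, 1]
print(purity(true_labels, cluster_ids))  # 5 of 6 documents match their cluster's majority class
print(nmi(true_labels, cluster_ids))
```

Both scores lie in the range 0 to 1, with higher values indicating closer agreement between the clusters and the reference classes. Purity tends to rise as the number of clusters grows, which is one reason it is commonly reported together with NMI.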