A Frequent Concepts Based Document Clustering Algorithm

Rekha Baghel; Renu Dhir


Open Access

A Frequent Concepts Based Document Clustering Algorithm

Author(s) - Rekha Baghel, Renu Dhir

Publication year - 2010

Publication title - International Journal of Computer Applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/826-1171

Subject(s) - computer science, cluster analysis, document clustering, WordNet, data mining, exploit, canopy clustering algorithm, Brown clustering, scalability, information retrieval, artificial intelligence, correlation clustering, database, computer security

This paper presents a novel technique for document clustering based on frequent concepts. The proposed technique, FCDC (Frequent Concepts based Document Clustering), is a clustering algorithm that works with frequent concepts rather than the frequent items used in traditional text mining techniques. Many well-known clustering algorithms treat documents as bags of words and ignore important relationships between words, such as synonymy. The proposed FCDC algorithm uses the semantic relationships between words to create concepts. It exploits the WordNet ontology to create a low-dimensional feature vector, which allows us to develop an efficient clustering algorithm. It uses a hierarchical approach to cluster text documents that share common concepts. FCDC was found to be more accurate, scalable and effective than existing clustering algorithms such as bisecting k-means, UPGMA and FIHC.
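
The idea of counting concepts instead of raw words can be illustrated with a minimal sketch. This is not the paper's FCDC implementation: the synonym table below is a hypothetical, hand-written stand-in for WordNet lookups, the function names and the `min_support` parameter are assumptions made for illustration, and the grouping step is flat rather than hierarchical.

```python
from collections import Counter

# Hypothetical toy synonym table standing in for WordNet (an assumption,
# not part of the paper). Each word maps to one shared concept label.
SYNONYM_TO_CONCEPT = {
    "car": "vehicle", "automobile": "vehicle", "auto": "vehicle",
    "doctor": "physician", "physician": "physician", "medic": "physician",
}

def to_concepts(text):
    """Return the set of concepts in a document; unknown words map to themselves."""
    return {SYNONYM_TO_CONCEPT.get(tok, tok) for tok in text.lower().split()}

def frequent_concepts(docs, min_support):
    """Return concepts that occur in at least a min_support fraction of documents."""
    counts = Counter(c for d in docs for c in to_concepts(d))
    threshold = min_support * len(docs)
    return {c for c, n in counts.items() if n >= threshold}

def group_by_concept(docs, min_support):
    """Group document indices under each frequent concept they contain."""
    frequent = frequent_concepts(docs, min_support)
    groups = {c: [] for c in frequent}
    for i, d in enumerate(docs):
        for c in to_concepts(d) & frequent:
            groups[c].append(i)
    return groups

docs = [
    "car repair shop",
    "automobile repair manual",
    "doctor visit",
    "physician visit notes",
]
groups = group_by_concept(docs, min_support=0.5)
```

In this toy corpus, "car" and "automobile" each appear in only one document, so neither would be frequent as a raw word at a support of 0.5. Once both are mapped to the same concept, documents 0 and 1 fall into one group, which is the kind of synonym relationship a bag-of-words representation misses.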
