A Fuzzy Approach for Text Mining
Author(s) - Deepa B. Patil, Yashwant Dongre
Publication year - 2015
Publication title - International Journal of Mathematical Sciences and Computing
Language(s) - English
Resource type - Journals
eISSN - 2310-9033
pISSN - 2310-9025
DOI - 10.5815/ijmsc.2015.04.04
Subject(s) - cluster analysis, fuzzy clustering, data mining, computer science, single linkage clustering, correlation clustering, cure data clustering algorithm, flame clustering, pattern recognition (psychology), clustering high dimensional data, fuzzy logic, canopy clustering algorithm, document clustering, cluster (spacecraft), artificial intelligence, k medians clustering, programming language
Document clustering is an integral part of text mining. There are two types of clustering: hard clustering and soft clustering. In hard clustering, each data item belongs to exactly one cluster, whereas in soft clustering a data point may fall into more than one cluster. Soft clustering thus leads to fuzzy clustering, in which each data point is associated with a membership function that expresses the degree to which it belongs to each cluster. Information retrieval demands accuracy, which fuzzy clustering can help achieve. In the work presented here, a fuzzy approach to text classification is used to assign documents to appropriate clusters with the Fuzzy C-Means (FCM) clustering algorithm. The Enron email dataset is used for the experiments: emails are grouped into clusters with FCM, and the results are compared with the output of the k-means clustering algorithm. The comparative study showed that the fuzzy clusters are more appropriate than the hard clusters.
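The difference between fuzzy and hard assignment can be illustrated with a minimal Fuzzy C-Means sketch. This is not the authors' implementation: it uses the standard FCM update rules, a fuzzifier of m = 2 (an assumed value), and five hypothetical two-term frequency vectors in place of real email features. The function name `fuzzy_c_means` and all parameters are illustrative.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard FCM on a list of equal-length numeric vectors.

    Returns (centers, u), where u[i][j] is the degree to which
    point i belongs to cluster j; each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from Euclidean distances to the centres.
        shift = 0.0
        for i in range(n):
            dist = [
                sum((points[i][d] - centers[j][d]) ** 2
                    for d in range(dim)) ** 0.5
                for j in range(c)
            ]
            zeros = [j for j in range(c) if dist[j] == 0.0]
            if zeros:
                # Point sits exactly on a centre: split membership there.
                new = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [dj ** (-2.0 / (m - 1.0)) for dj in dist]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Hypothetical term-frequency vectors: two documents per topic,
# plus one document that uses both terms equally.
docs = [[5, 0], [4, 1], [0, 5], [1, 4], [3, 3]]
centers, u = fuzzy_c_means(docs, c=2)

# A hard assignment keeps only the strongest membership per document.
hard = [row.index(max(row)) for row in u]

for doc, row, label in zip(docs, u, hard):
    print(doc, [round(v, 2) for v in row], "hard label:", label)
```

In this toy setting the mixed document receives a substantial membership in both clusters, while a hard assignment would force it into one of them and discard that information.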