z-logo
open-access-imgOpen Access
Scientific Documents clustering based on Text Summarization
Author(s) -
Pedram Vahdani Amoli,
Omid Sojoodi Sh.
Publication year - 2015
Publication title -
international journal of electrical and computer engineering (ijece)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.277
H-Index - 22
ISSN - 2088-8708
DOI - 10.11591/ijece.v5i4.pp782-787
Subject(s) - tf–idf , cluster analysis , automatic summarization , computer science , preprocessor , document clustering , sentence , term (time) , data mining , information retrieval , artificial intelligence , physics , quantum mechanics
In this paper a novel method is proposed for scientific document clustering. The proposed method is a summarization-based hybrid algorithm which comprises a preprocessing phase. In the preprocessing phase unimportant words which are frequently used in the text are removed. This process reduces the amount of data for the clustering purpose. Furthermore frequent items cause overlapping between the clusters which leads to inefficiency of the cluster separation. After the preprocessing phase, Term Frequency/Inverse Document Frequency (TFIDF) is calculated for all words and stems over the document to score them in the document. Text summarization is performed then in the sentence level. Document clustering is finally done according to the scores of calculated TFIDF. The hybrid progress of the proposed scheme, from preprocessing phase to document clustering, gains a rapid and efficient clustering method which is evaluated by 400 English texts extracted from scientific databases of 11 different topics. The proposed method is compared with CSSA, SMTC and Max-Capture methods. The results demonstrate the proficiency of the proposed scheme in terms of computation time and efficiency using F-measure criterion.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom