Text Mining of Research Articles Using Clustering Approach | Zendy

Deepti Dominic | Zendy; R Jyothsna | Zendy


Open Access


Author(s) - Deepti Dominic, R Jyothsna

Publication year - 2021

Publication title - International Journal of Advanced Research in Science Communication and Technology

Language(s) - English

Resource type - Journals

ISSN - 2581-9429

DOI - 10.48175/ijarsct-1350

Subject(s) - computer science , cluster analysis , tf–idf , support vector machine , random forest , classifier (uml) , data mining , vector space model , hierarchical clustering , artificial intelligence , information retrieval , pattern recognition (psychology) , term (time) , physics , quantum mechanics

The volume of research articles published across research fields is growing rapidly, and tracking down an appropriate article in a large research archive is time-consuming. Clustering research articles by their respective domains plays an important role in helping researchers retrieve articles faster. Hence, a commonly practiced search mechanism, namely domain name search, has been applied to retrieve appropriate documents and articles. When documents from new domains are added to the repository, keywords have to be spotted and the corresponding domains boosted for proper classification. The classification techniques random forest classifier, SVM, and TF-IDF have been used to classify articles and to compare their processing times. TF-IDF (Term Frequency-Inverse Document Frequency) has further been proposed to transform the corpus into a vector space model. The clustering algorithms K-Means and hierarchical clustering have been used to cluster the articles. Finally, the processing time of SVM is better than that of the random forest classifier and TF-IDF, and K-Means gives a better understanding than the hierarchical algorithm.
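To make the TF-IDF and K-Means steps concrete, the following is a minimal sketch in Python using only the standard library. It is not the authors' implementation. The toy documents, the smoothed IDF formula, the function names, and the deterministic farthest-first seeding are all assumptions made for illustration. The clustering shown is a spherical K-Means variant (cosine similarity on length-normalized vectors) and not necessarily the exact K-Means configuration used in the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to L2-normalized sparse TF-IDF vectors (dicts)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; other variants exist).
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: (c / len(doc)) * idf[t] for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine similarity for unit vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def mean_vector(members):
    """Length-normalized sum of the member vectors, used as the new centroid."""
    total = Counter()
    for v in members:
        for t, w in v.items():
            total[t] += w
    norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
    return {t: w / norm for t, w in total.items()}


def kmeans(vectors, k, max_iter=20):
    """Spherical K-Means with deterministic farthest-first seeding (illustrative)."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the vector whose best similarity to any chosen centroid is lowest.
        idx = min(range(len(vectors)),
                  key=lambda i: max(dot(vectors[i], c) for c in centroids))
        centroids.append(vectors[idx])
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda j: dot(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean_vector(members)
    return labels


# Hypothetical toy corpus: three machine-learning titles, three genomics titles.
raw_docs = [
    "support vector machine classifier margin kernel",
    "random forest classifier decision trees",
    "kernel support vector machine training",
    "protein gene expression sequencing",
    "gene sequencing genome protein analysis",
    "genome expression protein cells",
]
docs = [text.split() for text in raw_docs]
vectors = tfidf_vectors(docs)
labels = kmeans(vectors, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

On this toy corpus the two domains share no vocabulary, so the split is trivially clean. On real abstracts, stop-word removal, stemming, and a more careful choice of the number of clusters would typically be needed.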
