Role of Pre-processing Phase in Document Clustering Technique for Gurmukhi Script

Open Access

Author(s) - Mukesh Kumar, Amandeep Verma

Publication year - 2020

Publication title - International Journal of Innovative Technology and Exploring Engineering

Language(s) - English

Resource type - Journals

ISSN - 2278-3075

DOI - 10.35940/ijitee.c9105.019320

Subject(s) - computer science, cluster analysis, document clustering, artificial intelligence, normalization (sociology), document processing, pattern recognition (psychology), natural language processing, data mining, sociology, anthropology

Document clustering plays a central role in knowledge discovery and data mining by organizing large data sets into a number of groups of data objects called clusters. Each cluster consists of similar data objects, such that objects in the same cluster are more similar to one another than to objects in other clusters. The document clustering technique for Gurmukhi script consists of two phases: 1) the pre-processing phase and 2) the processing phase. This paper concentrates on the pre-processing phase, whose purpose is to convert unstructured text into a structured text format. Its sub-phases are segmentation, tokenization, removal of stop words, stemming, and normalization. The purpose of this paper is to present the significant role of the pre-processing phase in the overall performance of the document clustering technique for Gurmukhi script. The experimental results demonstrate this role both in the assignment of data objects to the relevant clusters and in the creation of a meaningful cluster title list.
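
The five sub-phases named in the abstract can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word set and the single suffix rule are tiny hypothetical samples, the function names are invented for this sketch, and normalization is assumed here to mean Unicode NFC normalization, which the abstract does not specify.

```python
import re
import unicodedata

# Illustrative, deliberately tiny resources (assumptions, not the paper's lists).
STOP_WORDS = {"ਦਾ", "ਦੇ", "ਦੀ", "ਹੈ", "ਹਨ", "ਅਤੇ"}
# Hypothetical suffix list: kanna + bindi, a common plural ending.
SUFFIXES = ("\u0A3E\u0A02",)


def segment(text):
    """Segmentation: split text into sentences on danda, double danda, '?' and '!'."""
    parts = re.split(r"[\u0964\u0965?!]", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Tokenization: keep runs of code points from the Gurmukhi block (U+0A00-U+0A7F)."""
    return re.findall(r"[\u0A00-\u0A7F]+", sentence)


def remove_stop_words(tokens):
    """Stop-word removal: drop tokens found in the sample stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]


def stem(token):
    """Stemming: strip the first matching suffix, keeping at least two code points."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 1:
            return token[: -len(suffix)]
    return token


def normalize(token):
    """Normalization (assumed): bring the token to Unicode NFC form."""
    return unicodedata.normalize("NFC", token)


def preprocess(text):
    """Run the sub-phases in the order listed in the abstract, one token list per sentence."""
    result = []
    for sentence in segment(text):
        tokens = remove_stop_words(tokenize(sentence))
        result.append([normalize(stem(t)) for t in tokens])
    return result


if __name__ == "__main__":
    print(preprocess("ਕਿਤਾਬਾਂ ਹਨ। ਰਾਮ ਹੈ।"))
```

The output is a structured representation (one token list per sentence) that a later processing phase could turn into feature vectors for clustering. A realistic system would need a full Punjabi stop-word list and a much richer set of stemming rules.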
