Open Access
Similarity Measure Algorithm for Text Document Clustering, Using Singular Value Decomposition
Author(s) - Valentina Adu, Michael Donkor Adane, Kwadwo Asante
Publication year - 2021
Publication title - Current Journal of Applied Science and Technology
Language(s) - English
Resource type - Journals
ISSN - 2457-1024
DOI - 10.9734/cjast/2021/v40i2231475
Subject(s) - singular value decomposition, document clustering, rank (graph theory), cluster analysis, similarity (geometry), computer science, matrix (chemical analysis), data mining, measure (data warehouse), dimension (graph theory), closeness, set (abstract data type), information retrieval, algorithm, mathematics, artificial intelligence, combinatorics, mathematical analysis, materials science, composite material, image (mathematics), programming language
We examined a similarity measure for text document clustering. Data mining is a challenging field with many research and application areas. Text document clustering, a subset of data mining, groups and organizes a large quantity of unstructured text documents into a small number of meaningful clusters. We used an algorithm that works by calculating the degree of closeness between documents from their document matrix to query the terms in each document. We also determined whether the documents in a given set are similar to or different from one another when these terms are queried. We found that the ability to rank and approximate documents using a matrix allows Singular Value Decomposition (SVD) to serve as an enhanced text data mining algorithm. Applying SVD to a high-dimensional matrix also yields a lower-dimensional matrix that exposes the relationships in the original matrix by ordering its components from the most variation to the least.
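
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify in detail. It assumes raw term counts, whitespace tokenization, cosine similarity as the closeness measure, and a rank of k = 2. The helper names and the four toy documents are hypothetical, and only NumPy's `np.linalg.svd` is used.

```python
import numpy as np


def build_term_document_matrix(docs):
    """Return (vocab, A), where A[i, j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {term: i for i, term in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            A[index[tok], j] += 1.0
    return vocab, A


def svd_document_vectors(A, k):
    """Represent each document in k dimensions using a truncated SVD of A."""
    # np.linalg.svd returns singular values sorted from largest to smallest,
    # so the first k components carry the most variation.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    doc_vectors = (np.diag(s[:k]) @ Vt[:k, :]).T  # one row per document
    return doc_vectors, s


def cosine_similarity_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # avoid division by zero for empty rows
    Xn = X / norms
    return Xn @ Xn.T


# Hypothetical toy corpus: two documents about SVD, two about clustering.
docs = [
    "matrix decomposition singular value",
    "singular value decomposition of a matrix",
    "clustering text documents",
    "text documents clustering groups",
]

vocab, A = build_term_document_matrix(docs)
doc_vectors, singular_values = svd_document_vectors(A, k=2)
similarity = cosine_similarity_matrix(doc_vectors)

print(np.round(singular_values, 3))
print(np.round(similarity, 3))
```

In this toy corpus the first two documents share no terms with the last two, so the reduced representation should place each pair close together and the two pairs far apart. A real corpus would normally need term weighting (for example TF-IDF) and a deliberate choice of k, neither of which is stated in the abstract.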
