
Blind Source Separation for Text Mining
Author(s) -
Abdelghani Ghazdali,
Abdelmoutalib Metrane,
Amal Ourdou
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1743/1/012018
Subject(s) - computer science, blind signal separation, weighting, independent component analysis, cluster analysis, data mining, document clustering, artificial intelligence, information retrieval, separation (statistics), natural language processing, pattern recognition (psychology), machine learning, medicine, computer network, channel (broadcasting), radiology
Blind Source Separation (BSS) was originally developed for signal processing applications. Independent Component Analysis (ICA), the technique used to separate independent sources, has proven to be a powerful tool for analyzing text document data as well, provided the documents are represented in a suitable numerical form. This opens up new possibilities for the automatic analysis of large textual databases: detecting the topics present in a corpus and grouping the documents accordingly, in other words, clustering the documents. Two text mining tasks are thus achieved at the same time with a single algorithm. In our study, we use an appropriate BSS approach together with a new weighting distance to transform the textual data and achieve a higher level of accuracy.
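
The general idea of applying ICA to text can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the BSS approach or the new weighting distance, so the code below assumes plain TF-IDF weighting and a textbook symmetric FastICA with a tanh nonlinearity as stand-ins. The function names (`tfidf`, `decorrelate`, `fast_ica`) and the toy corpus are hypothetical and exist only for illustration.

```python
import numpy as np

def tfidf(docs):
    """Build a (terms x documents) TF-IDF matrix from whitespace-tokenized text."""
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({w for t in tokens for w in t})
    row = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(docs)))
    for j, t in enumerate(tokens):
        for w in t:
            counts[row[w], j] += 1.0
    df = (counts > 0).sum(axis=1, keepdims=True)        # document frequency per term
    idf = np.log((1.0 + len(docs)) / (1.0 + df)) + 1.0  # smoothed IDF (an assumption)
    return counts * idf, vocab

def decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W, which makes W orthogonal."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return (vecs / np.sqrt(np.clip(vals, 1e-12, None))) @ vecs.T @ W

def fast_ica(X, k, n_iter=200, tol=1e-6, seed=0):
    """Estimate k independent components from X (features x samples)."""
    n = X.shape[1]
    Xc = X - X.mean(axis=1, keepdims=True)
    # Reduce to k dimensions and whiten in one step via the SVD.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = np.sqrt(n) * Vt[:k]                             # Z @ Z.T / n is the identity
    rng = np.random.default_rng(seed)
    W = decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        # Fixed-point update: E[g(y) z^T] - E[g'(y)] W, then re-orthogonalize.
        W_new = (G @ Z.T) / n - (1.0 - G ** 2).mean(axis=1)[:, None] * W
        W_new = decorrelate(W_new)
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return W @ Z                                        # sources: (k x samples)

# Toy corpus, invented for illustration only.
docs = [
    "signal noise filter sensor signal",
    "sensor filter noise mixing",
    "mixing signal sensor noise",
    "document topic word corpus document",
    "corpus word topic cluster",
    "cluster document word topic",
]
X, vocab = tfidf(docs)
S = fast_ica(X, k=2)
# Each document is assigned to the component on which it loads most strongly.
labels = np.argmax(np.abs(S), axis=0)
print(labels)
```

In this sketch each row of `S` plays the role of a latent topic and each column is a document, so a single decomposition yields both the topic components and a cluster label per document. With only six documents the estimate carries no statistical weight; the example is meant to show the data flow, not to reproduce any reported accuracy.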