Open Access
Arabic Documents Classification by a Radial Basis Hybridization
Author(s) - Taher Zaki, Driss Mammass, Abdellatif Ennaji, Stéphane Nicolas
Publication year - 2021
Publication title - International Journal of Mathematical Models and Methods in Applied Sciences
Language(s) - English
Resource type - Journals
ISSN - 1998-0140
DOI - 10.46300/9101.2021.15.18
Subject(s) - computer science, search engine indexing, artificial intelligence, natural language processing, semantic similarity, arabic, kernel (algebra), similarity (geometry), information retrieval, pattern recognition (psychology), mathematics, linguistics, philosophy, combinatorics, image (mathematics)
In this paper, we propose a hybrid system for the contextual and semantic indexing of Arabic documents that improves on classical models based on n-grams and the Okapi model. This new approach takes into account the semantic vicinity of terms: similarity between words is computed by hybridizing the n-gram and Okapi statistical measures with a kernel function in order to identify relevant descriptors. Terminological resources such as graphs and semantic dictionaries are integrated into the system to improve the indexing and classification processes.
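
The abstract does not give the formulas, so the following is only a minimal sketch of how an n-gram measure, an Okapi weight and a kernel function might be combined. It assumes a Dice-style distance over character trigrams, a Gaussian radial basis kernel (suggested by the paper's title, not confirmed by the abstract), and a common non-negative variant of Okapi BM25 with the conventional defaults k1 = 1.2 and b = 0.75. All function names, parameters and the similarity threshold are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def char_ngrams(word, n=3):
    """Return the multiset of character n-grams of a word."""
    if len(word) < n:
        # Words shorter than n are treated as a single gram.
        return Counter([word])
    return Counter(word[i:i + n] for i in range(len(word) - n + 1))


def ngram_distance(a, b, n=3):
    """Dice-style distance in [0, 1] between two n-gram profiles (assumed measure)."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    shared = sum((ga & gb).values())          # size of the multiset intersection
    total = sum(ga.values()) + sum(gb.values())
    return 1.0 - 2.0 * shared / total


def rbf_similarity(a, b, sigma=0.5, n=3):
    """Gaussian radial basis kernel applied to the n-gram distance (assumed kernel)."""
    d = ngram_distance(a, b, n)
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


def soft_term_frequency(term, doc_tokens, sigma=0.5, n=3, threshold=0.5):
    """Term frequency in which near matches count by their kernel similarity.

    Tokens whose similarity falls below the (hypothetical) threshold are ignored.
    """
    sims = (rbf_similarity(term, tok, sigma, n) for tok in doc_tokens)
    return sum(s for s in sims if s >= threshold)


def okapi_weight(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Okapi BM25 weight of a term with (possibly fractional) frequency tf."""
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    norm = tf + k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1.0) / norm


# Usage: weight the term "كتاب" (book) in a toy three-token document.
tokens = ["كتاب", "كتابة", "مدرسة"]
tf = soft_term_frequency("كتاب", tokens)
weight = okapi_weight(tf, doc_len=len(tokens), avg_doc_len=3.0, n_docs=10, doc_freq=2)
```

In this sketch the kernel lets morphologically close forms such as "كتاب" and "كتابة" reinforce each other before the Okapi weighting is applied, which is one plausible reading of the semantic vicinity the abstract mentions; the dictionary and graph resources it also refers to are not modelled here.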