
A New Approach of Documents Indexing Using Subject Modelling and Summarization
Author(s) -
Latifa Ouadif,
Rachid El Ayachi,
Mohamed Biniz
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1743/1/012032
Subject(s) - search engine indexing , automatic summarization , computer science , information retrieval , automatic indexing , weighting , field (mathematics) , process (computing) , subject (documents) , representation (politics) , point (geometry) , data mining , world wide web , programming language , medicine , mathematics , geometry , politics , political science , pure mathematics , law , radiology
Document indexing is a field of research in Natural Language Processing (NLP) that has been evolving rapidly for 70 years. It is an operation that produces a synthetic representation of a document according to a model in order to facilitate its subsequent use. This work is concerned with document indexing and aims to accelerate the indexing process for large document datasets. Two points are addressed. The first concerns the development of a document indexing system whose operating process is based on three phases, namely pre-processing, weighting, and subject modelling. The second concerns the proposal of a new system that integrates a newly developed automatic summarization subsystem, with the goal of minimizing indexing time.
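
As a rough illustration of the process described above, the following Python sketch chains pre-processing, TF-IDF weighting, and an optional summarization step that shortens each document before it is indexed. It is not the authors' system: the stop-word list, the frequency-based extractive summarizer, the choice of TF-IDF as the weighting scheme, and all function names are assumptions made for this example, and the subject modelling phase is left out because it would normally rely on a dedicated model.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def preprocess(text):
    """Pre-processing: lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def summarize(text, max_sentences=1):
    """Assumed extractive summarizer: keep the sentences whose terms are
    most frequent in the document, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(preprocess(text))
    ranked = sorted(
        sentences,
        key=lambda s: sum(freq[t] for t in preprocess(s)),
        reverse=True,
    )
    keep = set(ranked[:max_sentences])
    return " ".join(s for s in sentences if s in keep)


def tf_idf(docs):
    """Weighting: return one {term: weight} dictionary per document."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty documents
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


def build_index(docs, use_summary=False):
    """Index the full documents, or their summaries when use_summary is True."""
    if use_summary:
        docs = [summarize(d) for d in docs]
    return tf_idf(docs)
```

Calling `build_index(docs, use_summary=True)` weights only the terms that survive summarization, so later phases handle fewer terms per document. That is the intuition behind the reduced indexing time, although the actual saving depends on the summarizer and the corpus.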