Open Access
AN ABSTRACT MODEL OF SEARCH INDEX QUERY IN THE RUSSIAN NATIONAL CORPUS
Author(s) - D. A. Morozov, S. Gladilin
Publication year - 2020
Publication title - kompʹûternaâ lingvistika i intellektualʹnye tehnologii
Language(s) - English
Resource type - Conference proceedings
ISSN - 2075-7182
DOI - 10.28995/2075-7182-2020-19-1109-1116
Subject(s) - code refactoring, computer science, index (typography), query expansion, query optimization, sargable, query language, web search query, basis (linear algebra), code (set theory), web query classification, scheme (mathematics), information retrieval, data mining, programming language, search engine, software, mathematics, mathematical analysis, geometry, set (abstract data type)
The paper discusses the so-called “bag problem,” which affects search accuracy in the Russian National Corpus (RNC). Solving the problem requires changing the search index data scheme used in the RNC, which in turn requires a significant refactoring of the RNC program code. We propose to base this refactoring on an abstract model of the search index query, which separates query formation from query implementation. In an experiment, one of the RNC program modules was decomposed, and the result confirmed that the constructed model is sufficiently expressive. Directions for further work are outlined.
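
The abstract does not describe the model itself, so the following is only a minimal sketch of the general idea of separating query formation from query implementation. It is not the authors' model. Every name in it (`Term`, `And`, `Or`, `IndexBackend`, `InMemoryIndex`, `evaluate`) is hypothetical, and the in-memory index stands in for whatever storage a real system would use.

```python
from dataclasses import dataclass
from typing import Protocol, Union


# --- Query formation: plain data describing WHAT to search for. ---
# All node types here are hypothetical illustrations.

@dataclass(frozen=True)
class Term:
    """Match documents in which `attribute` has the given `value`."""
    attribute: str
    value: str


@dataclass(frozen=True)
class And:
    """Match documents that satisfy both sub-queries."""
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    """Match documents that satisfy at least one sub-query."""
    left: "Query"
    right: "Query"


Query = Union[Term, And, Or]


# --- Query implementation: HOW a query is answered by an index. ---

class IndexBackend(Protocol):
    """Assumed minimal interface that any index implementation provides."""

    def postings(self, attribute: str, value: str) -> set[int]:
        ...


class InMemoryIndex:
    """Toy backend: maps (attribute, value) pairs to sets of document ids."""

    def __init__(self, entries: dict[tuple[str, str], set[int]]) -> None:
        self._entries = entries

    def postings(self, attribute: str, value: str) -> set[int]:
        # Return a copy so callers cannot mutate the stored postings.
        return set(self._entries.get((attribute, value), set()))


def evaluate(query: Query, backend: IndexBackend) -> set[int]:
    """Evaluate an abstract query against any backend with `postings`."""
    if isinstance(query, Term):
        return backend.postings(query.attribute, query.value)
    if isinstance(query, And):
        return evaluate(query.left, backend) & evaluate(query.right, backend)
    if isinstance(query, Or):
        return evaluate(query.left, backend) | evaluate(query.right, backend)
    raise TypeError(f"unsupported query node: {query!r}")
```

In a design like this, code that builds `Query` objects never touches the index layout, so the index data scheme could in principle be replaced by providing a different `IndexBackend` without changing the query-building code.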
