Mechanisms for optimization of detection and correction of text errors based on combining multilevel morphological analysis with n-gram models
Author(s) -
Isroil I. Jumanov,
Karshiev Khusan
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1546/1/012082
Subject(s) - computer science, redundancy (engineering), spelling, natural language processing, reliability (semiconductor), artificial intelligence, basis (linear algebra), representation (politics), knowledge base, set (abstract data type), information retrieval, mathematics, linguistics, philosophy, power (physics), physics, geometry, quantum mechanics, politics, political science, law, programming language, operating system
The article formulates the problem of increasing information reliability in electronic document management systems and develops mechanisms for detecting and correcting spelling and semantic errors by combining multilevel morphological analysis with n-gram models and standard search, recognition, and classification tools. Mechanisms are proposed for verifying the spelling of a word from a vector representation of variables and comparison with a reference analogue, following the principles of exploiting statistical, natural, structural, technological, and semantic information redundancy. Solutions for increasing information reliability are obtained using a set of keywords, phrases, and terms compared against virtual and frequency dictionaries located in the electronic document database and knowledge base. A technique is developed for optimizing the mechanisms that control and correct spelling errors, based on logical, semantic, and structural-technological links and cross-relationships between individual words, groups of words, and phrases in the text. The resulting tools for increasing the reliability of electronic document texts were tested under real conditions, and the results were compared with the conclusions of the system experts.
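The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of checking a word against a frequency dictionary and ranking correction candidates by character n-gram overlap. The function names, the Dice similarity measure, the 0.5 threshold, and the toy dictionary are all assumptions introduced for illustration; the morphological analysis and semantic checks described in the article are not modeled here.

```python
from collections import Counter


def char_ngrams(word, n=2):
    """Return the character n-grams of a word, padded with boundary markers."""
    padded = "^" + word.lower() + "$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def dice_similarity(a, b, n=2):
    """Dice coefficient over character n-gram multisets (1.0 means identical)."""
    grams_a, grams_b = Counter(char_ngrams(a, n)), Counter(char_ngrams(b, n))
    overlap = sum((grams_a & grams_b).values())
    total = sum(grams_a.values()) + sum(grams_b.values())
    return 2 * overlap / total if total else 0.0


def check_word(word, freq_dict, threshold=0.5):
    """Return (is_known, suggestion) for a single word.

    A word found in the frequency dictionary is accepted as-is. Otherwise the
    dictionary entry with the highest n-gram similarity is suggested, with
    ties broken by frequency; no suggestion is made below the threshold
    (an assumed cut-off).
    """
    w = word.lower()
    if w in freq_dict:
        return True, w
    best = max(
        freq_dict,
        key=lambda cand: (dice_similarity(w, cand), freq_dict[cand]),
        default=None,
    )
    if best is not None and dice_similarity(w, best) >= threshold:
        return False, best
    return False, None


# Hypothetical frequency dictionary: word -> count in a document collection.
FREQ_DICT = {"document": 120, "reliability": 45, "spelling": 60, "dictionary": 30}

if __name__ == "__main__":
    for token in ["documnet", "spelling", "xyz"]:
        print(token, check_word(token, FREQ_DICT))
```

In a fuller system the candidate set would presumably be narrowed first by morphological analysis, and word-level n-gram statistics could be used to catch errors that produce valid but contextually wrong words.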