Open Access
THE PROBLEM OF ESTABLISHING THE COMPLETENESS OF THE COVERAGE OF THE DISSERTATION RESEARCH RESULTS BY GRADUATES
Author(s) -
Петро Лізунов,
Andrii Biloshchytskyi,
Alexander Kuchansky,
Yurii Andrashko,
Tamara Liashchenko
Publication year - 2021
Publication title -
upravlìnnâ rozvitkom skladnih sistem
Language(s) - English
Resource type - Journals
eISSN - 2412-9933
pISSN - 2219-5300
DOI - 10.32347/2412-9933.2021.47.102-108
Subject(s) - completeness (order theory), computer science, probabilistic logic, thematic map, set (abstract data type), thematic structure, information retrieval, data science, artificial intelligence, natural language processing, mathematics, programming language, mathematical analysis, cartography, geography
The paper describes how latent semantic analysis can be applied to establish how completely applicants for scientific degrees have covered the results of their dissertation research in their publications. To this end, two tasks were set and completed: a review of the probabilistic thematic model for representing text documents, in particular scientific papers, by subject-specific terms expressed as n-grams; and a formal description of the probabilistic thematic model for the problem of establishing how completely an author's dissertation research materials are covered in the author's scientific articles. The distinguishing features of the model for this problem are its training procedure and a special regularizer. The model outputs a matrix relating topics, which are defined by segments of the author's dissertation abstracts, to documents, which are the author's publications. The application of this model to this problem has not previously been described. The problem rests on maximizing a likelihood function, which is an ill-posed problem. Only appropriate regularizers are used to make the problem well-posed; other methods of doing so were not considered. One limitation of the study is the canonization of texts in different languages: this study uses textual information in Ukrainian. Further research will propose reducing texts to a single language base, in particular because canonization tools for English texts are more capable, especially for scientific publications. Another limitation is the difficulty of obtaining full texts of dissertations for complete verification of the model.
The research results are combined with a system for detecting incomplete duplicates in scientific documents, particularly dissertations submitted for a degree.
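
The kind of model the abstract describes can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the training procedure or the form of the regularizer. The sketch assumes a PLSA-style topic model fitted by EM, with a smoothing regularizer (weight `tau`, a hypothetical parameter) that pulls each topic toward the term distribution of one dissertation abstract segment. The function name `fit_topic_coverage`, the toy counts, and the four-term vocabulary are all illustrative assumptions.

```python
import numpy as np


def fit_topic_coverage(n_wd, beta_wt, tau=10.0, n_iter=50, eps=1e-12):
    """Regularized PLSA-style EM (an illustrative assumption, not the paper's method).

    n_wd    : (W, D) term counts, one column per publication.
    beta_wt : (W, T) term distributions, one column per abstract segment
              (each column sums to 1); these define the topics.
    tau     : weight of the smoothing regularizer pulling topics toward beta_wt.
    Returns (phi, theta), where theta is the (T, D) topic-by-document matrix.
    """
    n_topics = beta_wt.shape[1]
    n_docs = n_wd.shape[1]

    # Start each topic at its segment distribution; start theta uniform.
    phi = beta_wt + eps
    phi /= phi.sum(axis=0, keepdims=True)
    theta = np.full((n_topics, n_docs), 1.0 / n_topics)

    for _ in range(n_iter):
        # E-step (folded in): p(w|d) = sum_t phi[w, t] * theta[t, d]
        p_wd = phi @ theta + eps
        ratio = n_wd / p_wd
        n_wt = phi * (ratio @ theta.T)   # expected term-topic counts
        n_td = theta * (phi.T @ ratio)   # expected topic-document counts

        # M-step with the smoothing regularizer added to the topic counts.
        phi = n_wt + tau * beta_wt
        phi /= phi.sum(axis=0, keepdims=True)
        theta = n_td / np.maximum(n_td.sum(axis=0, keepdims=True), eps)

    return phi, theta


# Toy data (made up): 4 terms, 2 abstract segments, 2 publications.
beta = np.array([[0.5, 0.0],
                 [0.5, 0.0],
                 [0.0, 0.5],
                 [0.0, 0.5]])
counts = np.array([[3.0, 0.0],
                   [2.0, 0.0],
                   [0.0, 4.0],
                   [0.0, 1.0]])

phi, theta = fit_topic_coverage(counts, beta)
print(np.round(theta, 3))
```

In this toy setup, each row of `theta` corresponds to one abstract segment and each column to one publication. A segment whose row stays close to zero across all columns would suggest that part of the dissertation is not covered by any publication. How such a threshold should be chosen is not stated in the abstract.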
