
DIFFERENTIAL SEMANTIC SKETCHES FOR RUSSIAN INTERNET-CORPORA
Author(s) - Julia Detkova, Valeriy I. Novitskiy, Maria Petrova, V. Selegey
Affiliation(s) - ABBYY Lab MIPT, ABBYY
Publication year - 2020
Publication title - kompʹûternaâ lingvistika i intellektualʹnye tehnologii
Language(s) - English
Resource type - Conference proceedings
ISSN - 2075-7182
DOI - 10.28995/2075-7182-2020-19-211-227
Subject(s) - computer science, parsing, natural language processing, markup language, the internet, artificial intelligence, word (group theory), semantic role labeling, semantics (computer science), information retrieval, representation (politics), linguistics, world wide web, xml, programming language, philosophy, politics, political science, law, sentence
This paper proposes a new type of representation for word collocations: semantic sketches. The approach was first tested on one of the subcorpora of the General Internet-Corpus of Russian. Semantic sketches extend the idea of word sketches, which are based on grammatical relations between words, by adding semantic information: word meanings and semantic relations between words. The sketches can also be supplemented with metatextual characteristics. Building such sketches requires semantic markup of the corpus, so we used the partial semantic analysis provided by the Compreno parser. The paper presents examples of the sketches, evaluates the quality of the markup they are based on, and discusses the advantages and disadvantages of the approach.
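To make the idea of a semantic sketch more concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes that a parser has already produced, for each collocation, a grammatical relation, a semantic class for the collocate, and a semantic relation to the headword. The `Collocation` record, the `build_semantic_sketch` function, the label names, and the toy data are all hypothetical and do not reflect the actual Compreno tag inventory or output format.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Collocation(NamedTuple):
    """One parsed co-occurrence (hypothetical record layout)."""
    headword: str        # lemma the sketch is built for
    relation: str        # grammatical relation, as in a classic word sketch
    collocate: str       # lemma of the co-occurring word
    semantic_class: str  # meaning of the collocate (made-up label)
    semantic_role: str   # semantic relation to the headword (made-up label)


def build_semantic_sketch(rows, headword):
    """Group collocates of `headword` by (grammatical, semantic) relation.

    Returns a mapping from (relation, semantic_role) to a Counter of
    (collocate, semantic_class) pairs with their frequencies.
    """
    sketch = defaultdict(Counter)
    for row in rows:
        if row.headword == headword:
            key = (row.relation, row.semantic_role)
            sketch[key][(row.collocate, row.semantic_class)] += 1
    return sketch


# Toy input; real data would come from a semantically annotated corpus.
rows = [
    Collocation("read", "object", "book", "TEXT", "Object"),
    Collocation("read", "object", "book", "TEXT", "Object"),
    Collocation("read", "object", "letter", "TEXT", "Object"),
    Collocation("read", "subject", "student", "PERSON", "Agent"),
    Collocation("write", "object", "letter", "TEXT", "Object"),
]

sketch = build_semantic_sketch(rows, "read")
for (relation, role), collocates in sketch.items():
    print(relation, role, collocates.most_common())
```

Keying each group on both the grammatical and the semantic relation is one possible design choice: it keeps the familiar word-sketch layout while letting collocates with the same surface form but different meanings be counted separately. Metatextual characteristics could be added as further fields of the record under the same assumptions.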