Bulgarian sense-annotated corpus – between the tradition and novelty
Author(s) - Svetla Koeva
Publication year - 2015
Publication title - Cognitive Studies | Études cognitives
Language(s) - English
Resource type - Journals
eISSN - 2392-2397
pISSN - 2080-7147
DOI - 10.11649/cs.2012.012
Subject(s) - bulgarian , wordnet , annotation , computer science , natural language processing , schema (genetic algorithms) , root (linguistics) , synonym (taxonomy) , artificial intelligence , annotated bibliography , information retrieval , linguistics , philosophy , botany , biology , genus , library science
The Bulgarian Sense-annotated Corpus (BulSemCor) is compiled according to the general methodology established by the SemCor project. It is a subset of the Brown Corpus of Bulgarian in which lexical units are semantically annotated with the corresponding synonym set (synset) in the Bulgarian wordnet. Unlike the bulk of sense-annotated corpora, where only (sets of) content words are annotated, in BulSemCor each lexical unit has been assigned a sense. The main contributions of the work on BulSemCor are briefly described in the paper: definition of an annotation schema, compilation of an input corpus, development of a sense-annotated corpus, and enlargement of the Bulgarian wordnet.