Open Access
Automatic Processing of Natural-Language Electronic Texts with NooJ
Author(s) - Tatsiana Okrut, Yuras Hetsevich, Max Silberztein, Hanna Stanislavenka
Publication year - 2016
Publication title - Communications in Computer and Information Science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.16
H-Index - 51
eISSN - 1865-0937
pISSN - 1865-0929
DOI - 10.1007/978-3-319-42471-2
Subject(s) - computer science , library science , natural (archaeology) , information retrieval , world wide web , natural language processing , history , archaeology
This article presents the first one-million corpus for the Belarusian NooJ module. The corpus is built from texts grouped into sections by subject category. From the broad range of possible subject categories, it focuses on fiction and on historical, medical, scientific, and sociological literature, among others. Given the large number of similar subject categories, the first one-million corpus can be considered a first subject collection of texts for the Belarusian NooJ module. The corpus is expected to be suitable for research on word polysemy processing in various texts, the processing of polysemous punctuation marks, and the search for new lexical items. It is applicable in many fields of linguistic research.
