Comparison of selected methods for the retrieval of neologisms
Author(s) - Piotr Paryzek
Publication year - 2008
Publication title - Investigationes Linguisticae
Language(s) - English
Resource type - Journals
ISSN - 1426-188X
DOI - 10.14746/il.2008.16.14
Subject(s) - neologism, punctuation, lexis, natural language processing, computer science, artificial intelligence, linguistics, information retrieval, philosophy
The paper discusses and compares several semi-automatic methods for extracting neologisms from linguistic corpora. All the methods rely on discriminants: textual features, either lexical or punctuation-based, that precede (lexical discriminants) or enclose (punctuation discriminants) phrases in which neologisms occur more often than elsewhere in the text. Excerption and comparison were conducted on a 45-million-word corpus of articles from the scientific journal Nature. The putative neologisms were extracted using morphological analysis and the frequency of their occurrence in the Google search engine. The result is a list of 1000 neologisms and an assessment of the efficacy of each method.
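
The discriminant idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the discriminant lists, the function names, and the sample sentence are all assumptions made for illustration, since the abstract does not give the actual discriminants. The sketch only collects candidate phrases; the later filtering by morphological analysis and search-engine frequency is not shown.

```python
import re

# Hypothetical discriminants, chosen for illustration only; the abstract
# does not list the ones actually used in the paper.
LEXICAL_DISCRIMINANTS = ["so-called", "termed", "dubbed", "known as"]
PUNCTUATION_DISCRIMINANTS = [('"', '"'), ("(", ")")]


def after_lexical(text, max_words=3):
    """Return phrases of up to max_words words that directly follow a lexical discriminant."""
    alternatives = "|".join(re.escape(d) for d in LEXICAL_DISCRIMINANTS)
    word = r"[A-Za-z][A-Za-z-]*"
    pattern = re.compile(
        rf"\b(?:{alternatives})\s+({word}(?:\s+{word}){{0,{max_words - 1}}})",
        re.IGNORECASE,
    )
    return [m.group(1) for m in pattern.finditer(text)]


def confined_by_punctuation(text):
    """Return phrases enclosed by a pair of punctuation discriminants."""
    found = []
    for opener, closer in PUNCTUATION_DISCRIMINANTS:
        o, c = re.escape(opener), re.escape(closer)
        # Match the shortest non-empty span that contains neither delimiter.
        pattern = re.compile(rf"{o}([^{o}{c}]+){c}")
        found.extend(m.group(1) for m in pattern.finditer(text))
    return found


# Made-up sample sentence, not taken from the corpus described above.
sample = 'The device, dubbed nanoskiff, uses so-called "spinplasmonics" (optomagnonics) effects.'
lexical_candidates = after_lexical(sample)
punctuation_candidates = confined_by_punctuation(sample)
```

Both functions return candidates only; whether a candidate is a genuine neologism would still have to be decided by a separate step such as the morphological and frequency checks the abstract mentions.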