Open Access
Phraseografie und Korpusanalyse
Author(s) - Sören Stumpf
Publication year - 2019
Publication title - linguistik online
Language(s) - English
Resource type - Journals
ISSN - 1615-3014
DOI - 10.13092/lo.96.5523
Subject(s) - lemmatisation, computer science, natural language processing, lexicographical order, phrase, german, artificial intelligence, point (geometry), linguistics, mathematics, philosophy, geometry, combinatorics
This article gives an overview of the weak points in the lexicographical coverage of phrasemes. The main problem with phraseography to date is that dictionary entries are not based on comprehensive corpus analyses of actual language use. I therefore make a case for a “corpus-based phraseography” (Steyer 2010) and, using selected examples, demonstrate how a pragmatic approach focused on actual language use can help to improve the lemmatization of formulaic expressions. The examples also show what consequences and changes a corpus-analytical perspective can entail compared with the traditional phraseographical approach. For this purpose, I use the German Reference Corpus (Deutsches Referenzkorpus) and the analysis system COSMAS II. Central to my analysis are phenomena that have so far received scarcely any attention: the differentiation of modifications and phrase schemata, the valence spectrum of phrasemes, and formulaic expressions with unique components.