
Koder - A multi-register corpus for investigating register variation in contemporary German
Author(s) - Andressa Costa
Publication year - 2019
Publication title - Research in Corpus Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2243-4712
DOI - 10.32714/ricl.07.04
Subject(s) - register (sociolinguistics) , german , variation (astronomy) , computer science , construct (python library) , natural language processing , corpus linguistics , artificial intelligence , linguistics , programming language , philosophy , physics , astrophysics
This paper presents the design decisions behind the Koder corpus, a multi-register corpus of contemporary German. The corpus is intended to serve as a basis for investigating the use of German across registers. To construct a representative corpus, the essential considerations are the type and number of registers to include, the number of texts in each register, and the minimal text length. The paper describes which aspects were central in deciding these issues, as well as the corpus composition and the necessary text processing.