Exploring Learner Language Through Corpora: Comparing and Interpreting Corpus Frequency Information
Author(s) - Gablasova Dana, Brezina Vaclav, McEnery Tony
Publication year - 2017
Publication title - Language Learning
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.882
H-Index - 103
eISSN - 1467-9922
pISSN - 0023-8333
DOI - 10.1111/lang.12226
Subject(s) - corpus linguistics, comparability, computer science, representativeness heuristic, natural language processing, linguistics, text corpus, artificial intelligence, set (abstract data type), british national corpus, psychology, social psychology, philosophy, mathematics, combinatorics, programming language
This article contributes to the debate about the appropriate use of corpus data in language learning research. It focuses on the frequencies of linguistic features in language use and their comparison across corpora. The majority of corpus‐based second language acquisition (SLA) studies employ a comparative design in which either one or more second language (L2) corpora are compared to a first language (L1) production corpus, or two or more L2 corpora are compared to each other. This article critically examines some of the central tenets of the comparative method related to interspeaker variation in L1 and L2 use, the representativeness and comparability of corpus data, the interpretation of differences found between corpora, and the appropriate use of statistics. Using and discussing a set of five L1 spoken English corpora and three L2 English corpora (two spoken and one written), we approach these areas empirically, exploring different sources of variation and the methodological options that corpus‐based SLA studies offer.