
An Investigation of the Effect of Task Type on the Discourse Produced by Students at Various Score Levels in the TOEFL iBT® Writing Test
Author(s) - Knoch Ute, Macqueen Susy, O'Hagan Sally
Publication year - 2014
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/ets2.12038
Subject(s) - test of english as a foreign language , psychology , fluency , writing assessment , mathematics education , metadiscourse , multivariate analysis of variance , linguistics , rating scale , language assessment
This study, which forms part of the TOEFL iBT® test validity argument for the writing section, has two main aims: to verify whether the discourse produced in response to the independent and integrated writing tasks differs and to identify features of written discourse that are typical of different scoring levels. The integrated writing task was added to the TOEFL iBT test to “improve the measurement of test‐takers' writing abilities, create positive washback on teaching and learning as well as require test‐takers to write in ways that are more authentic to academic study” (Cumming et al., 2006, p. 1). However, no research since the study by Cumming et al. (2006) on the prototype tasks has investigated whether the discourse produced in response to this new integrated reading/listening‐to‐write task is in fact different from that produced in response to the independent task. Finding such evidence in the discourse is important, as it adds to the validity argument of the TOEFL iBT writing test and is useful for verifying the rating scale descriptors used in operational rating. This study applied discourse‐analytic measures to the writing of 480 test takers who each responded to the two writing tasks. The discourse analysis focused on measures of accuracy, fluency, complexity, coherence, cohesion, content, orientation to source evidence, and metadiscourse. A multivariate analysis of variance (MANOVA) using a two‐by‐five (task type by proficiency level) factorial design with random permutations showed that the discourse produced in response to the two task types differed significantly on most of the variables under investigation. The discourse produced at different score levels also generally differed significantly. The findings are discussed in terms of the TOEFL iBT test validity argument, along with implications for rating scale validation and automated scoring.
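For readers who want to see what a two-by-five (task type by score level) factorial MANOVA of this kind looks like in practice, the sketch below sets one up on simulated data. It is not the authors' analysis code: the column names (task_type, score_level), the three example discourse measures, and the simulated values are placeholders, and the statsmodels call shown is only one of several ways such a model could be fitted.

```python
# Minimal sketch (not the study's code) of a 2 x 5 factorial MANOVA
# over hypothetical discourse measures, using pandas + statsmodels.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(0)
n = 960  # illustrative only, e.g., 480 test takers x 2 tasks

# Hypothetical data: each row is one essay with a few example discourse measures.
df = pd.DataFrame({
    "task_type": rng.choice(["independent", "integrated"], size=n),
    "score_level": rng.choice(["1", "2", "3", "4", "5"], size=n),
    "accuracy": rng.normal(0.8, 0.1, size=n),    # e.g., proportion of error-free clauses
    "fluency": rng.normal(300.0, 50.0, size=n),  # e.g., total words written
    "complexity": rng.normal(1.5, 0.3, size=n),  # e.g., clauses per T-unit
})

# Two-way MANOVA: the discourse measures are the dependent variables;
# task type, score level, and their interaction are the factors.
mv = MANOVA.from_formula(
    "accuracy + fluency + complexity ~ task_type * score_level", data=df
)
print(mv.mv_test())  # multivariate tests (Wilks' lambda, Pillai's trace, ...) per factor
```

In the study itself, the dependent-variable set was broader (accuracy, fluency, complexity, coherence, cohesion, content, orientation to source evidence, and metadiscourse); the three measures above stand in only to keep the sketch short.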