Quantitative Authorship Attribution: An Evaluation of Techniques
Author(s) - John Grieve
Publication year - 2007
Publication title - Literary and Linguistic Computing
Language(s) - English
Resource type - Journals
eISSN - 1477-4615
pISSN - 0268-1145
DOI - 10.1093/llc/fqm020
Subject(s) - authorship attribution, attribution, computer science, set (abstract data type), sample (material), natural language processing, information retrieval, scale (ratio), artificial intelligence, data science, psychology, social psychology, chemistry, physics, chromatography, quantum mechanics, programming language
The basic assumption of quantitative authorship attribution is that the author of a text can be selected from a set of possible authors by comparing the values of textual measurements in that text to their corresponding values in each possible author's writing sample. Over the past three centuries, many types of textual measurements have been proposed, but never before have the majority of these measurements been tested on the same dataset. A large-scale comparison of textual measurements is crucial if current techniques are to be used effectively and if new and more powerful techniques are to be developed. This article presents the results of a comparison of thirty-nine different types of textual measurements commonly used in attribution studies, in order to determine which are the best indicators of authorship. Based on the results of these tests, a more accurate approach to quantitative authorship attribution is proposed, which involves the analysis of many different textual measurements.
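The basic assumption stated in the abstract can be illustrated with a minimal sketch. It is not the article's method: the feature set (relative frequencies of a few function words), the Manhattan distance, and the names `FUNCTION_WORDS`, `profile`, `distance`, and `attribute` are all assumptions chosen only to show the idea of comparing a text's measurements with those of each candidate author's writing sample.

```python
from collections import Counter

# Hypothetical feature set: a handful of common English function words.
FUNCTION_WORDS = ("the", "of", "and", "to", "a", "in")


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def distance(p, q):
    """Manhattan distance between two measurement profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def attribute(text, samples):
    """Pick the candidate whose writing sample is closest to the text.

    `samples` maps each possible author to that author's writing sample.
    """
    target = profile(text)
    return min(samples, key=lambda author: distance(target, profile(samples[author])))
```

Calling `attribute(text, samples)` with a dictionary of candidate authors and their writing samples returns the name of the closest candidate. A realistic study would use far longer samples and, as the abstract suggests, many different types of textual measurements rather than one.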