Open Access
Authorship verification of opinion pieces in Estonian
Author(s) - Timo Petmanson
Publication year - 2014
Publication title - Eesti Rakenduslingvistika Ühingu aastaraamat
Language(s) - English
Resource type - Journals
eISSN - 2228-0677
pISSN - 1736-2563
DOI - 10.5128/erya10.16
Subject(s) - Estonian, authorship attribution, computer science, natural language processing, artificial intelligence, set (abstract data type), recall, attribution, information retrieval, style (visual arts), stylometry, writing style, linguistics, psychology, social psychology, history, programming language, philosophy, archaeology
Authorship verification is an important subproblem in authorship attribution and plagiarism detection. We present a novel approach for extracting stylistic features unique to individual authors, using the correlations of important textual features to learn an author's style. The goal of the proposed method is to answer the following question: given a set of documents known to be written by the same person and an unknown document, was the unknown document also written by that person? We present the first study of this problem conducted on opinion pieces written in Estonian. Our method achieves 74% precision, which is comparable to current state-of-the-art systems tested on other languages, although recall still needs improvement.
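To make the verification question and the reported metrics concrete, the following is a minimal sketch in Python. It is not the method of the paper: the abstract does not specify the features or the decision rule, so this sketch assumes character trigram frequencies as the style features and a cosine-similarity threshold as the decision rule, in place of the feature correlations the paper uses. The names `char_ngram_profile`, `verify`, and `precision_recall`, and the threshold value, are all hypothetical.

```python
from collections import Counter
import math


def char_ngram_profile(text, n=3):
    """Relative frequencies of character n-grams (an assumed, generic style feature)."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values()) or 1
    return {gram: count / total for gram, count in counts.items()}


def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(value * q.get(key, 0.0) for key, value in p.items())
    norm_p = math.sqrt(sum(value * value for value in p.values()))
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def verify(known_docs, unknown_doc, threshold=0.5):
    """Answer the verification question: accept the unknown document if its
    profile is similar enough to the combined profile of the known documents.
    The threshold is an arbitrary assumption, not a value from the paper."""
    known_profile = char_ngram_profile(" ".join(known_docs))
    unknown_profile = char_ngram_profile(unknown_doc)
    return cosine(known_profile, unknown_profile) >= threshold


def precision_recall(predictions, truths):
    """Precision and recall of yes/no verification decisions."""
    pairs = list(zip(predictions, truths))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    fn = sum(1 for p, t in pairs if not p and t)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Under this framing, precision is the share of accepted documents that truly belong to the author, and recall is the share of the author's documents that were accepted. A strict threshold tends to raise precision at the cost of recall.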