Open Access
ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory
Author(s) - Mark A. Finlayson
Publication year - 2015
Publication title - Digital Scholarship in the Humanities
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.4
H-Index - 15
eISSN - 2055-768X
pISSN - 2055-7671
DOI - 10.1093/llc/fqv067
Subject(s) - annotation, computer science, natural language processing, artificial intelligence, context (archaeology), referent, selection (genetic algorithm), linguistics, workbench, narrative, history, philosophy, archaeology, visualization
I describe the collection and deep annotation of the semantics of a corpus of Russian folktales. This corpus, which I call the ‘ProppLearner’ corpus, was assembled to provide data for an algorithm designed to learn Vladimir Propp’s morphology of Russian hero tales. The corpus is the most deeply annotated narrative corpus available at this time. The algorithm and learning results are described elsewhere; here, I provide detail on the layers of annotation and how they were chosen, novel layers of annotation required for successful learning, the selection of the texts for annotation, the annotation process itself, and the resulting inter-annotator agreement measures. In particular, the corpus comprised fifteen texts totaling 18,862 words. There were eighteen layers of annotation, five of which were developed specifically to support learning Propp’s morphology: referent attributes, context relationships, event valences, Propp’s ‘dramatis personae’, and Propp’s functions. All annotations were created by trained annotators with the Story Workbench annotation tool, following a double-annotation paradigm. I discuss lessons learned from this effort and what they mean for future digital humanities efforts when working with the semantics of natural language text.
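As an illustration of what an inter-annotator agreement measure under a double-annotation paradigm can look like, the following is a minimal sketch that computes Cohen's kappa for two annotators who labeled the same items. The abstract does not say which agreement measures were used, so the choice of kappa is an assumption here. The function name `cohen_kappa` and the sample labels are hypothetical and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Illustrative only: the abstract does not state which agreement
    measure was used for the corpus; kappa is assumed here.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over all labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six events, invented for this example.
annotator_1 = ["Villainy", "Departure", "Struggle", "Victory", "Return", "Departure"]
annotator_2 = ["Villainy", "Departure", "Struggle", "Struggle", "Return", "Departure"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance. That correction matters when a few labels dominate an annotation layer.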
