Open Access
Stable Classification of Text Genres
Author(s) - Philipp Petrenz, Bonnie Webber
Publication year - 2011
Publication title - Computational Linguistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.314
H-Index - 98
eISSN - 1530-9312
pISSN - 0891-2017
DOI - 10.1162/coli_a_00052
Subject(s) - computer science, natural language processing, artificial intelligence, stability (learning theory), face (sociological concept), ideal (ethics), speech recognition, linguistics, machine learning, philosophy, epistemology
Every text has at least one topic and at least one genre. Evidence for a text's topic and genre comes, in part, from its lexical and syntactic features, which are used in both Automatic Topic Classification and Automatic Genre Classification (AGC). Because an ideal AGC system should be stable in the face of changes in topic distribution, we assess five previously published AGC methods with respect to both performance on the same topic-genre distribution on which they were trained and stability of that performance across changes in topic-genre distribution. Our experiments lead us to conclude that (1) stability in the face of changing topical distributions should be added to the evaluation criteria for new approaches to AGC, and (2) Part-of-Speech features should be considered individually when developing a high-performing, stable AGC system for a particular, possibly changing corpus.
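
The two evaluation criteria named in the abstract, performance on the training distribution and stability under a changed topic-genre distribution, can be illustrated with a minimal sketch. This is not the authors' procedure: the function names, the toy genre labels, and the choice of "accuracy drop" as the stability measure are all assumptions made here for illustration.

```python
from typing import Dict, Sequence, Tuple

Labels = Sequence[str]


def accuracy(gold: Labels, predicted: Labels) -> float:
    """Fraction of texts whose predicted genre matches the gold genre."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty test set")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def stability_report(
    in_distribution: Tuple[Labels, Labels],
    shifted: Tuple[Labels, Labels],
) -> Dict[str, float]:
    """Compare accuracy on a test set that shares the training topic-genre
    distribution with accuracy on a test set whose distribution differs.

    'drop' is a hypothetical stability measure: a smaller drop means the
    classifier depends less on topic cues.
    """
    acc_in = accuracy(*in_distribution)
    acc_shifted = accuracy(*shifted)
    return {
        "in_distribution": acc_in,
        "shifted": acc_shifted,
        "drop": acc_in - acc_shifted,
    }


# Toy (gold, predicted) genre labels, invented purely for illustration.
same_topics = (
    ["news", "editorial", "news", "editorial"],
    ["news", "editorial", "news", "editorial"],
)
new_topics = (
    ["news", "editorial", "news", "editorial"],
    ["news", "news", "editorial", "editorial"],
)

report = stability_report(same_topics, new_topics)
print(report)
```

In this toy case the classifier is perfect on the familiar distribution but loses half its accuracy once topics change, which is the kind of instability the abstract argues should be part of AGC evaluation.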
