Pragmatic annotation of a domain-restricted English-Spanish comparable corpus
Author(s) - Rosa Rabadán, Noelia Ramón García, Hugo Sanjurjo-González
Publication year - 2021
Publication title - Bergen Language and Linguistics Studies
Language(s) - English
Resource type - Journals
ISSN - 1892-2449
DOI - 10.15845/bells.v11i1.3445
Subject(s) - annotation , computer science , praise , metadata , natural language processing , artificial intelligence , scripting language , domain (mathematical analysis) , generality , scheme (mathematics) , information retrieval , linguistics , world wide web , psychology , mathematics , programming language , mathematical analysis , philosophy , psychotherapist
This paper explores the multi-layer annotation of a written domain-restricted English-Spanish comparable corpus (CLANES – Controlled LANguage English Spanish), focusing on pragmatic annotation. The annotation scheme draws on part-of-speech tagging and a semantic annotation scheme, i.e. the UCREL Semantic Analysis System, with some added categories to fit the food-and-drink domain represented in CLANES. These are used to build significant (pragmatic) metapatterns. Seven different pragmatic functions have been identified in our corpus. Computer scripts translate this linguistic information into regular expressions to be used in unsupervised annotation. Partial results indicate that applying lexical restrictors boosts the success rate considerably. However, metadata is preferred because of increased replicability and generality. Replicability issues and limitations encountered during testing are also addressed.
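
The step from metapatterns to regular expressions can be illustrated with a minimal sketch. It is not the authors' scripts: the `word_POS_SEM` token format, the helper names `slot_to_regex`, `metapattern_to_regex` and `annotate`, the tag values, and the label `EXAMPLE_FUNCTION` are all assumptions made for illustration. Each slot of a metapattern constrains the part-of-speech tag, the semantic tag, or both, and an optional word list plays the role of a lexical restrictor.

```python
import re

ANY = r"[^\s_]+"  # any word or tag: no whitespace, no underscore


def slot_to_regex(slot):
    """Turn one metapattern slot into a regex over a 'word_POS_SEM' token."""
    words = slot.get("words")  # optional lexical restrictor (hypothetical key)
    if words:
        word = "(?:" + "|".join(re.escape(w) for w in sorted(words)) + ")"
    else:
        word = ANY
    pos = re.escape(slot["pos"]) if "pos" in slot else ANY
    sem = re.escape(slot["sem"]) if "sem" in slot else ANY
    return f"{word}_{pos}_{sem}"


def metapattern_to_regex(slots):
    """Compile a sequence of slots into one regex over whole tokens."""
    body = r"\s+".join(slot_to_regex(s) for s in slots)
    # Lookarounds keep the match aligned with token boundaries.
    return re.compile(r"(?<!\S)" + body + r"(?!\S)")


def annotate(tagged_text, slots, label):
    """Return (matched span, label) pairs for every hit of the metapattern."""
    pattern = metapattern_to_regex(slots)
    return [(m.group(0), label) for m in pattern.finditer(tagged_text)]


# Illustrative input: tags are written in the style of POS and semantic
# tagsets, but the values here are assumptions, not corpus data.
tagged = "a_AT1_Z5 delicious_JJ_A5.1+ wine_NN1_F2 from_II_Z5 Rioja_NP1_Z2"

metadata_only = [{"pos": "JJ", "sem": "A5.1+"}, {"pos": "NN1", "sem": "F2"}]
restricted = [{"pos": "JJ", "words": {"delicious", "tasty"}}, {"pos": "NN1", "sem": "F2"}]

print(annotate(tagged, metadata_only, "EXAMPLE_FUNCTION"))
print(annotate(tagged, restricted, "EXAMPLE_FUNCTION"))
```

In this sketch the metadata-only pattern relies purely on tags, whereas the restricted pattern also fixes the adjective to a closed word list. That mirrors the trade-off noted in the abstract: restrictors can raise precision, while tag-only patterns are easier to reuse on other data.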
