Open Access

Adding linguistic information to parsed corpora

Author(s) - Susan Pintzuk

Publication year - 2019

Publication title - Linguistic Issues in Language Technology

Language(s) - English

Resource type - Journals

eISSN - 1945-3590

pISSN - 1945-3604

DOI - 10.33011/lilt.v18i.1435

Subject(s) - computer science, parsing, annotation, natural language processing, artificial intelligence, style (visual arts), information retrieval, archaeology, history

No matter how comprehensively corpus builders design their annotation schemes, users frequently find that information is missing that they need for their research. In this methodological paper I describe and illustrate five methods of adding linguistic information to corpora that have been morphosyntactically annotated (=parsed) in the style of Penn treebanks. Some of these methods involve manual operations; some are executed by CorpusSearch functions; some require a combination of manual and automated procedures. Which method is used depends almost entirely on the type of information to be added and the goals of the user. Of course, the main goal, regardless of method, is to record within the corpus additional information that can be used for analysis and also retained through further searches and data processing.
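As a rough illustration of what recording additional information within a parsed corpus can look like, the sketch below appends an extra tag to matching node labels in a Penn-style bracketed tree, so that the tag stays in the tree through later processing. It is not one of the five methods described in the paper and it does not use CorpusSearch. The example sentence, the `-XPRO` tag, and the helper names (`parse`, `annotate`, `serialize`, `is_pronoun_subject`) are all assumptions made up for this illustration.

```python
import re


def parse(text):
    """Parse one Penn-style bracketed tree into nested [label, child, ...] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        assert tokens[pos] == "("
        pos += 1
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a word (leaf)
                pos += 1
        pos += 1  # consume the closing ")"
        return [label] + children

    return node()


def annotate(tree, matches, suffix):
    """Append suffix to the label of every node for which matches(node) is true."""
    if matches(tree):
        tree[0] += suffix
    for child in tree[1:]:
        if isinstance(child, list):
            annotate(child, matches, suffix)
    return tree


def serialize(tree):
    """Turn the nested-list tree back into a bracketed string."""
    parts = [serialize(c) if isinstance(c, list) else c for c in tree[1:]]
    return "(" + " ".join([tree[0]] + parts) + ")"


def is_pronoun_subject(node):
    # Hypothetical criterion: a subject NP whose only child is a pronoun.
    return (
        node[0] == "NP-SBJ"
        and len(node) == 2
        and isinstance(node[1], list)
        and node[1][0] == "PRO"
    )


# Illustrative tree; the labels are examples, not taken from the paper.
source = "(IP-MAT (NP-SBJ (PRO he)) (VBD saw) (NP-OB1 (D the) (N king)))"

# "-XPRO" is a made-up tag marking pronominal subjects.
annotated = serialize(annotate(parse(source), is_pronoun_subject, "-XPRO"))
print(annotated)
# (IP-MAT (NP-SBJ-XPRO (PRO he)) (VBD saw) (NP-OB1 (D the) (N king)))
```

Because the new tag is written into the node label itself, any later search over the tree sees it. That is the general property the abstract asks of every method: the added information is retained through further searches and data processing.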
