Open Access
Term Spotting: A Quick-and-dirty Method for Extracting Typological Features of Language from Grammatical Descriptions
Author(s) -
Harald Hammarström,
One-Soon Her,
Marc Allassonnière-Tang
Publication year - 2021
Publication title -
Linköping Electronic Conference Proceedings
Language(s) - English
Resource type - Conference proceedings
eISSN - 1650-3740
pISSN - 1650-3686
DOI - 10.3384/ecp184172
Subject(s) - computer science, artificial intelligence, spotting, natural language processing, term (time), simple (philosophy), probabilistic logic, annotation, semantics (computer science), extant taxon, programming language, philosophy, physics, epistemology, quantum mechanics, evolutionary biology, biology
Starting from a large collection of digitized raw-text descriptions of the world's languages, we address the problem of extracting from them information of interest to linguists. We describe a general technique for extracting properties of the described languages that are associated with a specific term. The technique is simple to implement, simple to explain, requires no training data or annotation, and requires no manual tuning of thresholds. The results are evaluated against a large gold-standard database on classifiers, with accuracy that matches or exceeds human inter-coder agreement on similar tasks. Although accuracy is competitive, the method may still be enhanced by a more rigorous probabilistic background theory and by the use of extant NLP tools for morphological variants, collocations and vector-space semantics.
