Open Access
Term Spotting: A Quick-and-dirty Method for Extracting Typological Features of Language from Grammatical Descriptions
Author(s) - Harald Hammarström, One-Soon Her, Marc Allassonnière-Tang
Publication year - 2021
Publication title - Linköping Electronic Conference Proceedings
Language(s) - English
Resource type - Conference proceedings
eISSN - 1650-3740
pISSN - 1650-3686
DOI - 10.3384/ecp184172
Subject(s) - computer science, artificial intelligence, spotting, natural language processing, term (time), simple (philosophy), probabilistic logic, annotation, semantics (computer science), extant taxon, programming language, philosophy, physics, epistemology, quantum mechanics, evolutionary biology, biology
Starting from a large collection of digitized raw-text descriptions of the world's languages, we address the problem of extracting from them information of interest to linguists. We describe a general technique for extracting properties of the described languages that are associated with a specific term. The technique is simple to implement, simple to explain, requires no training data or annotation, and requires no manual tuning of thresholds. The results are evaluated against a large gold-standard database on classifiers, with accuracy that matches or exceeds human inter-coder agreement on similar tasks. Although the accuracy is competitive, the method could still be improved by a more rigorous probabilistic background theory and by using existing NLP tools for morphological variants, collocations, and vector-space semantics.
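
To make the general idea of spotting a term in raw-text descriptions concrete, here is a minimal Python sketch. It is not the authors' method: the abstract does not spell out the decision procedure, so the function names (`count_term`, `term_rate`, `spot_term`), the toy descriptions, and especially the `min_count` cutoff are assumptions made purely for illustration. The paper states that its technique needs no manually tuned threshold, which this sketch does not reproduce.

```python
import re
from typing import Dict


def count_term(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of `term` in `text`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def term_rate(text: str, term: str) -> float:
    """Occurrences of `term` per token, so long and short documents are comparable."""
    tokens = re.findall(r"\w+", text)
    if not tokens:
        return 0.0
    return count_term(text, term) / len(tokens)


def spot_term(descriptions: Dict[str, str], term: str, min_count: int = 1) -> Dict[str, bool]:
    """Flag each description whose text mentions `term` at least `min_count` times.

    `min_count` is a hypothetical, hand-set cutoff used only in this sketch.
    """
    return {
        language: count_term(text, term) >= min_count
        for language, text in descriptions.items()
    }


# Toy, invented descriptions (not taken from any real grammar).
descriptions = {
    "Language A": "Numeral classifiers are obligatory. The classifier follows the numeral.",
    "Language B": "Nouns take no marking.",
}

print(spot_term(descriptions, "classifier"))
```

Note that the exact-match pattern counts "classifier" but not "classifiers", which illustrates why the abstract names handling of morphological variants as a possible improvement.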
