Term Spotting: A Quick-and-dirty Method for Extracting Typological Features of Language from Grammatical Descriptions

Open Access

Author(s) - Harald Hammarström, One-Soon Her, Marc Allassonnière-Tang

Publication year - 2021

Publication title - Linköping Electronic Conference Proceedings

Language(s) - English

Resource type - Conference proceedings

eISSN - 1650-3740

pISSN - 1650-3686

DOI - 10.3384/ecp184172

Subject(s) - computer science, artificial intelligence, spotting, natural language processing, term (time), simple (philosophy), probabilistic logic, annotation, semantics (computer science), extant taxon, programming language, philosophy, physics, epistemology, quantum mechanics, evolutionary biology, biology

Starting from a large collection of digitized raw-text descriptions of the world's languages, we address the problem of extracting information of interest to linguists from them. We describe a general technique for extracting properties of the described languages that are associated with a specific term. The technique is simple to implement, simple to explain, requires no training data or annotation, and requires no manual tuning of thresholds. The results are evaluated against a large gold-standard database on classifiers, with accuracy that matches or exceeds human inter-coder agreement on similar tasks. Although accuracy is competitive, the method could still be improved by a more rigorous probabilistic background theory and by using existing NLP tools for morphological variants, collocations and vector-space semantics.
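
The abstract does not spell out the procedure, so the following is only a minimal sketch of the raw ingredient that a term-based approach would plausibly start from: counting how often a given term occurs in a plain-text language description. The function name `count_term`, the sample texts, and the tokenization are illustrative assumptions rather than the authors' implementation, and the step that turns counts into a present/absent decision is deliberately left out because the abstract does not describe it.

```python
import re


def count_term(text: str, term: str) -> tuple[int, int]:
    """Return (occurrences of `term`, total word tokens) for one description.

    Assumption: case-insensitive, exact whole-word matching. Morphological
    variants such as plurals are NOT counted; the abstract names handling of
    such variants as a possible enhancement.
    """
    tokens = re.findall(r"\w+", text.lower())
    hits = sum(1 for token in tokens if token == term.lower())
    return hits, len(tokens)


# Hypothetical miniature "descriptions", invented purely for illustration.
descriptions = {
    "grammar_a": "Numeral classifiers are obligatory. Each classifier follows the numeral.",
    "grammar_b": "Verbs agree with the subject in person and number.",
}

for name, text in descriptions.items():
    hits, total = count_term(text, "classifier")
    print(f"{name}: {hits} occurrence(s) in {total} tokens")
```

Exact matching keeps the sketch short, but it also means that "classifiers" in the first sample is not counted, which illustrates why tools for morphological variants are mentioned as a natural extension.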
