Open Access

Bridging the Native Language and Language Variety Identification Tasks

Author(s) - Marc Franco-Salvador, Greg Kondrak, Paolo Rosso

Publication year - 2017

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2017.08.068

Subject(s) - computer science, variety (cybernetics), bridging (networking), language identification, natural language processing, identification (biology), artificial intelligence, string (physics), word (group theory), task (project management), language model, linguistics, natural language, computer network, philosophy, botany, physics, management, quantum mechanics, economics, biology

The objective of Native Language Identification is to determine an author's native language from a text that he or she wrote in another language. By contrast, Language Variety Identification aims to classify texts representing different varieties of a single language. We postulate that both tasks may be reduced to a single objective, which is to identify the language variety of the text. We design a general approach that combines string kernels and word embeddings, which capture different characteristics of texts. Our experiments show that the approach achieves excellent results on both tasks, without any task-specific adaptations.
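
The abstract does not say which string kernel, which embeddings, or which combination rule the authors use, so the following is only an illustrative sketch of the general idea. It assumes a character n-gram spectrum kernel for surface-level similarity, averaged word vectors for semantic similarity, and an unweighted mean of the two scores. The function names, the toy embedding table, and the 0.5/0.5 weighting are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n):
    """Count all overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def spectrum_kernel(s, t, n=3):
    """Spectrum string kernel: dot product of character n-gram counts."""
    cs, ct = char_ngrams(s, n), char_ngrams(t, n)
    return sum(count * ct[gram] for gram, count in cs.items())


def normalized_kernel(s, t, n=3):
    """Cosine-normalize the kernel so that k(s, s) is 1 when s has n-grams."""
    denom = sqrt(spectrum_kernel(s, s, n) * spectrum_kernel(t, t, n))
    return spectrum_kernel(s, t, n) / denom if denom else 0.0


def mean_embedding(text, embeddings, dim):
    """Average the vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm if norm else 0.0


def combined_similarity(s, t, embeddings, dim, n=3):
    """Hypothetical combination: unweighted mean of the two similarities."""
    emb_sim = cosine(mean_embedding(s, embeddings, dim),
                     mean_embedding(t, embeddings, dim))
    return 0.5 * normalized_kernel(s, t, n) + 0.5 * emb_sim


# Toy 2-dimensional embeddings, invented purely for illustration.
TOY_EMBEDDINGS = {
    "colour": [1.0, 0.0],
    "color": [0.9, 0.1],
    "lorry": [0.0, 1.0],
    "truck": [0.1, 0.9],
}
```

In this sketch the character n-grams pick up spelling and morphology cues (for example "colour" versus "color"), while the averaged word vectors pick up lexical choice (for example "lorry" versus "truck"), which is one way to read the abstract's remark that the two components capture different characteristics of texts. A pairwise similarity of this kind could then be fed to any kernel-based classifier.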
