Un método de análisis de lenguaje tipo SMS para el castellano

José María Gómez Hidalgo; Andrés Alfonso Caurcel Díaz; Yovan Iñiguez del Rio

Open Access

Publication year - 2013

Publication title - Linguamática

Language(s) - English

DOI - 10.21814/lm.5.1.156

The use of specific language codes and of chat and SMS-like messages is a major trend in electronic communications. This makes Natural Language Processing quite hard, even at the simplest step of text message tokenization, due to the widespread use of non-alphanumeric symbols, frequent typos and non-standard word separators. In this work we present a new approach to text message tokenization, specific to the Spanish language as used in social networks and in electronic communications. Our system has been integrated into a more general application for age detection in social networks, developed in the research and development project WENDY. It has been quantitatively evaluated both directly and indirectly, through its impact on the general age-detection application, showing very promising results.
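To make the tokenization difficulty concrete, the following is a minimal sketch and not the system described in the abstract. It contrasts plain whitespace splitting with a small regular-expression tokenizer on an SMS-like Spanish message. The function names, the token pattern and the sample message are all assumptions chosen for illustration.

```python
import re

# Hypothetical token pattern, for illustration only (not the paper's method):
# either a simple emoticon such as :) or :-D, or a run of letters and digits.
# [^\W_]+ matches Unicode letters and digits, so "mañana" stays in one piece.
TOKEN_RE = re.compile(r"[:;=][-']?[)(DPp]|[^\W_]+")


def whitespace_tokenize(text):
    """Baseline: split on whitespace only."""
    return text.split()


def sketch_tokenize(text):
    """Treat punctuation as a word separator, but keep simple emoticons."""
    return TOKEN_RE.findall(text)


message = "hola.q tal?? nos vemos x la tarde :)"
print(whitespace_tokenize(message))
print(sketch_tokenize(message))
```

In this sample, whitespace splitting leaves "hola.q" and "tal??" as single tokens, while the sketch separates "hola" and "q" and drops the repeated question marks. It does nothing about typos or abbreviations such as "q" and "x", which a real system would need to handle.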
