Open Access
The FGLOCTweet Corpus: An English tweet-based corpus for fine-grained location-detection tasks
Author(s) - Nicolás José Fernández-Martínez
Publication year - 2022
Publication title - Research in Corpus Linguistics
Language(s) - English
Resource type - Journals
ISSN - 2243-4712
DOI - 10.32714/ricl.10.01.06
Subject(s) - computer science, annotation, social media, locative case, natural language processing, task (project management), bridge (graph theory), corpus linguistics, representation (politics), artificial intelligence, information retrieval, world wide web, linguistics, medicine, philosophy, management, politics, political science, law, economics
Location detection in social-media microtexts is an important natural language processing task for emergency-based contexts where locative references are identified in text data. Spatial information obtained from texts is essential to understand where an incident happened, where people are in need of help and/or which areas have been affected. This information contributes to raising emergency situation awareness, which is then passed on to emergency responders and competent authorities to act as quickly as possible. Annotated text data are necessary for building and evaluating location-detection systems. The problem is that available corpora of tweets for location-detection tasks are either lacking or, at best, annotated with coarse-grained location types (e.g. cities, towns, countries, some buildings, etc.). To bridge this gap, we present our semi-automatically annotated corpus, the Fine-Grained LOCation Tweet Corpus (FGLOCTweet Corpus), an English tweet-based corpus for fine-grained location-detection tasks, including fine-grained locative references (i.e. geopolitical entities, natural landforms, points of interest and traffic ways) together with their surrounding locative markers (i.e. direction, distance, movement or time). It includes annotated tweet data for training and evaluation purposes, which can be used to advance research in location detection, as well as in the study of the linguistic representation of place or of the microtext genre of social media.
