Open Access
ENCADEAr: ENCADEAmento automático de notícias [automatic news chaining]
Author(s) - Carla de Abreu, Jorge Teixeira, Eugénio C. Oliveira
Publication year - 2015
Publication title - Oslo Studies in Language
Language(s) - English
Resource type - Journals
ISSN - 1890-9639
DOI - 10.5617/osla.1457
Subject(s) - computer science, natural language processing, information extraction, artificial intelligence, information retrieval
This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It draws on Natural Language Processing, Information Extraction, Named Entity Recognition and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% for news chain creation.
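The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the algorithms, so the sketch substitutes simple stand-ins (Jaccard similarity over word bigrams for near-duplicate detection, term frequency for keyword extraction, and keyword overlap in place of the supervised learning step). The `News` class, the function names, the stopword list and the thresholds are all hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date

# Hypothetical, deliberately tiny stopword list (assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "by", "with"}


@dataclass
class News:
    id: str
    published: date
    text: str


def tokens(text):
    """Lowercase alphanumeric tokens with stopwords removed."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def shingles(text, n=2):
    """Set of word n-grams used to compare two documents."""
    toks = tokens(text)
    return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def drop_near_duplicates(items, threshold=0.8):
    """Step (i): keep the earliest item of each near-duplicate group."""
    kept = []
    for item in sorted(items, key=lambda n: n.published):
        s = shingles(item.text)
        if all(jaccard(s, shingles(k.text)) < threshold for k in kept):
            kept.append(item)
    return kept


def keywords(text, k=5):
    """Step (ii): the k most frequent tokens (a stand-in for real keyword extraction)."""
    return {w for w, _ in Counter(tokens(text)).most_common(k)}


def build_chains(items, min_shared=2):
    """Step (iii): attach each item, in date order, to the first chain
    sharing at least min_shared keywords; otherwise start a new chain."""
    chains = []  # list of (accumulated keyword set, list of News)
    for item in sorted(items, key=lambda n: n.published):
        kws = keywords(item.text)
        for chain_kws, chain in chains:
            if len(kws & chain_kws) >= min_shared:
                chain.append(item)
                chain_kws |= kws  # in-place update of the chain's keyword set
                break
        else:
            chains.append((set(kws), [item]))
    return [chain for _, chain in chains]
```

A caller would run `build_chains(drop_near_duplicates(items))` on a list of `News` objects; each returned chain is ordered by publication date. A supervised classifier, as mentioned in the abstract, would replace the fixed `min_shared` rule with a learned decision on whether two items belong to the same chain.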