ENCADEAr: ENCADEAmento automático de notícias (ENCADEAr: automatic chaining of news)
Author(s) - Carla Abreu, Jorge Teixeira, Eugénio Oliveira
Publication year - 2015
Publication title - Oslo Studies in Language
Language(s) - English
Resource type - Journals
ISSN - 1890-9639
DOI - 10.5617/osla.1457
Subject(s) - computer science, natural language processing, information extraction, artificial intelligence, information retrieval
This work defines and evaluates different techniques for automatically building temporal news sequences. The proposed approach consists of three steps: (i) near-duplicate document detection; (ii) keyword extraction; (iii) news sequence creation. It draws on Natural Language Processing, Information Extraction, Named Entity Recognition, and supervised learning algorithms. The proposed methodology achieved a precision of 93.1% for the creation of news chains.
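The three steps can be illustrated with a minimal sketch. The abstract does not name the specific techniques, so everything below is an assumption made for illustration: Jaccard similarity on word sets stands in for near-duplicate detection, token frequency stands in for keyword extraction, and a fixed keyword-overlap threshold replaces the supervised learning step. The function names, the stopword list, the article fields (`id`, `date`, `text`), and the thresholds are all hypothetical.

```python
import re
from collections import Counter
from datetime import date

# Tiny illustrative stopword list (assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for", "is", "at", "by", "with"}

def tokens(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def drop_near_duplicates(articles, threshold=0.8):
    """Step (i): keep an article only if it is not too similar to one already kept."""
    kept = []
    for art in articles:
        if all(jaccard(tokens(art["text"]), tokens(k["text"])) < threshold for k in kept):
            kept.append(art)
    return kept

def keywords(text, top_n=10):
    """Step (ii): the most frequent tokens, as a stand-in for keyword extraction."""
    return {w for w, _ in Counter(tokens(text)).most_common(top_n)}

def build_chains(articles, min_shared=2):
    """Step (iii): attach each article, in date order, to the first chain whose
    latest article shares at least `min_shared` keywords; else start a new chain.
    A fixed threshold is used here in place of a trained classifier."""
    ordered = sorted(articles, key=lambda a: a["date"])
    chains = []
    for art in drop_near_duplicates(ordered):
        kw = keywords(art["text"])
        for chain in chains:
            if len(kw & keywords(chain[-1]["text"])) >= min_shared:
                chain.append(art)
                break
        else:
            chains.append([art])
    return chains
```

Calling `build_chains` on a list of dictionaries with `id`, `date`, and `text` fields returns a list of chains, each a date-ordered list of articles. In a setup closer to the one the abstract describes, the overlap test would presumably be replaced by a classifier trained on labelled article pairs.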