Exploiting named entities for bilingual news clustering


Author(s) - Montalvo Soto, Martínez Raquel, Fresno Víctor, Delgado Agustín

Publication year - 2015

Publication title - Journal of the Association for Information Science and Technology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.903

H-Index - 145

eISSN - 2330-1643

pISSN - 2330-1635

DOI - 10.1002/asi.23175

Subject(s) - computer science, cluster analysis, heuristic, information retrieval, document clustering, artificial intelligence, natural language processing, data mining

In this article, we present a new algorithm for clustering a bilingual collection of comparable news items in groups of specific topics. Our hypothesis is that named entities (NEs) are more informative than other features in the news when clustering fine-grained topics. The algorithm does not need as input any information related to the number of clusters, and carries out the clustering only based on information regarding the shared named entities of the news items. This proposal is evaluated using different data sets and outperforms other state-of-the-art algorithms, thereby proving the plausibility of the approach. In addition, because the applicability of our approach depends on the possibility of identifying equivalent named entities among the news, we propose a heuristic system to identify equivalent named entities in the same and different languages, thereby obtaining good performance.
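The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of grouping news items by shared named entities without fixing the number of clusters in advance. It is not the authors' method. It assumes each item has already been reduced to a set of normalized entity strings, links two items when they share at least `min_shared` entities (a hypothetical threshold), and returns the connected components of that link graph. The function name, the threshold, and the sample data are all illustrative assumptions.

```python
from itertools import combinations


def cluster_by_shared_entities(docs, min_shared=2):
    """Group documents whose named-entity sets overlap.

    docs: dict mapping a document id to a set of normalized entity strings.
    min_shared: hypothetical linking threshold, not taken from the article.
    """
    # Union-find structure: every document starts in its own cluster.
    parent = {doc_id: doc_id for doc_id in docs}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of any two documents that share enough entities.
    for a, b in combinations(docs, 2):
        if len(docs[a] & docs[b]) >= min_shared:
            parent[find(a)] = find(b)

    # Collect documents by cluster root; the cluster count is an output.
    clusters = {}
    for doc_id in docs:
        clusters.setdefault(find(doc_id), set()).add(doc_id)
    return list(clusters.values())


# Invented sample data: two English and two Spanish items on two topics.
news = {
    "en1": {"barack obama", "berlin", "angela merkel"},
    "es1": {"barack obama", "angela merkel", "berlín"},
    "en2": {"rafael nadal", "roland garros"},
    "es2": {"rafael nadal", "roland garros", "parís"},
}
clusters = cluster_by_shared_entities(news)
```

In this toy data, "berlin" and "berlín" are treated as different strings, so they do not count as shared. That gap is what the equivalent-entity identification step mentioned in the abstract would address; the sketch does not attempt it.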
