SemGraph: Extracting keyphrases following a novel semantic graph‐based approach
Author(s) -
Martinez-Romo Juan,
Araujo Lourdes,
Duque Fernandez Andres
Publication year - 2016
Publication title -
Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23365
Subject(s) - computer science, graph, novelty, information retrieval, artificial intelligence, natural language processing, theoretical computer science, theology, philosophy
Keyphrases represent the main topics of a text. In this article, we introduce SemGraph, an unsupervised algorithm for extracting keyphrases from a collection of texts based on a semantic relationship graph. The main novelty of this algorithm is its ability to identify semantic relationships between words whose presence is statistically significant. Our method constructs a co‐occurrence graph in which words appearing in the same document are linked, provided their presence in the collection is statistically significant with respect to a null model. Furthermore, the resulting graph is enriched with information from WordNet. We used the most recent standardized benchmark to evaluate the system's ability to detect the keyphrases that appear in the text. The result is a method that achieves improvements of 5.3% and 7.28% in F‐measure over the two labeled sets of keyphrases used in the SemEval‐2010 evaluation.
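The graph construction step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the null model, so this example assumes a hypergeometric one (the documents containing one word are drawn at random from the collection) and an illustrative threshold `alpha`. The function names `cooccurrence_p_value` and `build_graph` are hypothetical, and the WordNet enrichment and keyphrase ranking steps are omitted.

```python
from itertools import combinations
from math import comb


def cooccurrence_p_value(n_docs, n1, n2, k):
    """P(X >= k) under an ASSUMED hypergeometric null model: the n2 documents
    containing word 2 are drawn at random from n_docs documents, n1 of which
    contain word 1. X is the number of documents containing both words."""
    total = comb(n_docs, n2)
    upper = min(n1, n2)
    tail = sum(comb(n1, j) * comb(n_docs - n1, n2 - j) for j in range(k, upper + 1))
    return tail / total


def build_graph(docs, alpha=0.05):
    """Return {(w1, w2): p_value} for word pairs whose document-level
    co-occurrence is significant at level alpha (an illustrative threshold)."""
    doc_sets = [set(d) for d in docs]
    n_docs = len(doc_sets)
    doc_freq = {}   # number of documents containing each word
    pair_freq = {}  # number of documents containing both words of a pair
    for words in doc_sets:
        for w in words:
            doc_freq[w] = doc_freq.get(w, 0) + 1
        for pair in combinations(sorted(words), 2):
            pair_freq[pair] = pair_freq.get(pair, 0) + 1
    edges = {}
    for (w1, w2), k in pair_freq.items():
        p = cooccurrence_p_value(n_docs, doc_freq[w1], doc_freq[w2], k)
        if p < alpha:
            edges[(w1, w2)] = p
    return edges


# Toy collection: "the" occurs in every document, so its co-occurrences
# carry no information under the null model and produce no edges.
docs = [["the", "graph", "semantic"]] * 3 + [["the", "text"]] * 5
edges = build_graph(docs)
```

In this toy collection, a word present in every document co-occurs with everything by construction, so only the pair whose joint presence is unlikely under random placement is linked. A fuller version would add WordNet-derived edges to the resulting graph, as the abstract describes.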