A novel method for topic linkages between scientific publications and patents
Author(s) - Xu Shuo, Zhai Dongsheng, Wang Feifei, An Xin, Pang Hongshen, Sun Yirong
Publication year - 2019
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.24175
Subject(s) - computer science, similarity (geometry), topic model, divergence (linguistics), cluster analysis, inference, focus (optics), citation, data science, information retrieval, word (group theory), data mining, artificial intelligence, world wide web, mathematics, linguistics, philosophy, physics, geometry, optics, image (mathematics)
Building topic linkages between scientific publications and patents is increasingly important for understanding the relationships between science and technology. Previous studies of these linkages mainly analyze the nonpatent references on the front page of patents, or the resulting citation‐link networks, with unsatisfactory performance. Meanwhile, the abundance of entities mentioned in scholarly articles and patents further complicates topic linkage. To address this, a novel statistical entity‐topic model (named the CCorrLDA2 model), with a collapsed Gibbs sampling inference algorithm, is proposed to discover the hidden topics in academic articles and in patents separately. To reduce the negative impact on topic similarity calculation, word tokens and entity mentions are grouped by the Brown clustering method. Topic similarity is then calculated on the basis of symmetrized Kullback–Leibler (KL) divergence, and the topic linkage construction problem is transformed into the well‐known optimal transportation problem. Extensive experimental results indicate that our approach is feasible for building topic linkages and performs better than its counterparts.
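The last two steps of the abstract (symmetrized KL divergence as a topic distance, then an optimal transportation problem over those distances) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy topic distributions, the uniform topic weights, the choice of the sum of the two KL directions as the symmetrized form, and the helper names `symmetrized_kl` and `transport_plan` are all assumptions made for illustration. The transportation problem is solved here as a plain linear program with SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog


def symmetrized_kl(p, q, eps=1e-12):
    """Symmetrized KL divergence, taken here as KL(p||q) + KL(q||p) (an assumption)."""
    p = np.asarray(p, dtype=float) + eps  # smoothing avoids log(0)
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def transport_plan(cost, supply, demand):
    """Solve min sum_ij cost[i, j] * x[i, j] with fixed row and column sums."""
    m, n = cost.shape
    a_eq = np.zeros((m + n, m * n))
    for i in range(m):
        a_eq[i, i * n:(i + 1) * n] = 1.0  # row i of x sums to supply[i]
    for j in range(n):
        a_eq[m + j, j::n] = 1.0  # column j of x sums to demand[j]
    b_eq = np.concatenate([supply, demand])
    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x.reshape(m, n)


# Hypothetical topics: each row is a distribution over three word/entity clusters.
paper_topics = np.array([[0.7, 0.2, 0.1],
                         [0.1, 0.2, 0.7]])
patent_topics = np.array([[0.6, 0.3, 0.1],
                          [0.1, 0.3, 0.6]])

# Pairwise topic distances between publication topics and patent topics.
cost = np.array([[symmetrized_kl(p, q) for q in patent_topics]
                 for p in paper_topics])

# Assumed uniform topic weights on both sides.
supply = np.full(len(paper_topics), 1.0 / len(paper_topics))
demand = np.full(len(patent_topics), 1.0 / len(patent_topics))

plan = transport_plan(cost, supply, demand)
print(np.round(plan, 3))
```

In this toy setting, a large entry `plan[i, j]` suggests a link between publication topic `i` and patent topic `j`. How the paper weights topics and thresholds the plan is not stated in the abstract, so those choices are left open here.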