
The proximity of ideas: An analysis of patent text using machine learning
Author(s) - Sijie Feng
Publication year - 2020
Publication title - PLOS ONE
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0234880
Subject(s) - similarity (geometry), measure (data warehouse), space (punctuation), cosine similarity, computer science, field (mathematics), key (lock), patent analysis, data science, artificial intelligence, similarity measure, information retrieval, data mining, mathematics, pattern recognition (psychology), pure mathematics, computer security, image (mathematics), operating system
This paper introduces a measure of the proximity in ideas using unsupervised machine learning. Knowledge transfers are considered a key driving force of innovation and regional economic growth. I explore knowledge relationships by deriving vector space representations of each patent’s abstract text using Document Vectors (Doc2Vec), and then using the cosine similarity between those vectors to measure the patents' proximity in ideas space. I illustrate the potential uses of this method with an application to geographic localization in knowledge spillovers. Among patents in the same technology field, normalized text similarity is 0.02-0.05 standard deviations (S.D.) higher for patents located within the same city than for patents from different cities. This effect is much smaller than the one found when knowledge transfers are measured using normalized patent citations: local patents receive about 0.23-0.30 S.D. more local citations than non-local control patents. These findings suggest that the effect of geography on knowledge transfers may be much smaller than the previous citation-based literature suggests.
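The similarity step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the document vectors have already been produced (for example by a Doc2Vec model) and substitutes small hypothetical toy vectors for them. It also assumes that "normalized" similarity means standardizing to zero mean and unit standard deviation, which the abstract does not spell out. The names `cosine_similarity`, `standardize`, and `patent_vectors` are illustrative only.

```python
import math
from statistics import mean, pstdev


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation.

    Assumption: this is one plausible reading of "normalized" similarity.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0.0:
        raise ValueError("cannot standardize values with zero variance")
    return [(x - mu) / sigma for x in values]


# Hypothetical toy vectors standing in for Doc2Vec embeddings of abstracts.
patent_vectors = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.8, 0.2, 0.1],
    "C": [0.0, 0.3, 0.9],
}

pairs = [("A", "B"), ("A", "C"), ("B", "C")]
similarities = [
    cosine_similarity(patent_vectors[p], patent_vectors[q]) for p, q in pairs
]
z_scores = standardize(similarities)

for (p, q), sim, z in zip(pairs, similarities, z_scores):
    print(f"{p}-{q}: cosine = {sim:.3f}, standardized = {z:+.2f}")
```

After standardization, a difference between two groups of patent pairs (such as same-city versus different-city pairs) can be read directly in standard-deviation units, which is how the abstract reports its effects.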