Automatic Detection of Words Associations in Texts Based on Joint Distribution of Words Occurrences

Author(s) - Santoni Daniele, Pourabbas Elaheh

Publication year - 2016

Publication title - Computational Intelligence

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.353

H-Index - 52

eISSN - 1467-8640

pISSN - 0824-7935

DOI - 10.1111/coin.12065

Subject(s) - computer science, consistency (knowledge bases), word (group theory), artificial intelligence, representation (politics), natural language processing, pattern recognition (psychology), joint probability distribution, association (psychology), function (biology), correlation, joint (building), data mining, mathematics, statistics, architectural engineering, philosophy, geometry, epistemology, evolutionary biology, politics, political science, law, biology, engineering

In this article, we propose a novel approach for measuring word association based on the joint distribution of word occurrences in a text. Our approach relies on computing a sum of distances between neighboring occurrences of a given word pair and comparing it with a vector of randomly generated occurrences. The underlying assumption is that if the distribution of co‐occurrences is close to random, or if the words tend to appear together less frequently than by chance, the words are not semantically related. We devise a distance function S that evaluates the word association rate. Using S, we build a concept tree, which provides a visual and comprehensive representation of keyword associations in a text. To illustrate the effectiveness of our algorithm, we apply it to three different texts, showing the consistency and significance of the obtained results with respect to the semantics of the documents. Finally, we compare the results of our algorithm with those obtained by human experts and by the co‐occurrence correlation method. We show that our method is consistent with the experts' evaluation and outperforms the co‐occurrence correlation method.
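
The abstract does not give the definition of S, so the following Python sketch is only an illustration of the general idea (summing distances between neighboring occurrences of a word pair and comparing the sum with randomly placed occurrences). The function names, the nearest-neighbor distance, the Monte Carlo baseline, and the `1 - observed / baseline` score are all assumptions made for this example, not the authors' formulation.

```python
import bisect
import random


def positions(tokens, word):
    """Return the token indices at which `word` occurs."""
    return [i for i, t in enumerate(tokens) if t == word]


def nearest_distance_sum(pos_a, pos_b):
    """Sum, over occurrences in pos_a, of the distance to the nearest
    occurrence in pos_b. pos_b must be sorted and non-empty."""
    total = 0
    for p in pos_a:
        i = bisect.bisect_left(pos_b, p)
        candidates = []
        if i < len(pos_b):
            candidates.append(pos_b[i] - p)      # nearest occurrence at or after p
        if i > 0:
            candidates.append(p - pos_b[i - 1])  # nearest occurrence before p
        total += min(candidates)
    return total


def association_score(tokens, a, b, trials=200, seed=0):
    """Illustrative score (an assumption, not the paper's S):
    1 - observed / mean random distance sum.
    Positive: the pair sits closer together than random placement would give.
    Negative: the pair sits farther apart than random placement would give."""
    if a == b:
        raise ValueError("expected two different words")
    pos_a, pos_b = positions(tokens, a), positions(tokens, b)
    if not pos_a or not pos_b:
        return 0.0
    observed = nearest_distance_sum(pos_a, pos_b)

    rng = random.Random(seed)
    n, ka = len(tokens), len(pos_a)
    baseline = 0.0
    for _ in range(trials):
        # Draw distinct random positions for both words, keeping their counts.
        drawn = rng.sample(range(n), ka + len(pos_b))
        rand_a, rand_b = drawn[:ka], sorted(drawn[ka:])
        baseline += nearest_distance_sum(rand_a, rand_b)
    baseline /= trials
    return 1.0 - observed / baseline


# Toy texts: "cat" and "dog" adjacent in one, far apart in the other.
close = (["cat", "dog"] + ["x"] * 20) * 5
apart = ["cat"] * 5 + ["x"] * 100 + ["dog"] * 5
score_close = association_score(close, "cat", "dog")
score_apart = association_score(apart, "cat", "dog")
```

In this sketch `score_close` comes out positive and `score_apart` negative, which mirrors the abstract's assumption that words appearing together no more often than chance are not treated as related. Fixing the random seed keeps the baseline reproducible between runs.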
