On supervised mining of dynamic content‐based networks
Author(s) - Aggarwal Charu C., Li Nan
Publication year - 2012
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10140
Subject(s) - computer science, cluster analysis, data mining, process (computing), node (physics), linkage (software), content (measure theory), information retrieval, machine learning, artificial intelligence, mathematical analysis, mathematics, biochemistry, chemistry, structural engineering, engineering, gene, operating system
In recent years, a large amount of information has become available online in the form of web documents, social networks, or blogs. Such networks are large, heterogeneous, and often contain a huge number of links. This linkage structure encodes rich structural information about the topical behavior of the network. Such networks are often dynamic and evolve rapidly over time. Much of the work in the literature has focused on classification using either purely text behavior or purely linkage behavior. Furthermore, the work in the literature is mostly designed for static networks. However, a given network may be quite diverse, and the use of either content or structure could be more or less effective in different parts of the network. In this paper, we examine the problem of node classification in dynamic information networks with both text content and links. Our techniques use a random walk approach in conjunction with the content of the network to facilitate an effective classification process. Our approach is dynamic, and can be applied to networks that are updated incrementally. Our results suggest that an approach based on both content and links is extremely robust and effective. We also present methods to perform supervised keyword‐based clustering of nodes using this approach. We present experimental results illustrating the effectiveness and efficiency of our classification approach. We also show that the approach is able to find effective and coherent clusters.

© 2012 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 5: 16–34, 2012
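
The abstract describes node classification by a random walk that uses both links and text, but it does not say how the two are combined. The following is a minimal sketch under stated assumptions, not the authors' algorithm: each distinct word becomes an extra "word node" attached to every document containing it, and a random walk with restart (computed by power iteration) from an unlabeled node votes for the labels of the labeled nodes it reaches. The function names, the restart probability, and the toy data are all hypothetical.

```python
from collections import defaultdict


def build_graph(links, texts):
    """Undirected adjacency over document nodes and word nodes.

    Assumption for illustration: content is folded into the graph by adding
    one node per distinct word, keyed as ("w", token) so it cannot collide
    with a document id.
    """
    adj = defaultdict(set)
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    for doc, text in texts.items():
        for token in set(text.lower().split()):
            word = ("w", token)
            adj[doc].add(word)
            adj[word].add(doc)
    return adj


def walk_scores(adj, start, restart=0.2, iters=50):
    """Visit probabilities of a random walk with restart from `start`."""
    p = {start: 1.0}
    for _ in range(iters):
        nxt = defaultdict(float)
        nxt[start] += restart  # total mass is 1, so this is the restart mass
        for node, mass in p.items():
            nbrs = adj[node]
            if not nbrs:
                # Isolated node: send its remaining mass back to the start.
                nxt[start] += (1 - restart) * mass
                continue
            share = (1 - restart) * mass / len(nbrs)
            for n in nbrs:
                nxt[n] += share
        p = nxt
    return p


def classify(adj, labels, node, restart=0.2, iters=50):
    """Predict the label with the most walk mass on labeled nodes."""
    votes = defaultdict(float)
    for n, mass in walk_scores(adj, node, restart, iters).items():
        if n != node and n in labels:
            votes[labels[n]] += mass
    return max(votes, key=votes.get) if votes else None


# Toy data (hypothetical): "x" has one link, "y" has text only.
links = [("a", "b"), ("c", "d"), ("x", "a")]
texts = {
    "a": "graph mining clustering",
    "b": "graph random walk",
    "c": "protein gene expression",
    "d": "gene sequence",
    "x": "random walk graph",
    "y": "gene protein",
}
labels = {"a": "data-mining", "b": "data-mining", "c": "biology", "d": "biology"}

adj = build_graph(links, texts)
print(classify(adj, labels, "x"))  # data-mining
print(classify(adj, labels, "y"))  # biology
```

Node "y" has no links at all, so its label comes entirely from shared words. This illustrates the abstract's point that content and structure can each matter more in different parts of the network. Because the graph is a plain adjacency map, a newly arriving node or link only adds entries to `adj`, which is one simple way to accommodate incremental updates in this sketch.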
