Distributed computing of distance‐based graph invariants for analysis and visualization of complex networks
Author(s) - Czech Wojciech, Mielczarek Wojciech, Dzwinel Witold
Publication year - 2016
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.4054
Subject(s) - computer science, visualization, theoretical computer science, distance matrix, scalability, cluster analysis, computation, adjacency matrix, shortest path problem, vertex (graph theory), clustering coefficient, graph, algorithm, data mining, artificial intelligence, database
Summary We present a new framework for the analysis and visualization of complex networks, based on structural information retrieved from their distance k‐graphs and B‐matrices. Constructing B‐matrices for graphs with more than 1 million edges requires massive Breadth‐First Search (BFS) computations, which we facilitate with new software prepared for distributed environments. Our framework exploits the data parallelism inherent in the all‐pairs shortest‐path problem and extends Cassovary, an open‐source in‐memory graph processing engine, to enable multinode computation of distance k‐graphs and related graph descriptors. We also introduce a new type of B‐matrix, constructed using the clustering coefficient vertex invariant. It can be generated with a computational effort comparable to that required for the previously known degree B‐matrix, while delivering additional information about graph structure. Our approach enables efficient generation of expressive, multidimensional descriptors useful in graph embedding and graph mining tasks. The experiments showed that the new framework is scalable and that, for the specific all‐pairs shortest‐path task, it provides better performance than existing generic graph processing frameworks. We also show how the developed tools helped in the analysis and visualization of real‐world graphs from the Stanford Large Network Dataset Collection. Copyright © 2016 John Wiley & Sons, Ltd.
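The following is a minimal single-machine sketch in Python, not the authors' Cassovary-based implementation, of how a degree B‐matrix can be obtained from per-vertex BFS. It assumes the commonly used definition in which entry (k, l) counts the vertices that have exactly l vertices at shortest-path distance k, that is, vertices of degree l in the distance k‐graph. For brevity it records only non-empty shells (l ≥ 1). The names `shell_sizes` and `degree_b_matrix` and the dictionary-based adjacency format are illustrative assumptions.

```python
from collections import Counter, deque


def shell_sizes(adj, source):
    """Run BFS from `source`; return {k: number of vertices at distance k} for k >= 1."""
    dist = {source: 0}
    queue = deque([source])
    sizes = Counter()
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sizes[dist[v]] += 1
                queue.append(v)
    return sizes


def degree_b_matrix(adj):
    """Return a sparse B-matrix as {(k, l): count of vertices with l vertices at distance k}."""
    b = Counter()
    # Each source vertex is processed independently of the others,
    # so this loop could be partitioned across workers or nodes.
    for source in adj:
        for k, l in shell_sizes(adj, source).items():
            b[(k, l)] += 1
    return b


# Usage: an undirected path graph 0 - 1 - 2 given as adjacency lists.
path = {0: [1], 1: [0, 2], 2: [1]}
b = degree_b_matrix(path)
for (k, l), count in sorted(b.items()):
    print(f"distance {k}, shell size {l}: {count} vertices")
```

Because the BFS runs share no state, the outer loop illustrates the data parallelism the summary refers to: a distributed version would assign disjoint sets of source vertices to workers and sum the resulting counters.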