z-logo
Premium
Summarizing and understanding large graphs
Author(s) -
Koutra Danai,
Kang U,
Vreeken Jilles,
Faloutsos Christos
Publication year - 2015
Publication title -
statistical analysis and data mining: the asa data science journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11267
Subject(s) - computer science , adjacency matrix , theoretical computer science , bipartite graph , automatic summarization , graph , combinatorics , algorithm , mathematics , artificial intelligence
How can we succinctly describe a million‐node graph with a few simple sentences? Given a large graph, how can we find its most “important” structures, so that we can summarize it and easily visualize it? How can we measure the “importance” of a set of discovered subgraphs in a large graph? Starting with the observation that real graphs often consist of stars, bipartite cores, cliques, and chains, our main idea is to find the most succinct description of a graph in these “vocabulary” terms. To this end, we first mine candidate subgraphs using one or more graph partitioning algorithms. Next, we identify the optimal summarization using the minimum description length (MDL) principle, picking only those subgraphs from the candidates that together yield the best lossless compression of the graph—or, equivalently, that most succinctly describe its adjacency matrix. Our contributions are threefold: (i) formulation : we provide a principled encoding scheme to identify the vocabulary type of a given subgraph for six structure types prevalent in real‐world graphs, (ii) algorithm : we develop VoG, an efficient method to approximate the MDL‐optimal summary of a given graph in terms of local graph structures, and (iii) applicability : we report an extensive empirical evaluation on multimillion‐edge real graphs, including Flickr and the Notre Dame web graph.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here