z-logo
open-access-imgOpen Access
On-the-Fly Output Compression for Join-Based Graph Mining Algorithms
Author(s) -
Mostofa Kamal Rasel,
Young-Koo Lee
Publication year - 2018
Publication title -
ieee access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2018.2878032
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Many join-based graph mining (JGM) algorithms, such as triangle listing and clique enumeration, typically output data of such a large size that it often dominates the mining cost. Despite the significance of the I/O cost, only a few research focused on reducing the size of the output data. However, those techniques have limitation that they are highly specific to their corresponding graph mining algorithms. In this paper, through careful observations of output patterns, we propose a general compression solution that can be applied to an arbitrary JGM algorithm. The proposed technique first categorizes the nodes of a set of cliques into two lists, which contain overlapping and non-overlapping nodes, respectively. Then, we remove the redundant nodes from multiple sets of cliques by creating a Clique Constructor Tree using the overlapping and non-overlapping nodes. Finally, we compress the clique constructor trees by applying a novel flag-aligned word encoding technique. Our proposed technique performs compression on-the-fly and can easily be adopted by various JGM algorithms. Experiments on real datasets show that our proposed technique, which is adopted in a triangle listing algorithm, reduces the size of the output data and running time by seven times and about four times, respectively. Experimental results also show that a maximal clique listing algorithm can reduce the size of the output data by a factor of three.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom