Sampling from social network to maintain community structure
Author(s) - Tong Chao, Niu Jianwei, Xie Zhongyu, Peng Fu
Publication year - 2014
Publication title - International Journal of Communication Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.344
H-Index - 49
eISSN - 1099-1131
pISSN - 1074-5351
DOI - 10.1002/dac.2815
Subject(s) - computer science , community structure , sampling (signal processing) , consistency (knowledge bases) , data mining , complex network , graph , social network (sociolinguistics) , network structure , limit (mathematics) , sample (material) , algorithm , artificial intelligence , theoretical computer science , statistics , mathematics , world wide web , computer vision , social media , mathematical analysis , chemistry , filter (signal processing) , chromatography
SUMMARY Research on network community structure based on large, complex network datasets has become popular in recent years. Because of limits on existing computing capabilities and other constraints, processing such large network data has become one of the hardest problems, so research on sampling algorithms has become a new hot spot in network data analysis. To meet the needs of network structure research, in this paper we propose an improved forest fire algorithm that not only reduces the scale of the network data but also preserves the community structure of the original network well. We define two concepts in the algorithm, namely ‘community degree’ and ‘center of community’. The algorithm is then applied to five datasets. To compare our sampling algorithm with six other sampling algorithms under different parameters, we use the network community profile and the Kolmogorov–Smirnov D statistic to judge the consistency between the sample and the original graph. Experimental results show that the improved algorithm outperforms the other six sampling algorithms under most parameter settings. The efficiency and feasibility of the modified algorithm are also validated. Finally, we give recommended values for the different parameters. Copyright © 2014 John Wiley & Sons, Ltd.
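The summary does not define ‘community degree’ or ‘center of community’, so the improved algorithm itself cannot be reproduced from this record. As a rough illustration of the two ingredients it names, the sketch below shows a classic Forest Fire sampler (the baseline, not the authors' improved variant) and a two-sample Kolmogorov–Smirnov D statistic. All function names, the parameter `p_forward`, and the toy two-clique graph are assumptions made for illustration, and the degree distribution stands in for the network community profile, which the paper uses but this record does not specify.

```python
import random
from bisect import bisect_right
from collections import deque


def forest_fire_sample(adj, target_size, p_forward=0.7, seed=0):
    """Classic Forest Fire sampling on an undirected graph {node: set(neighbours)}.

    Baseline only: the paper's community-aware improvements are NOT implemented.
    """
    if not 0.0 <= p_forward < 1.0:
        raise ValueError("p_forward must be in [0, 1)")
    rng = random.Random(seed)
    nodes = sorted(adj)
    target_size = min(target_size, len(nodes))
    burned = set()
    while len(burned) < target_size:
        # Ignite a new fire at a random node that has not burned yet.
        start = rng.choice([n for n in nodes if n not in burned])
        burned.add(start)
        queue = deque([start])
        while queue and len(burned) < target_size:
            node = queue.popleft()
            unburned = [n for n in sorted(adj[node]) if n not in burned]
            rng.shuffle(unburned)
            # Geometrically distributed number of neighbours to burn.
            k = 0
            while rng.random() < p_forward:
                k += 1
            for nb in unburned[:k]:
                if len(burned) >= target_size:
                    break
                burned.add(nb)
                queue.append(nb)
    return burned


def ks_d_statistic(xs, ys):
    """Two-sample KS D: largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def induced_degrees(adj, keep):
    """Degrees of the nodes in `keep`, counting only edges inside `keep`."""
    return [sum(1 for nb in adj[n] if nb in keep) for n in keep]


def two_cliques(size):
    """Toy graph (hypothetical): two cliques joined by a single edge."""
    adj = {n: set() for n in range(2 * size)}
    for base in (0, size):
        for a in range(base, base + size):
            for b in range(a + 1, base + size):
                adj[a].add(b)
                adj[b].add(a)
    adj[size - 1].add(size)
    adj[size].add(size - 1)
    return adj


graph = two_cliques(5)
sample = forest_fire_sample(graph, target_size=6)
d = ks_d_statistic(induced_degrees(graph, set(graph)),
                   induced_degrees(graph, sample))
print(len(sample), round(d, 3))
```

A D value near 0 means the sampled distribution closely tracks the original, while a value near 1 means the two barely overlap. Swapping the degree lists for community-profile values would bring the comparison closer to the one described in the summary.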