Bursty event detection from microblog: a distributed and incremental approach
Author(s) - Li Jianxin, Wen Jianfeng, Tai Zhenying, Zhang Richong, Yu Weiren
Publication year - 2015
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.3657
Subject(s) - microblogging, computer science, social media, popularity, event (particle physics), spark (programming language), data stream, real time computing, world wide web, psychology, social psychology, telecommunications, physics, quantum mechanics, programming language
Summary - As a new form of social media, microblogs (e.g., Twitter and Weibo) play an important role in people's daily lives. As microblogs grow in popularity and size, there is a need for distributed approaches that can detect bursty events with low latency from the short‐text data stream. In this paper, we propose a distributed and incremental temporal topic model for microblogs called Bursty Event dEtection (BEE+). BEE+ detects bursty events from short‐text datasets and models their temporal information. It processes the post stream incrementally to track the topic drift of events over time, so that the latent semantic indices are preserved from one time period to the next. To achieve real‐time processing, we design a distributed execution framework based on the Spark engine. To verify its ability to detect bursty events, we conduct experiments on a Weibo dataset of 6,360,125 posts. The results show that BEE+ can outperform the baselines in detecting meaningful bursty events and tracking topic drift. Copyright © 2015 John Wiley & Sons, Ltd.