Open Access
Representative entry selection for profiling blogs
Author(s) - Jinfeng Zhuang, Steven C. H. Hoi, Aixin Sun, Rong Jin
Publication year - 2008
Publication title - Singapore Management University Institutional Knowledge (InK) (Singapore Management University)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/1458082.1458293
Subject(s) - submodular set function, computer science, profiling (computer programming), representativeness heuristic, selection (genetic algorithm), greedy algorithm, task (project management), data mining, information retrieval, machine learning, algorithm, mathematical optimization, mathematics, engineering, operating system, statistics, systems engineering
Many blog search and mining applications must handle huge volumes of blog data, in which a single blog can contain hundreds or even thousands of entries. We investigate novel techniques for profiling blogs by selecting a subset of representative entries for each blog. We propose two principles to guide the entry selection task: representativeness and diversity. Further, we formulate the entry selection task as a combinatorial optimization problem and propose a greedy yet effective algorithm that finds a good approximate solution by exploiting the theory of submodular functions. We suggest blog classification as a way to judge the performance of the proposed entry selection techniques and evaluate them on a real blog dataset, obtaining encouraging results.
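
The greedy strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual objective, so the code below swaps in a standard facility-location coverage function, which is monotone and submodular, as an assumed stand-in: an entry set scores highly when every entry in the blog is similar to at least one selected entry. The function name `greedy_select`, the similarity matrix `sim`, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def greedy_select(sim: Sequence[Sequence[float]], k: int) -> List[int]:
    """Greedily pick up to k entry indices that maximise facility-location coverage.

    sim[i][j] is an assumed non-negative similarity between entries i and j.
    This objective is an illustrative stand-in, not the paper's formulation.
    """
    n = len(sim)
    selected: List[int] = []
    # best[i] = similarity of entry i to its closest already-selected entry
    best = [0.0] * n
    for _ in range(min(k, n)):
        top_gain, top_j = -1.0, -1
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain in total coverage if entry j were added
            gain = sum(max(sim[i][j] - best[i], 0.0) for i in range(n))
            if gain > top_gain:
                top_gain, top_j = gain, j
        selected.append(top_j)
        best = [max(best[i], sim[i][top_j]) for i in range(n)]
    return selected


# Toy blog with two topical clusters: entries 0-2 and entries 3-4.
sim = [
    [1.0, 0.9, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.9, 0.1, 0.1],
    [0.8, 0.9, 1.0, 0.0, 0.1],
    [0.1, 0.1, 0.0, 1.0, 0.9],
    [0.0, 0.1, 0.1, 0.9, 1.0],
]
picked = greedy_select(sim, 2)
print(picked)
```

On this toy input the first pick is the most central entry of the larger cluster, and the second pick comes from the other cluster, because an entry that is already well covered adds little marginal gain. That diminishing-returns behaviour is what lets a greedy procedure serve both representativeness and diversity under a submodular objective.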
