The Impact of the Pattern-Growth Ordering on the Performances of Pattern Growth-Based Sequential Pattern Mining Algorithms
Author(s) - Edith Bélise Kenmogne
Publication year - 2016
Publication title - Computer and Information Science
Language(s) - English
Resource type - Journals
eISSN - 1913-8997
pISSN - 1913-8989
DOI - 10.5539/cis.v10n1p23
Subject(s) - computer science , data mining , pruning , prefix , field (mathematics) , set (abstract data type) , algorithm , sequential pattern mining , sequence (biology) , web mining , pattern search , tree (set theory) , suffix tree , pattern recognition (psychology) , data structure , artificial intelligence , mathematics , web page , mathematical analysis , philosophy , linguistics , genetics , world wide web , pure mathematics , agronomy , biology , programming language
Sequential pattern mining is an efficient technique for discovering recurring structures or patterns in very large datasets. It is widely studied by the data mining community and has a broad range of applications, such as cross-marketing, DNA analysis, web log analysis, user behavior analysis, and sensor data analysis. Sequential pattern mining aims at extracting a set of attributes shared across time among a large number of objects in a given database. Previous studies have developed two major classes of sequential pattern mining methods: the candidate generation-and-test approach, based on either horizontal or vertical data formats and represented respectively by GSP and SPADE, and the pattern-growth approach, represented by FreeSpan and PrefixSpan. In this paper, we study the impact of the pattern-growth ordering on the performance of pattern-growth-based sequential pattern mining algorithms. To this end, we introduce a class of pattern-growth orderings, called linear orderings, in which patterns are grown by extending either the current pattern prefix or the current pattern suffix from the same position at each growth step. We study the problem of pruning and partitioning the search space following linear orderings. Experiments show that the order in which patterns grow has a significant influence on performance.
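To make the pattern-growth idea concrete, here is a minimal sketch of prefix growth in the style of PrefixSpan. It is an illustration under simplifying assumptions, not the algorithm studied in the paper: each sequence is a list of single items (no itemsets), support is counted as the number of sequences containing the pattern, and the names `prefix_growth`, `grow`, and `min_support` are hypothetical. Patterns are always extended at the end of the current prefix, which is one particular growth ordering; the linear orderings described in the abstract also allow growth on the suffix side.

```python
from collections import defaultdict


def prefix_growth(sequences, min_support):
    """Return {pattern (tuple of items): support} for all frequent patterns.

    Simplified setting: every sequence is a list of single items.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the part of
        # each supporting sequence that follows the current prefix.
        support = defaultdict(set)  # item -> indices of sequences containing it
        first_pos = {}              # (item, sequence index) -> first position
        for sid, start in projected:
            seq = sequences[sid]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if sid not in support[item]:
                    support[item].add(sid)
                    first_pos[(item, sid)] = pos

        for item in sorted(support):
            sids = support[item]
            if len(sids) < min_support:
                continue  # prune: no extension of an infrequent pattern is frequent
            pattern = prefix + (item,)
            results[pattern] = len(sids)
            # Partition the search space: recurse only on the suffixes that
            # follow the first occurrence of the item in each sequence.
            grow(pattern, [(sid, first_pos[(item, sid)] + 1) for sid in sorted(sids)])

    grow((), [(i, 0) for i in range(len(sequences))])
    return results
```

For example, calling `prefix_growth([list("abc"), list("acb"), list("ab")], 2)` enumerates every pattern that occurs as a subsequence of at least two of the three input sequences. The pruning and partitioning steps marked in the comments correspond to the two search-space operations the abstract mentions, shown here only for the prefix-growth case.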