Heavy path based super-sequence frequent pattern mining on web log dataset
Author(s) - Xinran Yu, Turgay Korkmaz
Publication year - 2015
Publication title - Artificial Intelligence Research
Language(s) - English
Resource type - Journals
eISSN - 1927-6982
pISSN - 1927-6974
DOI - 10.5430/air.v4n2p1
Subject(s) - computer science, path (computing), sequence (biology), heuristic, dynamic programming, data mining, graph, algorithm, theoretical computer science, artificial intelligence, biology, programming language, genetics
Mining web log datasets has been extensively studied using Frequent Pattern Mining (FPM) and its variants. Identifying frequent patterns in different sequences can help in analyzing the most common sub-sequences (e.g., the pages visited together). However, this approach cannot identify general structures that span multiple sequences. To capture such general structures, we introduce a new form of sequential pattern mining called super-sequence frequent pattern mining (SS-FPM). In contrast to the sub-sequences determined by FPM, SS-FPM determines the super-sequences that can contain the common parts of different sequences. This can be useful in understanding the general behavior/flow of users in web usage mining, classifying web pages and users, making predictions, etc. In essence, finding frequent super-sequence patterns turns out to be related to the well-known heaviest (longest) path problem in graphs, which is known to be NP-hard. Accordingly, we transform a given sequential dataset into a sequence graph and formulate the problem as a k-hop heaviest path problem. We then propose an efficient heuristic called the sequence matrix method, which uses dynamic programming techniques. We compared our method to the existing Heavypath method. The results show that our method is more efficient, especially on large datasets.
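
The abstract does not spell out the sequence matrix method, so the following is only a minimal sketch of the general idea it describes: turning sequences into a weighted graph and searching for a heavy k-hop path with dynamic programming. It assumes that an edge weight counts how often one page immediately follows another, and it keeps a single best simple path per end node at each hop count, which makes it a heuristic rather than an exact solver. The function names `build_sequence_graph` and `k_hop_heaviest_path` are hypothetical and not taken from the paper.

```python
from collections import defaultdict


def build_sequence_graph(sequences):
    """Count how often page b immediately follows page a (assumed weighting)."""
    weights = defaultdict(int)
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            if a != b:  # ignore immediate repeats of the same page
                weights[(a, b)] += 1
    return dict(weights)


def k_hop_heaviest_path(weights, k):
    """Heuristic DP: keep one best simple path per end node for each hop count.

    Returns (total_weight, path). If no simple path with k edges is found,
    the best path from the last reachable hop count is returned instead.
    """
    nodes = {n for edge in weights for n in edge}
    # Hop count 0: every node is a path of weight 0 consisting of itself.
    best = {v: (0, (v,)) for v in nodes}
    for _ in range(k):
        nxt = {}
        for (a, b), w in weights.items():
            if a not in best:
                continue
            total, path = best[a]
            if b in path:  # keep the path simple (no revisited pages)
                continue
            cand = (total + w, path + (b,))
            if b not in nxt or cand[0] > nxt[b][0]:
                nxt[b] = cand
        if not nxt:  # no path can be extended any further
            break
        best = nxt
    return max(best.values(), key=lambda t: t[0])


sequences = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["B", "C", "D"],
    ["A", "B", "C", "D"],
]
graph = build_sequence_graph(sequences)
print(k_hop_heaviest_path(graph, 3))  # (8, ('A', 'B', 'C', 'D'))
```

In this toy dataset the path A, B, C, D is a super-sequence of all four input sequences, which is the kind of general structure SS-FPM aims to recover. Because only one candidate path is kept per end node, the sketch can miss the true heaviest simple path on larger graphs.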