Mining and tracking evolving web user trends from large web server logs
Author(s) - Hawwash Basheer, Nasraoui Olfa
Publication year - 2010
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10069
Subject(s) - computer science, cluster analysis, scalability, web mining, data mining, set (abstract data type), web analytics, web page, web server, information retrieval, world wide web, web modeling, web intelligence, the internet, database, machine learning, programming language
Recently, online organizations have become interested in tracking users' behavior on their websites to better understand and satisfy their needs. In response, web usage mining tools were developed to help them use web logs to discover usage patterns or profiles. However, because website usage logs are generated continuously, in some cases amounting to a dynamic data stream, most existing tools still cannot handle their changing nature or growing size. This paper proposes a scalable framework that is capable of tracking the changing nature of user behavior on a website and representing it as a set of evolving usage profiles. These profiles can offer the best usage representation of user activity at any given time, and they can be used as input to higher‐level applications such as a web recommendation system. Our specific aim is to make the hierarchical unsupervised niche clustering (HUNC) algorithm more scalable, and to add integrated profile tracking and cluster‐based validation to it. Our experiments on real web log data confirm the validity of our approach for large data sets that previously could not be handled in one shot. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 106‐125, 2010
