Data preprocessing evaluation for web log mining: reconstruction of activities of a web visitor
Author(s) -
Michal Munk,
Jozef Kapusta,
Peter Švec
Publication year - 2010
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2010.04.255
Subject(s) - computer science , web log analysis software , visitor pattern , personalization , data pre processing , preprocessor , data mining , web analytics , focus (optics) , information retrieval , world wide web , database , web service , web modeling , web intelligence , web api , physics , optics , artificial intelligence , programming language
Every data analysis presupposes the data themselves, regardless of the focus of the analysis (visit rate analysis, portal optimization, portal personalization, etc.). The results of an analysis depend heavily on the quality of the analyzed data. For portal usage analysis, these data can be obtained by monitoring the web server log file. From these data we can create data matrices and a web map, which serve as the basis for searching for users' behaviour patterns. Data preparation from the log file is the most time-consuming phase of the whole analysis. We carried out an experiment to find out to what extent this time-consuming data preparation is necessary. We aimed to specify the steps that are indispensable for obtaining valid data from the log file. In particular, we focused on the reconstruction of the activities of a web visitor, an advanced data preprocessing technique that is itself time-consuming. In this article we assess the impact of reconstructing the activities of a web visitor on the quantity and quality of the extracted rules, which represent web users' behaviour patterns.
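As a rough illustration of what one step of reconstructing a visitor's activities can look like, the sketch below splits log records into sessions using an inactivity timeout. This is not the procedure evaluated in the article, which the abstract does not detail. The record layout `(visitor_key, timestamp, url)`, the function name `identify_sessions`, and the 30-minute threshold are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def identify_sessions(records, timeout=timedelta(minutes=30)):
    """Group requests per visitor and start a new session whenever the
    gap between two consecutive requests exceeds `timeout`.

    `records` is an iterable of (visitor_key, timestamp, url) tuples;
    this layout is hypothetical and would normally be parsed from the
    web server log (e.g. IP address plus user agent as the visitor key).
    """
    by_visitor = defaultdict(list)
    for visitor, ts, url in records:
        by_visitor[visitor].append((ts, url))

    sessions = []
    for visitor, hits in by_visitor.items():
        hits.sort(key=lambda h: h[0])  # order each visitor's requests by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:  # long pause: close the session
                sessions.append((visitor, current))
                current = []
            current.append(url)
        sessions.append((visitor, current))
    return sessions


# Hypothetical sample records, not taken from the article's data set.
sample = [
    ("10.0.0.1", datetime(2010, 1, 1, 10, 0), "/"),
    ("10.0.0.1", datetime(2010, 1, 1, 10, 5), "/news"),
    ("10.0.0.2", datetime(2010, 1, 1, 10, 6), "/"),
    ("10.0.0.1", datetime(2010, 1, 1, 11, 0), "/contact"),
]
sample_sessions = identify_sessions(sample)
```

Each resulting session is a list of URLs in visit order, which could then feed the data matrices or rule extraction mentioned above. A fixed timeout is only one heuristic; path completion and other reconstruction steps are not covered by this sketch.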