Multivariate Weibull mixtures with proportional hazard restrictions for dwell‐time‐based session clustering with incomplete data
Author(s) - Patrick Mair, Marcus Hudec
Publication year - 2009
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/j.1467-9876.2009.00665.x
Subject(s) - Weibull distribution, cluster analysis, censoring (clinical trials), multivariate statistics, computer science, data mining, data set, hazard, mixture model, parametric statistics, dwell time, statistics, mathematics, artificial intelligence, machine learning, medicine, clinical psychology, chemistry, organic chemistry
Summary. Starting from classical Weibull mixture models, we propose a framework for clustering survival data with a range of more parsimonious models obtained by imposing restrictions on the distributional parameters. We show that these restrictions on the Weibull mixtures correspond to different proportional hazard restrictions across mixture components and Web page areas. A parametric clustering approach based on the EM algorithm is applied to a multivariate data set. Our model set‐up accommodates incomplete‐data structures as well as censored observations. We apply the methodology to retail data from a global e‐commerce company. Sessions are clustered with respect to the dwell times that a user spends on certain page areas. The resulting cluster solution allows a detailed examination of navigation behaviour in terms of the hazard and survivor functions within each component.
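
The link between parameter restrictions and proportional hazards can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not state the parameterization used in the paper, so the shape/scale form below, the function names, and the dwell-time values are all assumptions chosen for illustration. Under that form, two Weibull components that share a shape parameter have a hazard ratio that is constant in time, whereas components with different shapes do not.

```python
import numpy as np

def weibull_hazard(t, shape, scale):
    """Weibull hazard h(t) = (shape/scale) * (t/scale)**(shape - 1).

    Assumed shape/scale parameterization; the paper may use another one.
    """
    t = np.asarray(t, dtype=float)
    return (shape / scale) * (t / scale) ** (shape - 1.0)

def weibull_survivor(t, shape, scale):
    """Weibull survivor function S(t) = exp(-(t/scale)**shape)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-((t / scale) ** shape))

def hazard_ratio(t, shape_a, scale_a, shape_b, scale_b):
    """Ratio of the hazards of two Weibull components at times t."""
    return weibull_hazard(t, shape_a, scale_a) / weibull_hazard(t, shape_b, scale_b)

# Hypothetical dwell times in seconds (illustrative values, not from the paper).
dwell_times = np.array([1.0, 5.0, 30.0, 120.0])

# Restricted case: both components share the shape parameter, so the
# hazard ratio equals (scale_b / scale_a)**shape at every time point.
ratio_equal_shape = hazard_ratio(dwell_times, 0.8, 20.0, 0.8, 60.0)

# Unrestricted case: different shapes, so the hazard ratio varies with time.
ratio_free_shape = hazard_ratio(dwell_times, 0.8, 20.0, 1.5, 60.0)

print(ratio_equal_shape)
print(ratio_free_shape)
```

In this sketch the equal-shape restriction is what makes the hazards proportional; restricting shapes across mixture components or across page areas would then give the different proportionality patterns that the summary refers to.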