Model‐based clustering of time‐dependent categorical sequences with application to the analysis of major life event patterns
Author(s) - Zhang Yingying, Melnykov Volodymyr, Zhu Xuwen
Publication year - 2021
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11502
Subject(s) - categorical variable, cluster analysis, event (particle physics), computer science, bayesian probability, data mining, set (abstract data type), markov chain, transition (genetics), bayesian information criterion, artificial intelligence, machine learning, biochemistry, chemistry, physics, quantum mechanics, gene, programming language
Clustering categorical sequences is a problem that arises in many fields. A few techniques are available in this framework, but none of them takes into account the possible temporal character of transitions from one state to another. A mixture of Markov models is proposed, where transition probabilities are represented as functions of time. The corresponding expectation–maximization algorithm is discussed along with related computational challenges. The effectiveness of the proposed procedure is illustrated in a set of simulation studies, in which it outperforms four alternative approaches. The method is applied to major life event sequences from the British Household Panel Survey. As reflected by the Bayesian Information Criterion, the proposed model demonstrates substantially better performance than its competitors. The analysis of the obtained results and related transition probability plots reveals two groups of individuals: people with a conventional development of the life course and those encountering some challenges.
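
To make the idea of a mixture of Markov models with time-dependent transitions more concrete, here is a minimal Python sketch of the E-step of an expectation–maximization algorithm for such a mixture. The abstract does not specify how transition probabilities depend on time, so the multinomial-logit form used below (a softmax over `alpha + beta * t`) is purely an assumption for illustration, as are all function names, parameter names, and the toy data. It should not be read as the authors' parameterization or implementation.

```python
import numpy as np

def transition_matrix(alpha, beta, t):
    """Row-stochastic transition matrix at time t.

    ASSUMED form (not taken from the paper):
    P[i, j](t) = softmax over j of (alpha[i, j] + beta[i, j] * t).
    """
    logits = alpha + beta * t
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    expl = np.exp(logits)
    return expl / expl.sum(axis=1, keepdims=True)

def sequence_loglik(seq, init, alpha, beta):
    """Log-likelihood of one categorical sequence under one mixture component."""
    ll = np.log(init[seq[0]])
    for t in range(1, len(seq)):
        P = transition_matrix(alpha, beta, t)
        ll += np.log(P[seq[t - 1], seq[t]])
    return ll

def e_step(sequences, weights, inits, alphas, betas):
    """Posterior component membership probabilities, one row per sequence."""
    K = len(weights)
    log_post = np.array([
        [np.log(weights[k]) + sequence_loglik(s, inits[k], alphas[k], betas[k])
         for k in range(K)]
        for s in sequences
    ])
    log_post -= log_post.max(axis=1, keepdims=True)  # log-sum-exp trick
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

# Toy example (hypothetical data): two states, two components.
# Component 0 tends to stay in the current state, component 1 tends to switch.
weights = np.array([0.5, 0.5])
inits = [np.array([0.5, 0.5]), np.array([0.5, 0.5])]
alphas = [np.array([[2.0, 0.0], [0.0, 2.0]]),
          np.array([[0.0, 2.0], [2.0, 0.0]])]
betas = [np.zeros((2, 2)), np.zeros((2, 2))]
sequences = [[0, 0, 0, 0], [0, 1, 0, 1]]

resp = e_step(sequences, weights, inits, alphas, betas)
labels = resp.argmax(axis=1)  # hard cluster assignment per sequence
```

In a full EM algorithm, these membership probabilities would then weight the M-step updates of the mixing proportions, initial-state probabilities, and transition parameters; with time-dependent transitions the latter generally require numerical optimization, which may be one source of the computational challenges the abstract mentions.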
