Forest resampling for distributed sequential Monte Carlo
Author(s) - Anthony Lee, Nick Whiteley
Publication year - 2016
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.11280
Subject(s) - resampling, computer science, leverage (statistics), monte carlo method, stability (learning theory), tree (set theory), theoretical computer science, algorithm, distributed computing, data mining, artificial intelligence, machine learning, mathematics, statistics, mathematical analysis
This paper brings explicit considerations of distributed computing architectures and data structures into the rigorous design of Sequential Monte Carlo (SMC) methods. A theoretical result established recently by the authors shows that adapting interaction between particles to suitably control the effective sample size (ESS) is sufficient to guarantee stability of SMC algorithms. Our objective is to leverage this result and devise algorithms which are thus guaranteed to work well in a distributed setting. We make three main contributions to achieve this. First, we study mathematical properties of the ESS as a function of matrices and graphs that parameterize the interaction among particles. Secondly, we show how these graphs can be induced by tree data structures which model the logical network topology of an abstract distributed computing environment. Finally, we present efficient distributed algorithms that achieve the desired ESS control, perform resampling and operate on forests associated with these trees. © 2015 Wiley Periodicals, Inc. Statistical Analysis and Data Mining: The ASA Data Science Journal, 2015
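The abstract's first contribution treats the ESS as a function of the matrix that governs interaction among particles. The following is a minimal illustrative sketch of that idea, not the authors' algorithm. It assumes the standard ESS formula, (sum of weights)^2 / (sum of squared weights), which may differ in normalization from the paper's definition, and it omits the potential functions and the resampling step of a full SMC update. The helper names `ess`, `mix`, and `block_average_matrix` are hypothetical, and the block-averaging matrix is only one simple way to picture interaction restricted to the particles under a common subtree of a balanced tree.

```python
def ess(weights):
    """Standard effective sample size: (sum w)^2 / sum w^2, between 1 and len(weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def mix(alpha, weights):
    """Apply an interaction matrix: new_w[i] = sum_j alpha[i][j] * w[j]."""
    return [sum(a * w for a, w in zip(row, weights)) for row in alpha]


def block_average_matrix(n, block):
    """Hypothetical interaction matrix: average weights within consecutive
    blocks of size `block` (e.g. the leaves under one subtree).
    block=1 means no interaction; block=n means every particle interacts."""
    assert n % block == 0, "block size must divide the number of particles"
    return [
        [1.0 / block if i // block == j // block else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Illustrative, made-up weights for 8 particles.
weights = [8.0, 0.0, 4.0, 0.0, 2.0, 2.0, 0.0, 0.0]

# Widening the interaction (larger blocks) raises the ESS in this example.
for block in (1, 2, 4, 8):
    alpha = block_average_matrix(len(weights), block)
    print(block, ess(mix(alpha, weights)))
```

In this toy setting, interaction within larger subtrees yields a higher ESS at the cost of more communication, which is the trade-off the abstract's ESS-control algorithms are designed to manage in a distributed environment.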