Open Access
A hierarchical Bayesian approach for handling missing classification data
Author(s) - Ketz Alison C., Johnson Therese L., Hooten Mevin B., Hobbs N. Thompson
Publication year - 2019
Publication title - Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.17
H-Index - 63
ISSN - 2045-7758
DOI - 10.1002/ece3.4927
Subject(s) - missing data, categorical variable, spurious relationship, multinomial distribution, inference, bayesian probability, bayesian hierarchical modeling, bayes' theorem, computer science, bayesian inference, categorical distribution, bayes factor, statistics, econometrics, data mining, artificial intelligence, machine learning, mathematics
Ecologists classify individuals into categories to understand the composition of populations and communities. These categories might be defined by demographics, functional traits, or species. Assignment to categories is often imperfect, but the assignments are frequently treated as observations without error. When individuals are observed but not classified, models of these “partial” observations must include the missing data mechanism to avoid spurious inference. We developed two hierarchical Bayesian models to overcome the assumption of perfect assignment to mutually exclusive categories in the multinomial distribution of categorical counts when classifications are missing. These models incorporate auxiliary information to adjust the posterior distributions of the proportions of membership in categories. In one model, we use an empirical Bayes approach, where a subset of data from one year serves as a prior for the missing data in the next year. In the other, we use a small random sample of data within a year to inform the distribution of the missing data. We performed a simulation to show the bias that occurs when partial observations are ignored and demonstrated the altered inference for the estimation of demographic ratios. We applied our models to demographic classifications of elk (Cervus elaphus nelsoni) to demonstrate improved inference for the proportions of sex and stage classes. We developed multiple modeling approaches using a generalizable nested multinomial structure to account for partially observed data that were missing not at random for classification counts. Accounting for classification uncertainty is important to accurately understand the composition of populations and communities in ecological studies.
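
The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it is a simplified Dirichlet-multinomial stand-in, and every number in it (the three class proportions, the per-class probabilities of being classified, the sample sizes) is a hypothetical value chosen for illustration. It assumes that a small random subsample of the unclassified individuals can be classified after the fact, which loosely mirrors the within-year auxiliary sample mentioned above.

```python
import numpy as np


def simulate(seed=0, n=5000, aux_n=200, n_draws=2000):
    """Compare a naive estimate with one adjusted by an auxiliary sample."""
    rng = np.random.default_rng(seed)

    # Hypothetical true proportions for three classes.
    p_true = np.array([0.2, 0.6, 0.2])
    # Hypothetical probability that an observed individual gets classified.
    # It depends on the class, so classifications are missing not at random.
    p_classified = np.array([0.9, 0.9, 0.5])

    classes = rng.choice(3, size=n, p=p_true)
    is_classified = rng.random(n) < p_classified[classes]

    classified_counts = np.bincount(classes[is_classified], minlength=3)
    unclassified = classes[~is_classified]

    # Naive estimate: drop the partial observations entirely.
    naive = classified_counts / classified_counts.sum()

    # Auxiliary information: classify a small random subsample of the
    # unclassified individuals (assumed to be possible in this sketch).
    aux = rng.choice(unclassified, size=min(aux_n, unclassified.size),
                     replace=False)
    aux_counts = np.bincount(aux, minlength=3)

    # Dirichlet(1, 1, 1) prior plus multinomial counts gives a Dirichlet
    # posterior for the composition of the unclassified group.
    draws = rng.dirichlet(1 + aux_counts, size=n_draws)

    # Combine known classified counts with the inferred composition
    # of the unclassified group, then average over posterior draws.
    adjusted_draws = (classified_counts + unclassified.size * draws) / n
    adjusted = adjusted_draws.mean(axis=0)

    return p_true, naive, adjusted


if __name__ == "__main__":
    p_true, naive, adjusted = simulate()
    print("true:    ", np.round(p_true, 3))
    print("naive:   ", np.round(naive, 3))
    print("adjusted:", np.round(adjusted, 3))
```

In this setup the third class is classified least often, so the naive estimate understates its share, while the adjusted estimate moves back toward the true proportion. A full treatment would model the classification process hierarchically rather than plugging in a subsample.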
