Coverage‐adjusted entropy estimation
Author(s) -
Vu Vincent Q.,
Yu Bin,
Kass Robert E.
Publication year - 2007
Publication title -
Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2942
Subject(s) - estimator, mathematics, statistics, upper and lower bounds, entropy (arrow of time), population, principle of maximum entropy, consistent estimator, minimum variance unbiased estimator, mathematical analysis, physics, demography, quantum mechanics, sociology
Data on ‘neural coding’ have frequently been analyzed using information‐theoretic measures. These formulations involve the fundamental and generally difficult statistical problem of estimating entropy. We review briefly several methods that have been advanced to estimate entropy and highlight a method, the coverage‐adjusted entropy estimator (CAE), due to Chao and Shen that appeared recently in the environmental statistics literature. This method begins with the elementary Horvitz–Thompson estimator, developed for sampling from a finite population, and adjusts for the potential new species that have not yet been observed in the sample—these become the new patterns or ‘words’ in a spike train that have not yet been observed. The adjustment is due to I. J. Good, and is called the Good–Turing coverage estimate. We provide a new empirical regularization derivation of the coverage‐adjusted probability estimator, which shrinks the maximum likelihood estimate. We prove that the CAE is consistent and first‐order optimal, with rate O_P(1/log n), in the class of distributions with finite entropy variance and that, within the class of distributions with finite qth moment of the log‐likelihood, the Good–Turing coverage estimate and the total probability of unobserved words converge at rate O_P(1/(log n)^q). We then provide a simulation study of the estimator with standard distributions and examples from neuronal data, where observations are dependent. The results show that, with a minor modification, the CAE performs much better than the MLE and is better than the best upper bound estimator, due to Paninski, when the number of possible words m is unknown or infinite. Copyright © 2007 John Wiley & Sons, Ltd.
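The abstract describes the estimator as the Horvitz–Thompson entropy estimator applied to maximum likelihood probabilities that have been shrunk by the Good–Turing coverage estimate. The sketch below illustrates that construction in Python under the standard Chao–Shen form; the function name, the fallback used when every word is a singleton, and the comparison example are illustrative assumptions, and the "minor modification" mentioned in the abstract is not reproduced here.

```python
import numpy as np
from collections import Counter

def coverage_adjusted_entropy(samples):
    """Coverage-adjusted entropy estimate (in nats), following the
    Horvitz-Thompson / Good-Turing construction described in the abstract."""
    counts = np.array(list(Counter(samples).values()), dtype=float)
    n = counts.sum()

    # Good-Turing estimate of sample coverage: 1 - (# words seen once) / n
    f1 = np.sum(counts == 1)
    coverage = 1.0 - f1 / n
    if coverage == 0.0:          # every word observed exactly once
        coverage = 1.0 / n       # crude fallback (an assumption) to avoid log(0)

    # Shrink the MLE probabilities by the estimated coverage
    p_adj = coverage * counts / n

    # Horvitz-Thompson correction: divide by the probability that each
    # observed word would appear at least once in a sample of size n
    inclusion = 1.0 - (1.0 - p_adj) ** n
    return -np.sum(p_adj * np.log(p_adj) / inclusion)

# Illustrative comparison with the plug-in MLE on a geometric distribution
rng = np.random.default_rng(0)
x = rng.geometric(0.3, size=200)
counts = np.array(list(Counter(x).values()), dtype=float)
p_mle = counts / counts.sum()
h_mle = -np.sum(p_mle * np.log(p_mle))
print("MLE:", h_mle, " CAE:", coverage_adjusted_entropy(x))
```

On data with many unseen words, the plug-in MLE is biased downward, while the coverage adjustment shrinks the observed probabilities and reweights them, which is the behavior the paper's simulations examine.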
