Missing not at random and the nonparametric estimation of the spectral density
Author(s) - Efromovich, Sam
Publication year - 2020
Publication title - Journal of Time Series Analysis
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.576
H-Index - 54
eISSN - 1467-9892
pISSN - 0143-9782
DOI - 10.1111/jtsa.12527
Subject(s) - missing data, mathematics, estimator, oracle, series (stratigraphy), statistician, statistics, density estimation, parametric statistics, nonparametric statistics, sample (material), econometrics, algorithm, computer science, paleontology, chemistry, software engineering, chromatography, biology
The aim of the article is twofold: (i) to present a pivotal setting where using an extra experiment to restore information lost due to missing not at random (MNAR) is practically feasible; (ii) to draw attention to a wide spectrum of new research topics created by the proposed methodology of exploring the missing mechanism. It is well known that if the likelihood of missing an observation depends on its value, then the missing is MNAR, no consistent estimation is possible, and the only way to recover the destroyed information is to study the likelihood of missing via an extra experiment. One of the main practical issues with an extra-sample approach is as follows. Let n and m be the numbers of observations in an MNAR time series and in an extra sample exploring the likelihood of missing, respectively. An oracle that knows the likelihood of missing can estimate an ARMA-type spectral density with the MISE proportional to $\ln(n)\,n^{-1}$, while a differentiable likelihood may be estimated only with the MISE proportional to $m^{-2/3}$. At first glance, these familiar facts imply that the proposed approach is impractical because m must be of larger order than n to match the oracle. Surprisingly, the article presents theory and a numerical study indicating that m may be of smaller order than n and the statistician can still match the performance of the oracle. The proposed methodology is used for the analysis of an MNAR time series of systolic blood pressure of a person with immunoglobulin D multiple myeloma. A number of possible extensions and future research topics are outlined.
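The "first glance" argument in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration, not the article's method: it assumes the two MISE rates hold with proportionality constants equal to 1 (the abstract gives rates only), and the function name `naive_extra_sample_size` is hypothetical. Under that assumption, equating $m^{-2/3}$ with $\ln(n)\,n^{-1}$ gives $m = (n/\ln n)^{3/2}$.

```python
import math


def naive_extra_sample_size(n: int) -> float:
    """Solve m**(-2/3) = ln(n)/n for m.

    Assumption: both MISE rates have constant 1; the abstract states
    proportionality only, so the result indicates order, not exact size.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that ln(n) > 0")
    return (n / math.log(n)) ** 1.5


# Compare the naive extra-sample size with the length of the MNAR series.
for n in (100, 1_000, 10_000):
    m = naive_extra_sample_size(n)
    print(f"n={n:>6}  naive m ~ {m:,.0f}  m/n ~ {m / n:.2f}")
```

Under these assumptions the ratio m/n grows with n, which is the naive reading that makes the extra-sample approach look impractical; the article's claim is that this reading is too pessimistic.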