Richard D. Remington Distinguished University Professor
Department of Biostatistics
University of Michigan
“Conditions for Ignoring the Missing-Data Mechanism in Likelihood Inferences for Parameter Subsets”
For likelihood-based inferences from data with missing values, models are generally needed for both the data and the missing-data mechanism. However, modeling the mechanism is challenging, and its parameters are often poorly identified. Rubin (1976, Biometrika) showed that for likelihood and Bayesian inference, sufficient conditions for ignoring the missing-data mechanism are (a) the missing data are missing at random (MAR), in the sense that missingness does not depend on the missing values after conditioning on the observed data, and (b) the parameters of the data model and the missing-data mechanism are distinct; that is, there are no a priori ties, via parameter space restrictions or prior distributions, between these two sets of parameters. These conditions are sufficient but not always necessary, and they relate to the full vector of parameters of the model. We propose definitions of partially MAR and ignorability for a subset of the parameters of particular substantive interest, for direct likelihood/Bayesian and frequentist likelihood-based inference. We apply these definitions to a variety of examples. We also discuss conditioning on the pattern of missingness as an alternative strategy for avoiding the need to model the missing-data mechanism.
This is joint work with Don Rubin and Sahar Zangeneh.
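
As background for Rubin's conditions (a) and (b) in the abstract, the standard full-parameter argument can be sketched as follows. The notation is an assumption introduced here for illustration and does not appear in the abstract: Y_obs and Y_mis denote the observed and missing parts of the data, M the missingness indicators, \theta the parameters of the data model, and \psi the parameters of the missing-data mechanism. The sketch covers only the classical full-vector case, not the parameter-subset definitions proposed in the talk.

```latex
% Full likelihood: integrate the missing values out of the joint model
% for the data and the missingness indicators.
\begin{align*}
L(\theta, \psi \mid Y_{\mathrm{obs}}, M)
  &= \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\,
          f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi)\, dY_{\mathrm{mis}} .
\intertext{Condition (a), MAR: the mechanism does not depend on the missing values,}
f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi)
  &= f(M \mid Y_{\mathrm{obs}}, \psi)
  \quad \text{for all } Y_{\mathrm{mis}} ,
\intertext{so the mechanism factors out of the integral:}
L(\theta, \psi \mid Y_{\mathrm{obs}}, M)
  &= f(M \mid Y_{\mathrm{obs}}, \psi)
     \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\, dY_{\mathrm{mis}}
   = f(M \mid Y_{\mathrm{obs}}, \psi)\, f(Y_{\mathrm{obs}} \mid \theta) .
\end{align*}
% Condition (b), distinctness: \theta and \psi have no a priori ties, so the
% factor involving \psi carries no information about \theta, and inference
% about \theta can be based on the ignorable likelihood
% L_{\mathrm{ign}}(\theta \mid Y_{\mathrm{obs}}) = f(Y_{\mathrm{obs}} \mid \theta).
```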