Department of Operations Research and Financial Engineering
SGSA Sponsored Event
Robust Inference for High-Dimensional Data with Application to False Discovery Rate Control
Heavy-tailed distributions arise easily in high-dimensional data, and they are at odds with the commonly used sub-Gaussian assumptions. This prompts us to revisit the Huber estimator from a new perspective. The key observation is that the robustification parameter should adapt to the sample size, dimension, and moments to achieve an optimal tradeoff between bias and robustness: a small robustification parameter increases the robustness of the estimator but introduces more bias. Our framework handles heavy-tailed data with bounded (1 + δ)-th moment for any δ > 0. We establish a sharp phase transition for robust estimation of regression parameters in both low and high dimensions: when δ > 1, the estimator exhibits the optimal sub-Gaussian deviation bound, while only a slower rate is available in the regime 0 < δ < 1. The transition is smooth and optimal. Moreover, a nonasymptotic Bahadur representation for finite-sample inference is derived when the variance is finite, which further yields two important normal approximation results: the Berry-Esseen inequality and a Cramér-type moderate deviation result. As an important application, we use these robust normal approximation results to analyze a dependence-adjusted multiple testing procedure. We show that this robust, dependence-adjusted procedure asymptotically controls the overall false discovery proportion (FDP) at the nominal level under mild moment conditions. Thorough numerical results on both simulated and real datasets are also provided to support the theory.
Friday, 2/2/2018, BLOC 113, 11:30 AM