Statistics Seminar: "High-dimensional selective inference with applications"

Jelena Markovic, Stanford University

Abstract: Recently, Liu et al. (2018) devised a new approach to selective inference that is more powerful than existing non-randomized selective inference methods, including that of Lee et al. (2016). To construct valid p-values and confidence intervals for each variable selected by the Lasso, Liu et al. (2018) condition only on the event that the particular variable of interest is active, rather than on the whole active set as in Lee et al. (2016). By conditioning on less information, their method gains power: it yields shorter confidence intervals and is more likely to detect true signals.
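The difference between the two conditioning events can be illustrated numerically. The sketch below is a hypothetical toy example, not the authors' implementation: it assumes an orthonormal design, where the Lasso reduces to soft-thresholding of X^T y, and uses Monte Carlo to compare the probability of the Lee et al. event (the active set equals a fixed set) with the larger Liu et al. event (a given variable is active).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 100, 5, 0.1
X = np.linalg.qr(rng.standard_normal((n, p)))[0]  # orthonormal columns: X^T X = I_p
beta = np.array([0.5, 0.4, 0.0, 0.0, 0.0])        # two true signals

def lasso_active_set(y):
    # With orthonormal X, the Lasso solution is soft-thresholding of X^T y,
    # so variable j is active exactly when |x_j^T y| exceeds lam.
    z = X.T @ y
    return frozenset(np.flatnonzero(np.abs(z) > lam))

# Monte Carlo over the noise: record the selected set in each replication.
sets = [lasso_active_set(X @ beta + 0.2 * rng.standard_normal(n))
        for _ in range(2000)]

# Liu et al. condition on {variable 0 is active}; Lee et al. condition on
# {the active set is exactly {0, 1}}.  The first event contains the second.
p_var0_active = np.mean([0 in s for s in sets])
p_exact_set = np.mean([s == frozenset({0, 1}) for s in sets])
print(p_var0_active >= p_exact_set)
```

Since the event {variable 0 active} contains the event {active set = {0, 1}}, its probability is never smaller; conditioning on the larger event discards less information, which is the source of the power gain described above.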

In this work, we extend the approach of Liu et al. (2018) to high-dimensional settings. To report valid inference for the selected variables, we correct the asymptotic normal distribution of the debiased Lasso estimator for the effect of selection.
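The debiased Lasso adds a one-step correction to the Lasso estimate so that the result is asymptotically normal. A minimal numerical sketch, again assuming an orthonormal design (where the inverse of X^T X is the identity and the correction can be computed exactly), shows the debiasing step undoing the Lasso's shrinkage:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, lam, sigma = 200, 5, 0.1, 0.2
X = np.linalg.qr(rng.standard_normal((n, p)))[0]  # X^T X = I_p
beta = np.array([0.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + sigma * rng.standard_normal(n)

# Lasso under an orthonormal design: soft-threshold X^T y at lam.
z = X.T @ y
beta_lasso = np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Debiased Lasso: one-step correction using the residuals.  Here the
# debiasing matrix (inverse of X^T X) is the identity, so the correction
# is exact and removes the shrinkage bias entirely.
beta_debiased = beta_lasso + X.T @ (y - X @ beta_lasso)

# In this toy case beta_debiased coincides with X^T y, which is exactly
# normal around beta; it is this normal distribution that selective
# inference must then adjust for the selection event.
print(np.allclose(beta_debiased, z))
```

In genuinely high-dimensional settings the inverse of X^T X is unavailable and the debiasing matrix must be estimated (e.g., by nodewise regression), so the normality only holds asymptotically; the correction for selection described above is applied to that asymptotic distribution.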

We apply several selective inference methods, including our new method, to genetic data from cases and controls for Crohn's disease in the United Kingdom. We use a model selection procedure to select genetic variables/features associated with the response, and selective inference methods to quantify the significance of only the selected genetic features. We compare the selective inference methods with knockoffs.

References: 

- Jason D. Lee, Dennis L. Sun, Yuekai Sun, and Jonathan E. Taylor. Exact post-selection inference, with application to the lasso. The Annals of Statistics, 44(3):907-927, 2016.

- Keli Liu, Jelena Markovic, and Robert Tibshirani. More powerful post-selection inference, with application to the lasso. arXiv preprint arXiv:1801.09037, 2018.

This is joint work with Kevin Fry, Keli Liu, Jonathan Taylor, and Rob Tibshirani.

Host: Todd Kuffner