Colloquium: "Modern Statistical Inference for Big and Streaming Data"

Shih-Kang Chao, Purdue University

Abstract: This talk illustrates the three "D" features of my ongoing work: big Data, streaming Data, and high Dimensionality, motivated by online advertising and recommendation systems. Specifically, the first part develops a surrogate Bayesian analysis for big and high-dimensional data that requires only one round of communication. The resulting posterior distribution enjoys strong theoretical guarantees, such as concentration around the true parameter and the Bernstein-von Mises theorem. The second part treats high-dimensional streaming data and big data by adapting the regularized dual averaging method (Xiao, 2010) to l1 penalization and a fixed learning rate. For this case, an interesting asymptotic trajectory and distributional dynamics are derived, which shed light on statistical inference and optimal model tuning for support recovery.

Host: Jimin Ding

Tea at 3:45 pm in Room 200