Statistics and Data Science Seminar: "Implicit effect of Gaussian noise injections in training neural networks"

Speaker: Xiaoyu Wang, Washington University in Saint Louis

Abstract: Gaussian noise injections (GNIs) are widely used in training neural networks: at every iteration of the optimization procedure, typically stochastic gradient descent (SGD), Gaussian noise is injected into the network activations. In this talk, we will first give a quick review of SGD and popular variants of SGD equipped with Langevin dynamics. We will then focus on the so-called "implicit effect" of GNIs, that is, the effect of the injected noise on the dynamics of SGD. We show that this effect induces asymmetric heavy-tailed noise in the SGD gradient updates. To model the dynamics, we first propose a Langevin-like stochastic differential equation driven by asymmetric heavy-tailed noise. We then formally prove and quantify an "implicit bias" induced by GNIs, which varies depending on the heaviness of the tails and the level of asymmetry. Empirical results confirm that the "implicit effect" of GNIs induces an "implicit bias" that degrades network performance.
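To make the GNI mechanism concrete, here is a minimal sketch (not the speaker's code) of noise injection during SGD training: a toy two-layer NumPy network whose hidden activations receive additive Gaussian noise at every SGD step. The data, network size, noise level sigma, and learning rate are all illustrative assumptions.

```python
# Minimal illustrative sketch of Gaussian noise injection (GNI) under SGD.
# All hyperparameters and the toy dataset are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (assumed).
X = rng.normal(size=(256, 10))
y = np.sin(X.sum(axis=1, keepdims=True))

# Two-layer network parameters.
W1 = rng.normal(scale=0.1, size=(10, 32))
W2 = rng.normal(scale=0.1, size=(32, 1))

lr, sigma = 0.01, 0.1  # learning rate and GNI standard deviation (assumed)

for step in range(1000):
    idx = rng.choice(len(X), size=32, replace=False)  # SGD mini-batch
    xb, yb = X[idx], y[idx]

    # Forward pass with Gaussian noise injected into the hidden activation.
    h = np.tanh(xb @ W1)
    h_noisy = h + sigma * rng.normal(size=h.shape)  # the GNI step
    pred = h_noisy @ W2
    err = pred - yb

    # Backward pass; gradients flow through the noisy activation.
    gW2 = h_noisy.T @ err / len(xb)
    gh = err @ W2.T
    gW1 = xb.T @ (gh * (1 - h**2)) / len(xb)

    # Plain SGD update.
    W1 -= lr * gW1
    W2 -= lr * gW2
```

The talk's subject is the downstream effect of this injection: across iterations, the noise perturbs the gradient updates themselves, and the abstract characterizes the resulting perturbation as asymmetric and heavy-tailed.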

Host: Likai Chen
