Szego Seminar: "Matrix Factorization and its Applications to Deep Learning"
Abstract: In the matrix completion problem, we are given a subset of the entries of a matrix X and want to predict the unseen entries. In general this is impossible, but under assumptions such as low rank, the unseen entries can be predicted with low error. One approach is to write X as a product of matrices, X = W_1 W_2 ... W_n, and then run gradient descent on the factors W_i under some objective function. Gunasekar et al. (2017) show that, under certain assumptions, gradient descent on a two-matrix factorization (X = W_1 W_2) converges to a solution of minimum nuclear norm. Arora et al. (2019) show that this proof extends to factorizations of arbitrary depth, while their experiments show that deeper factorizations strengthen the tendency of gradient descent to converge to low-rank solutions. We will review these results and their applications in the theory of deep learning.
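The procedure described in the abstract can be sketched numerically. The toy example below (a minimal NumPy sketch, not the papers' exact experimental setup) runs gradient descent on a depth-2 factorization X ≈ W1 W2 with a small random initialization and a squared loss on the observed entries only; all dimensions, the step size, and the initialization scale are illustrative assumptions. Note that the inner dimension is full, so any bias toward low rank comes implicitly from the optimization rather than from an explicit rank constraint.

```python
import numpy as np

# Toy matrix completion via gradient descent on a depth-2 factorization
# X ≈ W1 @ W2, in the spirit of Gunasekar et al. (2017).
rng = np.random.default_rng(0)

n, r = 20, 2
X = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 target
mask = rng.random((n, n)) < 0.5  # random subset of observed entries

# Full inner dimension (n x n factors): no explicit rank constraint.
# The small initialization scale is what drives the implicit bias.
scale = 1e-2
W1 = scale * rng.standard_normal((n, n))
W2 = scale * rng.standard_normal((n, n))

lr = 0.05
for _ in range(20000):
    # Gradient of 0.5 * ||mask * (W1 @ W2 - X)||_F^2 in W1 and W2
    G = mask * (W1 @ W2 - X)
    W1, W2 = W1 - lr * (G @ W2.T), W2 - lr * (W1.T @ G)

X_hat = W1 @ W2
train_err = np.linalg.norm(mask * (X_hat - X)) / np.linalg.norm(mask * X)
test_err = np.linalg.norm((~mask) * (X_hat - X)) / np.linalg.norm((~mask) * X)
print(train_err, test_err)
```

Replacing the product with a deeper one, e.g. W1 @ W2 @ W3, corresponds to the longer factorizations studied by Arora et al. (2019), which in their experiments strengthen the low-rank bias.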
Host: Nathan Wagner