EE Seminar: Generalization in Overparameterized Machine Learning
Zoom link: https://us02web.zoom.us/j/83932011090?pwd=WjlxK2hOczFvUkQxNy9yQXFLVzJaUT09
Meeting ID: 839 3201 1090
Speaker: Dr. Yehuda Dar
Electrical and Computer Engineering Department at Rice University
Monday, November 23rd, 2020, at 15:00
Generalization in Overparameterized Machine Learning
Modern machine learning models are highly overparameterized (i.e., they are very complex with many more parameters than the number of training data examples), and yet they often generalize extremely well to inputs outside of the training set. This practical generalization performance motivates numerous foundational research questions that fall outside the scope of conventional machine learning concepts, such as the bias-variance tradeoff.
This talk presents new analyses of the fundamental factors that affect generalization in machine learning of overparameterized models. We focus on generalization errors that follow a double descent shape with respect to the number of parameters in the learned model. In the double descent shape, the generalization error peaks when the learned model first becomes complex enough to perfectly fit the training data, but then decreases again in the overparameterized regime. Moreover, the global minimum of the generalization error can be achieved by a highly complex (overparameterized) model, even without explicit regularization.

The first part of the talk considers a transfer learning process between source and target linear regression problems that are related and overparameterized. Our statistical analysis demonstrates that the generalization error of the target task has a two-dimensional double descent shape that is significantly influenced by the transfer learning aspects of the setting. Our theory also characterizes the cases in which transfer of parameters is beneficial.

The second part of the talk introduces a new family of linear subspace learning problems that connects the subspace fitting (via principal component analysis) and regression approaches to the problem. We establish a numerical optimization framework that demonstrates the effects of the supervision level and of structural constraints on the double descent shape of the generalization error curve.
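The double descent shape described above can be reproduced in a small simulation (a hypothetical sketch, not code from the talk; all dimensions, the noise level, and the feature-subset construction are illustrative assumptions): we fit minimum-norm least squares solutions using a growing number of features and measure the test error, which peaks near the interpolation threshold (where the number of parameters equals the number of training examples) and decreases again in the overparameterized regime.

```python
import numpy as np

rng = np.random.default_rng(0)
D, n_train, n_test, noise = 120, 40, 2000, 0.5  # illustrative sizes

# Ground-truth linear model over all D features, plus Gaussian noise.
w_star = rng.normal(size=D) / np.sqrt(D)
X_tr = rng.normal(size=(n_train, D))
y_tr = X_tr @ w_star + noise * rng.normal(size=n_train)
X_te = rng.normal(size=(n_test, D))
y_te = X_te @ w_star + noise * rng.normal(size=n_test)

errors = {}
for p in [10, 20, 35, 40, 45, 60, 120]:
    # Learned model uses only the first p features; pinv gives the
    # minimum-norm least squares solution (no explicit regularization).
    w_hat = np.linalg.pinv(X_tr[:, :p]) @ y_tr
    errors[p] = np.mean((X_te[:, :p] @ w_hat - y_te) ** 2)
```

In this setup the test error spikes around p = n_train = 40 (the interpolation threshold, where the square system is typically ill-conditioned) and then descends as p grows further, which is the second descent of the curve.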
Yehuda Dar is a postdoctoral research associate in the Electrical and Computer Engineering Department at Rice University, working with Prof. Richard Baraniuk on topics in the theory of modern machine learning. Before that he was a postdoctoral fellow in the Computer Science Department of the Technion — Israel Institute of Technology, where he also received his PhD in 2018. Yehuda earned his MSc in Electrical Engineering and a BSc in Computer Engineering, both also from the Technion. His main research interests are in the fields of machine learning theory, signal and image processing, optimization, and data compression.
Seminar attendance credit will be granted based on registering your full name + ID number in the chat.