EE Seminar: Making Neural Networks Linear Again: Projection and Beyond
Attendance seminar for M.Sc. and Ph.D. students
(The talk will be given in English)
Speaker: Dr. Assaf Shocher
NVIDIA
Hall 011, Electrical Engineering (Kitot Building)
Monday, December 23rd, 2024
12:00 - 13:00
Abstract
Every day, somewhere, a researcher mutters, “If only neural networks were linear, this problem would be solved”. Linear operations offer powerful tools: projection onto subspaces, eigendecomposition, and more. This talk explores their equivalents in the non-linear world of neural networks, with a special focus on projection, generalized by idempotent operators: operators that satisfy f(f(x)) = f(x).
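As a quick numerical illustration (mine, not from the talk): orthogonal projection onto a subspace is the prototypical linear idempotent operator, since projecting twice is the same as projecting once.

```python
import numpy as np

# Orthogonal projection onto the column space of A: the canonical
# idempotent linear operator, satisfying P @ P = P.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))          # basis of a 2-D subspace of R^5
P = A @ np.linalg.inv(A.T @ A) @ A.T     # projection matrix onto col(A)

x = rng.standard_normal(5)
once = P @ x                             # project once
twice = P @ (P @ x)                      # project twice
print(np.allclose(once, twice))          # idempotence: P(Px) = Px
```

The talk's question is what this projection behavior looks like when P is replaced by a non-linear network f.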
The Idempotent Generative Network (IGN) is a generative model trained by enforcing two main objectives: (1) data from the target distribution map to themselves, f(x) = x, defining the target manifold, and (2) latents project onto this manifold via the idempotence condition f(f(z)) = f(z). IGN generates data in a single step, can refine its outputs iteratively, and projects corrupted data back onto the distribution.
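The two objectives stated in the abstract can be sketched as losses. This is a toy illustration under my own assumptions (the actual method also involves additional terms and gradient-handling details not covered here); the hypothetical f below is a hard projection onto the unit sphere, standing in for a trained network.

```python
import numpy as np

def reconstruction_loss(f, x):
    # (1) data from the target distribution should map to themselves
    return np.mean((f(x) - x) ** 2)

def idempotence_loss(f, z):
    # (2) latents should project onto the manifold: f(f(z)) = f(z)
    fz = f(z)
    return np.mean((f(fz) - fz) ** 2)

# Hypothetical f: normalization onto the unit sphere, which is
# idempotent by construction (a stand-in for a trained network).
def f(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = f(rng.standard_normal((4, 3)))   # "data" already on the manifold
z = rng.standard_normal((4, 3))      # latents drawn from a prior
print(reconstruction_loss(f, x), idempotence_loss(f, z))  # both near 0
```

For this idealized f both losses vanish; training a real network drives both terms toward zero simultaneously.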
This projection ability gives rise to Idempotent Test-Time Training (IT³), a method for adapting models at test time using only the current out-of-distribution (OOD) input. During training, the model f receives an input x along with either the ground-truth label y or a neutral "don't know" signal ∅. At test time, given a corrupted/OOD input x, a brief training session minimizes ||f(x, f(x, ∅)) - f(x, ∅)||, making f(x, ⋅) idempotent. IT³ works across architectures and tasks, as demonstrated for MLPs, CNNs, and GNNs on corrupted images, tabular data, OOD facial age prediction, and aerodynamic prediction.
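The test-time objective can be made concrete with a minimal numeric sketch (illustrative only; the real f is a trained network, here replaced by a hypothetical one-weight scalar model). Note that this toy happily collapses toward the trivial solution w = 0; in the abstract's setup, the training phase with ground-truth labels is what keeps f meaningful.

```python
# Hypothetical stand-in for the trained model f(x, y_or_none): it
# consumes the input together with either a previous prediction or
# 0.0 standing in for the "don't know" signal ∅.
def f(w, x, y_prev):
    return w * (x + y_prev)

def it3_loss(w, x):
    y1 = f(w, x, 0.0)        # f(x, ∅)
    y2 = f(w, x, y1)         # f(x, f(x, ∅))
    return (y2 - y1) ** 2    # ||f(x, f(x, ∅)) - f(x, ∅)||^2

# Brief test-time session on a single OOD input: a few gradient steps
# (finite differences keep the sketch dependency-free).
w, x, lr, eps = 1.0, 2.0, 0.01, 1e-5
for _ in range(500):
    g = (it3_loss(w + eps, x) - it3_loss(w - eps, x)) / (2 * eps)
    w -= lr * g
print(it3_loss(w, x))  # far smaller than the initial loss of 4.0
```

The loop mirrors the abstract's "brief training session": only the single test input x is used, and only the idempotence objective is minimized.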
Finally, I'll ask: "Who says neural networks are non-linear?" They are only non-linear with respect to the standard vector spaces! In ongoing work, we construct vector spaces X, Y with their own addition, negation, and scalar multiplication, under which f: X → Y becomes truly linear. This enables novel applications including spectral decomposition, zero-shot solutions to non-linear inverse problems via the pseudo-inverse, and architecture-enforced idempotence.
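The idea that linearity depends on the choice of vector-space structure has a classic elementary instance (my example, not the talk's construction): on the positive reals, take multiplication as "addition" and exponentiation as "scalar multiplication"; then the logarithm, non-linear in the usual sense, is a linear map into ordinary R.

```python
import numpy as np

# Nonstandard vector space X = (0, inf):
def vadd(x, y):      # "addition" in X
    return x * y

def smul(a, x):      # "scalar multiplication" in X
    return x ** a

f = np.log           # non-linear on standard R, linear from X to R

x, y, a = 2.0, 5.0, 3.0
print(np.isclose(f(vadd(x, y)), f(x) + f(y)))   # additivity
print(np.isclose(f(smul(a, x)), a * f(x)))      # homogeneity
```

The ongoing work described above presumably does something far richer, learning or constructing such structures so that a whole network becomes linear between them.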
Short Bio
I am a postdoctoral researcher at NVIDIA. Prior to that I was a postdoctoral fellow at UC Berkeley, working with Alyosha Efros, and a visiting researcher at Google. I received my PhD from the Weizmann Institute of Science, where I was advised by Michal Irani. I hold bachelor's degrees in Physics and EE from Ben-Gurion University. My prizes and honors include the Rothschild postdoctoral fellowship, the Fulbright postdoctoral fellowship, the John F. Kennedy award for outstanding Ph.D. work at the Weizmann Institute, and the Blavatnik award for CS Ph.D. graduates.
Seminar attendance grants attendance credit, subject to recording your full name and ID number on the attendance form that will be circulated in the hall during the seminar.