EE Seminar: Understanding and Enhancing Deep Neural Networks with Automated Interpretability

January 27, 2025, 12:00
Hall 011, Electrical Engineering Classroom (Kitot) Building

(The talk will be given in English)

 

Speaker: Dr. Tamar Rott Shaham

CSAIL, MIT

Hall 011, Electrical Engineering Kitot Building

Monday, January 27th, 2025

12:00 - 13:00

 

Understanding and Enhancing Deep Neural Networks with Automated Interpretability

 

Abstract

Deep neural networks are becoming increasingly sophisticated; they can generate realistic images, engage in complex dialogues, analyze intricate data, and execute tasks that appear almost human-like. But how do such models achieve these abilities?

In this talk, I will present a line of work that aims to explain behaviors of deep neural networks. This includes a new approach for evaluating cross-domain knowledge encoded in generative models, tools for uncovering core mechanisms in large language models, and insights into how fine-tuning affects these mechanisms. Next, I will introduce the Automated Interpretability Agent (AIA), a system that automates and scales the scientific process of interpreting neural networks. When presented with an interpretability task—such as explaining the role of a specific neuron in a pre-trained model or identifying failures in the model’s predictions—AIA autonomously formulates hypotheses, designs iterative experiments on model internals, and refines explanations based on experimental outcomes. I will demonstrate how AIA’s findings can be leveraged to mitigate biases and enhance model performance. The talk will conclude with a discussion of future directions, including the development of universal interpretability tools and extending interpretability methods to automate scientific discovery.

Short Bio

Tamar Rott Shaham is a postdoctoral researcher at MIT CSAIL in Antonio Torralba’s lab. She earned her PhD from the Electrical and Computer Engineering faculty at the Technion, where she was supervised by Prof. Tomer Michaeli. Tamar has received several awards, including the ICCV 2019 Best Paper Award (Marr Prize), the Google WTM Scholarship, the Adobe Research Fellowship, the Rothschild Postdoctoral Fellowship, the Vatat-Zuckerman Postdoctoral Scholarship, and the Schmidt Postdoctoral Award.

 

Attending the seminar grants attendance credit to master's and doctoral students, provided they record their full name and ID number on the attendance form that will be circulated in the hall during the seminar.

 
