EE Seminar: Estimating the Information Flow in Deep Neural Networks


(The talk will be given in English)

 

Speaker: Dr. Ziv Goldfeld

Laboratory for Information and Decision Systems (LIDS) at MIT

 

SUNDAY, December 23rd, 2018
15:00 - 16:00

Room 011, Kitot Bldg., Faculty of Engineering

 

Estimating the Information Flow in Deep Neural Networks

 

Abstract

This talk will discuss the flow of information and the evolution of internal representations during deep neural network (DNN) training, aiming to demystify the compression aspect of the Information Bottleneck theory. The theory suggests that DNN training comprises a rapid fitting phase followed by a slower compression phase, in which the mutual information I(X;T) between the input X and an internal representation T decreases. Several papers have reported compression of estimated mutual information across different DNN models, yet the true I(X;T) in these networks is provably either constant (for discrete X) or infinite (for continuous X). We will explain this discrepancy between theory and experiments, and clarify what the estimated mutual-information curves from past works were actually tracking.
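As a concrete illustration of the estimation issue (a minimal sketch, not material from the talk: the deterministic map tanh(wX) and the plug-in histogram estimator are illustrative choices), the snippet below shows that for a continuous input X and a deterministic layer T = f(X), a binned mutual-information estimate is finite and driven entirely by the bin count, even though the true I(X;T) is infinite:

```python
import numpy as np

def binned_mutual_information(x, t, num_bins):
    """Plug-in MI estimate (in bits) from a 2-D histogram of (x, t) samples."""
    joint, _, _ = np.histogram2d(x, t, bins=num_bins)
    pxy = joint / joint.sum()             # empirical joint distribution
    px = pxy.sum(axis=1, keepdims=True)   # marginal of X
    pt = pxy.sum(axis=0, keepdims=True)   # marginal of T
    mask = pxy > 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px @ pt)[mask])))

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)   # continuous input X
t = np.tanh(3.0 * x)               # deterministic "layer" T = f(X), injective

# True I(X;T) is infinite, yet the binned estimate is finite and grows
# as the bins get finer:
for bins in (8, 32, 128, 512):
    print(bins, binned_mutual_information(x, t, bins))
```

Because the map is injective, the estimate essentially tracks the entropy of the binned input, which grows with the bin count; such curves measure how the representation clusters relative to the bins rather than I(X;T) itself.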

 

To this end, an auxiliary (noisy) DNN framework will be introduced, in which I(X;T) is a meaningful quantity that depends on the network's parameters. We will show that this noisy framework is a good proxy for the original (deterministic) system, both in terms of performance and of the learned representations. To accurately track I(X;T) over noisy DNNs, a differential entropy estimator tailored to exploit the DNN's layered structure will be proposed, along with theoretical guarantees on the associated minimax risk. Using this estimator, together with an analogy to an information-theoretic communication problem, we will unveil the geometric mechanism that drives compression of I(X;T) in noisy DNNs. Based on these findings, we will circle back to deterministic networks and demonstrate that past observations of compression were in fact tracking the same geometric phenomenon. We will conclude with future research directions inspired by this study, aimed at facilitating a comprehensive information-theoretic understanding of deep learning.
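To make the noisy framework concrete, here is a minimal sketch under the common formulation T = f(X) + Z with Z ~ N(0, beta^2 I); the variable names and the simple Monte-Carlo entropy estimate are illustrative assumptions, not the tailored estimator from the talk. Since h(T|X) = h(Z), we have I(X;T) = h(T) - h(Z), where T is distributed as a uniform Gaussian mixture centered at the layer outputs f(x_i):

```python
import numpy as np
from scipy.special import logsumexp

def noisy_layer_mi(f_x, beta, num_mc=2_000, rng=None):
    """Monte-Carlo estimate (in nats) of I(X;T) for T = f(X) + Z, Z ~ N(0, beta^2 I).

    f_x: (n, d) array of deterministic layer outputs f(x_i), one row per sample.
    """
    rng = rng or np.random.default_rng()
    n, d = f_x.shape
    # Draw T from the mixture: pick a random center, add isotropic Gaussian noise.
    idx = rng.integers(n, size=num_mc)
    t = f_x[idx] + beta * rng.standard_normal((num_mc, d))
    # log p(t) under the n-component Gaussian mixture.
    sq_dist = ((t[:, None, :] - f_x[None, :, :]) ** 2).sum(axis=-1)  # (num_mc, n)
    log_p = (logsumexp(-sq_dist / (2 * beta**2), axis=1)
             - np.log(n) - 0.5 * d * np.log(2 * np.pi * beta**2))
    h_t = -log_p.mean()                                  # h(T), Monte-Carlo estimate
    h_z = 0.5 * d * np.log(2 * np.pi * np.e * beta**2)   # h(Z), closed form
    return h_t - h_z

rng = np.random.default_rng(1)
tight = 0.05 * rng.standard_normal((500, 2))      # outputs clustered tightly
spread = 2.0 * rng.standard_normal((500, 2))      # outputs spread out
print(noisy_layer_mi(tight, beta=0.1, rng=rng))   # small I(X;T)
print(noisy_layer_mi(spread, beta=0.1, rng=rng))  # larger I(X;T)
```

In this formulation, I(X;T) shrinks as the outputs f(x_i) cluster together relative to the noise scale beta, which is one way to read the geometric mechanism referred to above.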

 

Bio:

Dr. Ziv Goldfeld is currently a postdoctoral fellow at the Laboratory for Information and Decision Systems (LIDS) at MIT. He graduated with a B.Sc. summa cum laude, an M.Sc. summa cum laude, and a Ph.D. in Electrical and Computer Engineering from Ben-Gurion University, Israel, in 2012, 2014, and 2017, respectively. His research interests include theoretical machine learning, information theory, complex systems, high-dimensional and nonparametric statistics, and applied probability. His honors include the Rothschild postdoctoral fellowship, the Feder Award, a best student paper award at the IEEE 28th Convention of Electrical and Electronics Engineers in Israel, B.Sc. and M.Sc. Dean's Honors, the Basor fellowship, the Lev-Zion fellowship, and the Minerva Short-Term Research Grant (MRG).
