EE Seminar: Learning to see by listening

(The talk will be given in English)

Speaker:   Prof. William T. Freeman
                        Massachusetts Institute of Technology and Google

Monday, May 23rd, 2016
15:00 - 16:00
Room 011, Kitot Bldg., Faculty of Engineering

Learning to see by listening

Abstract
Children may learn about the world by pushing, banging, and manipulating things, watching and listening as materials make their distinctive sounds: dirt makes a thud; ceramic makes a clink. These sounds reveal physical properties of the objects, as well as the force and motion of the physical interaction. We've explored a toy version of such learning-through-interaction by recording audio and video while we hit many things with a drumstick.
We developed an algorithm to predict sounds from silent videos of the drumstick interactions. The algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We demonstrate that the sounds generated by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that the task of predicting sounds allows our system to learn to visually distinguish different materials.
Joint work with Andrew Owens, Phillip Isola, Josh McDermott, Antonio Torralba, and Edward H. Adelson. To appear in CVPR 2016: http://arxiv.org/abs/1512.08512

Bio: 
William T. Freeman is the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science at MIT, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) there. He was the Associate Department Head from 2011 to 2014. His current research interests include machine learning applied to computer vision, Bayesian models of visual perception, and computational photography. He received outstanding paper awards at computer vision or machine learning conferences in 1997, 2006, 2009 and 2012, and test-of-time awards for papers from 1990 and 1995. Previous research topics include steerable filters and pyramids, orientation histograms, the generic viewpoint assumption, color constancy, computer vision for computer games, and belief propagation in networks with loops. He is active in the program or organizing committees of computer vision, graphics, and machine learning conferences. He was the program co-chair for ICCV 2005 and for CVPR 2013.

 
