Ben Ohayon - Categorical Lip Motion from Sub-phonemic Unit Streams
EE Systems Department Seminar
Electrical Engineering Systems Zoom Seminar
Join Zoom Meeting
https://us06web.zoom.us/j/87459040800?pwd=pzsIdtKbkXBoCMwNcbp78MJklOzFBM.1
Meeting ID: 874 5904 0800
Passcode: 9ggDx3
Speaker: Ben Ohayon
M.Sc. student under the supervision of Dr. Dan Raviv
Wednesday, 31st January 2024, at 14:00
Categorical Lip Motion from Sub-phonemic Unit Streams
Abstract
The mapping between speech and facial expressions lies at the core of most audio-driven facial animation methods. Over the years, the many-to-many nature of this mapping has remained a primary cause of lip-sync inaccuracies in the final animation. Current research resolves speech-expression ambiguities by increasing data volume and quality, reformulating the task as a sequence-to-sequence problem, or incorporating semantic information that discriminates between similar phonetic contexts. Despite these efforts, most data-driven approaches resort to regression-based modeling, which tends to converge to the average solution during optimization. To overcome this limitation, we propose using a classification objective over predicted categorical distributions of the lips' geometric structure. Using an online stream of sub-phonemic embeddings, we model voice-lip geometry relationships in a probabilistic framework and enable expressive, real-time audio-driven lip motion generation.
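The abstract's contrast between regression-to-the-mean and a categorical formulation can be illustrated with a toy sketch. The example below is a hypothetical illustration, not the speaker's actual model: for an ambiguous sound, plausible lip openings are bimodal, and an MSE-optimal regressor lands on the implausible average, while an empirical categorical distribution over discretized lip geometry keeps both modes.

```python
import numpy as np

# Hypothetical toy data: for an ambiguous phonetic context, plausible
# lip-opening values cluster around two modes, 0.2 and 0.8.
targets = np.array([0.2, 0.8, 0.2, 0.8])

# A regressor trained with MSE converges to the mean of the targets,
# i.e. 0.5 -- a lip shape matching neither plausible articulation.
mse_optimum = targets.mean()

# A classifier over a discretized lip-geometry space instead predicts a
# categorical distribution that preserves both modes.
bins = np.linspace(0.0, 1.0, 11)          # 11 categorical bins over [0, 1]
labels = np.digitize(targets, bins) - 1   # map each target to a bin index
counts = np.bincount(labels, minlength=len(bins))
categorical = counts / counts.sum()       # empirical categorical distribution

# Picking the argmax (or sampling) yields a concrete, plausible lip shape
# rather than the blurred average.
mode_bin = bins[np.argmax(counts)]
```

In this sketch the categorical distribution places probability mass on exactly the two plausible lip openings, so sampling or taking the mode avoids the averaged, over-smoothed motion that regression objectives tend to produce.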
Attending the seminar grants listening credit, per registration in the chat with full name + ID number.

