EE Seminar: Analog Processing-in-Memory of Deep Neural Networks

Monday, June 24th, 2024, 12:00–13:00
Room 011, Electrical Engineering Kitot Building

(The talk will be given in English)

 

Speaker:     Dr. Tzofnat Greenberg, SoC Architect at NVIDIA

 


 

Analog Processing-in-Memory of Deep Neural Networks

 

Abstract

Deep neural networks (DNNs) are compute- and memory-intensive, so they are usually executed on commodity hardware, mostly GPU platforms, or on dedicated accelerators such as Google's TPU. In this talk, I will discuss accelerating DNNs with emerging memristive memory technologies, such as RRAM and STT-MRAM (collectively known as memristors). Memristors combine storage and computation in a single physical device, enabling an energy-efficient, highly parallel analog multiply-and-accumulate operation to be computed in place, an approach known as processing-in-memory (PIM). To fulfill PIM's potential, we consider both hardware-based and algorithm-based approaches to accelerating DNNs.
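The in-place analog multiply-and-accumulate described above can be illustrated with a minimal sketch (not the speaker's actual hardware model): inputs are applied as row voltages, weights are stored as conductances, and each column current sums the products, so a single read performs a matrix-vector product. The noise model and all names here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossbar_mac(voltages, conductances, noise_std=0.0):
    """Column currents of an idealized memristor crossbar.

    The bit lines sum products "in place" (Kirchhoff's current law),
    so one analog read computes a full matrix-vector product.
    noise_std models multiplicative device variation (an assumption,
    one common simplified non-ideality model).
    """
    g = conductances * (1.0 + noise_std * rng.standard_normal(conductances.shape))
    return voltages @ g

x = rng.random(4)        # input activations applied as row voltages
W = rng.random((4, 3))   # weights stored as a 4x3 conductance array
ideal = crossbar_mac(x, W)                 # noise-free analog MAC
noisy = crossbar_mac(x, W, noise_std=0.05) # same read under device noise
```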

First, we examine an on-chip training setup in which an analog PIM-based accelerator supports both DNN training and inference. Advanced optimization algorithms may only partially utilize the parallelism of the PIM module, leading to suboptimal hardware performance. In this study, we investigate and evaluate PIM implementations of (1) stochastic gradient descent with momentum and (2) quantized neural network (QNN) training.

We also investigate an off-chip regime in which a pre-trained model is deployed to an analog PIM-based accelerator. Naively deploying a pre-trained model to analog hardware leads to significant accuracy degradation; we therefore propose the analog-aware post-training (APT) approach, which calibrates the model to be more robust to analog noise. Our evaluation of several DNN models on the ImageNet dataset shows that APT achieves accuracy similar to previous state-of-the-art analog-aware training methods while requiring less than 1% of the dataset for training and accelerating the noise-adjustment process by up to 41×.
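The accuracy degradation from naive deployment can be seen in a toy sketch: evaluating a layer trained for ideal arithmetic under multiplicative weight noise shifts its outputs. This is only an illustration of the problem APT addresses, under an assumed simplified noise model; it does not reproduce the APT method itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward(x, W, noise_std=0.0):
    """One linear layer under multiplicative conductance noise
    (an assumed, simplified model of analog non-idealities)."""
    W_dev = W * (1.0 + noise_std * rng.standard_normal(W.shape))
    return x @ W_dev

x = rng.random((8, 16))    # a small batch of activations
W = rng.random((16, 4))    # weights trained assuming ideal arithmetic
drift = np.abs(forward(x, W, noise_std=0.05) - forward(x, W)).mean()
# drift > 0: the deployed (noisy) outputs deviate from the ideal ones.
# A calibration pass in the spirit of APT would use a small data sample
# to adjust the model so noisy outputs track the ideal ones.
```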

Short Bio

Tzofnat recently finished her Ph.D. under the supervision of Prof. Shahar Kvatinsky and Daniel Soudry. She is currently working as an SoC architect at NVIDIA.

 

Attendance at the seminar grants listening credit, subject to registering your full name and ID number on the attendance form that will be circulated in the hall during the seminar.

Tel Aviv University makes every effort to respect copyright. If you hold copyright in the content appearing here, and/or believe that the use made of this content infringes your rights, please contact us promptly at the address here >>