EE Seminar: On the Recall Scaling Laws in Mamba: A Theoretical and Mechanistic Study via Hashing

29 April 2026, 15:00
Room 011, Kitot (Classrooms) Building, Electrical Engineering

Registration for the seminar takes place at the start of the seminar by scanning the Moodle barcode (please log in to Moodle beforehand, NOT through the app).

 

Electrical Engineering Systems Seminar

 

Speaker: Yuval Koren

M.Sc. student under the supervision of Prof. Lior Wolf

 

Wednesday, 29th April 2026, at 15:00

Room 011, Kitot Building, Faculty of Engineering

 

On the Recall Scaling Laws in Mamba: A Theoretical and Mechanistic Study via Hashing

Abstract

Associative Recall (AR) is the cognitive ability to learn and retrieve links between items in memory. In Natural Language Processing, AR is used as a benchmark for evaluating the in-context memory capacity of architectures such as Mamba, and has been found to strongly correlate with language modeling performance. This work explores AR from the perspective of mechanistic interpretability, aiming to reverse-engineer the exact internal algorithm used by Mamba to perform recall. Our key insight is that Mamba performs recall by implicitly learning linear hash functions, and we identify the low-level circuit that enables this behavior. Building on these findings and inspired by theoretical tools in similarity-preserving hashing, such as the Johnson–Lindenstrauss lemma, we develop a theoretical framework for analyzing AR, which we term Recall Scaling Laws. Given the vocabulary size and the number of facts in context, this framework allows us to (1) predict the embedding and state dimensions required for Mamba to achieve perfect recall, (2) predict recall success probability given the model dimensions, and (3) analyze multi-layer models and multi-head SSM patterns. Empirical results show that our theoretical findings are accurate and predictive, offering insights into how AR capacity scales with vocabulary, state, embedding size, and architecture.
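
The abstract names the Johnson–Lindenstrauss lemma as one source of inspiration. As a rough illustration only, the sketch below evaluates the classical form of that lemma's bound: a projection dimension k >= 4 ln(n) / (eps^2/2 - eps^3/3) is enough for a linear map to preserve all pairwise squared distances among n points to within a factor of 1 ± eps. This is the standard textbook bound, not the Recall Scaling Laws presented in the talk; the function name `jl_min_dim` and the sample sizes are hypothetical and chosen for illustration.

```python
import math


def jl_min_dim(n_points: int, eps: float) -> int:
    """Smallest integer k satisfying the classical JL bound
    k >= 4 ln(n) / (eps^2 / 2 - eps^3 / 3).

    Assumption: this is the generic lemma, not the talk's own framework.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    denom = eps ** 2 / 2 - eps ** 3 / 3
    return math.ceil(4 * math.log(n_points) / denom)


if __name__ == "__main__":
    # Hypothetical sizes, standing in for a number of distinct items.
    for n in (1_000, 50_000, 1_000_000):
        print(n, jl_min_dim(n, eps=0.5))
```

The bound grows only logarithmically in the number of points, which is the kind of dependence that makes dimension-versus-vocabulary questions tractable; how the talk's framework refines this for Mamba is not stated in the abstract.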

 

This seminar counts as a hearing seminar for M.Sc. and Ph.D. students.

 

 
