Seminar of the Iby and Aladar Fleischman Faculty of Engineering

EE Seminar: On the Recall Scaling Laws in Mamba: A Theoretical and Mechanistic Study via Hashing

29 April 2026, 15:00

Room 011, Electrical Engineering-Kitot Building

Registration for the seminar takes place at the start of the seminar by scanning the barcode for Moodle (please log in to Moodle beforehand, NOT through the app)

Electrical Engineering Systems Seminar

Speaker: Yuval Koren

M.Sc. student under the supervision of Prof. Lior Wolf

Wednesday, 29th April 2026, at 15:00

Room 011, Kitot Building, Faculty of Engineering

On the Recall Scaling Laws in Mamba: A Theoretical and Mechanistic Study via Hashing

Abstract

Associative Recall (AR) is the cognitive ability to learn and retrieve links between items in memory. In Natural Language Processing, AR is used as a benchmark for evaluating the in-context memory capacity of architectures such as Mamba, and has been found to strongly correlate with language modeling performance. This work explores AR from the perspective of mechanistic interpretability, aiming to reverse-engineer the exact internal algorithm used by Mamba to perform recall. Our key insight is that Mamba performs recall by implicitly learning linear hash functions, and we identify the low-level circuit that enables this behavior. Building on these findings and inspired by theoretical tools in similarity-preserving hashing, such as the Johnson–Lindenstrauss lemma, we develop a theoretical framework for analyzing AR, which we term Recall Scaling Laws. Given the vocabulary size and the number of facts in context, this framework allows us to (1) predict the embedding and state dimensions required for Mamba to achieve perfect recall, (2) predict recall success probability given the model dimensions, and (3) analyze multi-layer models and multi-head SSM patterns. Empirical results show that our theoretical findings are accurate and predictive, offering insights into how AR capacity scales with vocabulary, state, embedding size, and architecture.

This seminar is considered a hearing seminar for M.Sc./Ph.D. students.