EE Seminar: Improved Bounds for Reward-Agnostic and Reward-Free Exploration
Electrical Engineering Systems Seminar
Speaker: Oran Ridel
M.Sc. student under the supervision of Dr. Alon Cohen
Monday, 15th June 2026, at 15:00
Room 011, Kitot Building, Faculty of Engineering
Improved Bounds for Reward-Agnostic and Reward-Free Exploration
Abstract
In standard reinforcement learning, an agent learns by trying to maximize a stream of rewards. However, in many practical applications, such as scientific experiments or robotics, the reward is hard to define, or simply unknown, while the agent is first collecting data. This work studies how an agent can explore an unknown environment effectively while completely blind to its ultimate goal. The agent should learn how the environment operates well enough that, once a goal is introduced, it can choose the best actions without any further exploration.
We introduce a new approach that uses a sequence of carefully designed internal rewards to guide the agent toward efficient exploration. Relative to previous work, our approach significantly reduces the number of interactions with the environment that the agent needs to reach a given target accuracy. Additionally, we provide a tight lower bound for the problem, closing the gap between theoretical and achievable performance.
This work has been accepted for ICML 2026 and can be found here:
This seminar counts as a hearing seminar for M.Sc. and Ph.D. students.
Registration takes place at the beginning of the seminar by scanning the barcode for Moodle. Please log in to Moodle beforehand, and not through the app.

