EE Seminar: Topics in Universal Learning
Electrical Engineering Systems Seminar
Speaker: Yaniv Fogel
Ph.D. student under the supervision of Prof. Meir Feder
Monday, 4th November 2024, at 12:00
Room 011, Kitot Building, Faculty of Engineering
Topics in Universal Learning
Abstract
We study statistical learning through the lens of information theory, using tools and methods established in the field of universal prediction. We focus on batch learning, where a training set of data features and corresponding outcomes are given, and the learner is tasked with predicting another outcome given a data feature. The prediction is measured using the logarithmic loss function, and is compared in some manner to an hypothesis class that consists of possible conditional probabilities of outcomes given the data features.
First, we consider the stochastic, realizable setting, where the outcomes are generated according to one of the hypotheses in the class. We show an equivalent of the Redundancy-Capacity Theorem, and utilize it to derive a general upper-bound over the regret. For Bernoulli hypothesis class we establish theoretical characterization of the min-max optimal learner. We propose and implement a variant of the Arimoto-Blahut algorithm to calculate the capacity-achieving prior and min-max optimal learner
We then consider the individual setting where the outcome sequence is deterministic, arbitrary. As noted by previous works, it is challenging to define an individual batch learning problem. We consider three possible definitions: The first two lead to min-max optimal learners which are known variants of the NML. We show that these learners obtain favorable results for several hypothesis classes yet might fail to learn some specific hypothesis classes which are in fact learnable. We then propose a third definition, whose min-max optimal learner does not have a closed form expression. Nevertheless, we show both upper and lower bounds over its regret, and show that the min-max regret vanishes for every hypothesis class of finite VC-dimension.
Last, we consider the case where there are several hypothesis classes that might be nested In an hierarchical structure. In this case we propose several definitions for the regret and discuss their properties. We show how these definitions can lead to Elias’ codes for universal representation of the integers. We also use this framework to discuss the hypothesis classes of varying order Markov models.
השתתפות בסמינר תיתן קרדיט שמיעה = עפ"י רישום שם מלא + מספר ת.ז. בדף הנוכחות שיועבר באולם במהלך הסמינר