Online Convex Optimization with Unbounded Memory
Online convex optimization (OCO) is a widely used framework in online learning that has recently found success in applications to online control. However, the OCO framework and existing generalizations thereof can only approximately capture scenarios in which the learner's losses are affected by the entire history of their decisions. In this talk, I will introduce a generalized framework, "Online Convex Optimization with Unbounded Memory," along with a notion of effective memory capacity that quantifies the maximum influence of past decisions. The standard follow-the-regularized-leader (FTRL) algorithm achieves a regret bound of O(sqrt(H * T)), where H is the effective memory capacity and T is the time horizon. I will demonstrate the broad applicability of the framework by using it to derive regret bounds, and to simplify existing regret bound derivations, for a variety of online learning problems, including online linear control and an online variant of performative prediction.
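To fix ideas about the FTRL update the abstract refers to, here is a minimal sketch of standard, memoryless FTRL with quadratic regularization for linear losses over the unit Euclidean ball. The step size `eta`, the random loss sequence, and the helper names are illustrative assumptions; this is not the talk's unbounded-memory algorithm, whose history-dependent losses require additional machinery.

```python
import numpy as np

# Minimal sketch of memoryless FTRL with quadratic regularization for
# linear losses g_1, ..., g_T over the unit Euclidean ball. The step
# size `eta` and the random loss sequence are illustrative assumptions;
# this is NOT the unbounded-memory algorithm from the talk.

def project_l2_ball(x, radius=1.0):
    """Project x onto the Euclidean ball of the given radius."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def ftrl(gradients, eta=0.1):
    """Play x_t = argmin_{||x||<=1} <sum_{s<t} g_s, x> + ||x||^2 / (2*eta)."""
    g_sum = np.zeros_like(gradients[0])
    plays = []
    for g in gradients:
        # Closed-form unconstrained minimizer is -eta * g_sum; project it.
        plays.append(project_l2_ball(-eta * g_sum))
        g_sum += g  # observe the loss only after committing to x_t
    return plays

rng = np.random.default_rng(0)
grads = [rng.normal(size=3) for _ in range(100)]
plays = ftrl(grads, eta=0.05)

# Regret against the best fixed comparator in the ball.
g_total = np.sum(grads, axis=0)
best = -g_total / np.linalg.norm(g_total)
regret = sum(float(g @ x) for g, x in zip(grads, plays)) - float(g_total @ best)
print(f"regret over T=100 rounds: {regret:.2f}")
```

In this memoryless special case, choosing eta proportional to 1/sqrt(T) yields the classical O(sqrt(T)) regret for bounded gradients; the bound stated in the abstract generalizes this, with the effective memory capacity H quantifying how much the dependence on past decisions inflates the constant.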