Lipschitz Adaptivity with multiple learning rates in online learning

Mhammedi, Zakaria; Koolen-Wijkstra, Wouter; van Erven, Tim

Z. Mhammedi (Zakaria), W.M. Koolen-Wijkstra (Wouter) and T.A.L. van Erven (Tim)

2019-06-25

Presented at the 32nd Annual Conference on Learning Theory (June 2019), Phoenix, Arizona, USA

We aim to design adaptive online learning algorithms that take advantage of any special structure that might be present in the learning task at hand, with as little manual tuning by the user as possible. A fundamental obstacle that comes up in the design of such adaptive algorithms is to calibrate a so-called step-size or learning rate hyperparameter depending on variance, gradient norms, etc. A recent technique promises to overcome this difficulty by maintaining multiple learning rates in parallel. This technique has been applied in the MetaGrad algorithm for online convex optimization and the Squint algorithm for prediction with expert advice. However, in both cases the user still has to provide in advance a Lipschitz hyperparameter that bounds the norm of the gradients. Although this hyperparameter is typically not available in advance, tuning it correctly is crucial: if it is set too small, the methods may fail completely; but if it is taken too large, performance deteriorates significantly. In the present work we remove this Lipschitz hyperparameter by designing new versions of MetaGrad and Squint that adapt to its optimal value automatically. We achieve this by dynamically updating the set of active learning rates. For MetaGrad, we further improve the computational efficiency of handling constraints on the domain of prediction, and we remove the need to specify the number of rounds in advance.
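
As a rough illustration of what "maintaining multiple learning rates in parallel" can look like for prediction with expert advice, the sketch below weights each expert by a sum over a fixed geometric grid of learning rates. It is loosely modeled on a Squint-style potential with a discrete grid and is not the algorithm from the paper. The function names, the grid size, and the cap of 1 / (2 * bound) on the largest learning rate are assumptions made for this example. The sketch still takes the range bound `bound` as an input, so it shows the dependence that the paper removes rather than the paper's adaptive update of the active learning rates.

```python
import math


def eta_grid(bound, num_rates):
    """Geometric grid of learning rates; the largest is 1 / (2 * bound) (assumed cap)."""
    return [2.0 ** (-i) / (2.0 * bound) for i in range(num_rates)]


def multi_rate_weights(regrets, variances, etas):
    """Weight expert k proportionally to sum_eta eta * exp(eta * R_k - eta^2 * V_k).

    Computed in the log domain to avoid overflow for large regrets.
    """
    log_scores = []
    for R, V in zip(regrets, variances):
        terms = [math.log(eta) + eta * R - eta * eta * V for eta in etas]
        m = max(terms)
        log_scores.append(m + math.log(sum(math.exp(t - m) for t in terms)))
    m = max(log_scores)
    unnormalized = [math.exp(s - m) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def run(losses, bound=1.0, num_rates=8):
    """Play the experts game on `losses` (one loss vector per round, entries in [0, bound]).

    Returns the learner's cumulative loss and the weights after the last round.
    """
    num_experts = len(losses[0])
    etas = eta_grid(bound, num_rates)
    regrets = [0.0] * num_experts    # R_k: cumulative regret against expert k
    variances = [0.0] * num_experts  # V_k: cumulative squared instantaneous regret
    total_loss = 0.0
    for loss in losses:
        weights = multi_rate_weights(regrets, variances, etas)
        learner_loss = sum(w * l for w, l in zip(weights, loss))
        total_loss += learner_loss
        for k in range(num_experts):
            r = learner_loss - loss[k]
            regrets[k] += r
            variances[k] += r * r
    return total_loss, multi_rate_weights(regrets, variances, etas)


if __name__ == "__main__":
    # Expert 0 never errs, expert 1 always errs.
    cumulative_loss, final_weights = run([[0.0, 1.0]] * 50)
    print(cumulative_loss, final_weights)
```

In this sketch, setting `bound` below the true loss range allows learning rates that are too aggressive for the observed losses, while setting it far above the true range shrinks every learning rate and slows learning. This is the trade-off that the abstract describes for the Lipschitz hyperparameter.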

Additional Metadata
Conference	32nd Annual Conference on Learning Theory
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Mhammedi, Z., Koolen-Wijkstra, W., & van Erven, T. (2019). Lipschitz Adaptivity with multiple learning rates in online learning. In Proceedings of Machine Learning Research (pp. 1–22).
