Metagrad: Adaptation using multiple learning rates in online learning

van Erven, Tim; Koolen-Wijkstra, Wouter; van der Hoeven, Dirk

T.A.L. van Erven (Tim), W.M. Koolen-Wijkstra (Wouter) and D. van der Hoeven (Dirk)

2021-07-01

Journal of Machine Learning Research, Volume 22, p. 1-61

We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is known to imply fast rates in offline statistical learning. MetaGrad further adapts automatically to the size of the gradients. Its main feature is that it simultaneously considers multiple learning rates, which are weighted directly proportional to their empirical performance on the data using a new meta-algorithm. We provide three versions of MetaGrad. The full matrix version maintains a full covariance matrix and is applicable to learning tasks for which we can afford update time quadratic in the dimension. The other two versions provide speed-ups for high-dimensional learning tasks with an update time that is linear in the dimension: one is based on sketching, the other on running a separate copy of the basic algorithm per coordinate. We evaluate all versions of MetaGrad on benchmark online classification and regression tasks, on which they consistently outperform both online gradient descent and AdaGrad.
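
The idea of considering several learning rates at once and weighting them by their empirical performance can be illustrated with a small sketch. The code below is not the MetaGrad algorithm from the paper: it is a simplified stand-in that runs one plain online gradient descent copy per learning rate and combines the copies with ordinary exponential weights on linearized losses. The function name `multi_rate_ogd`, the parameter `meta_rate`, the learning-rate grid, and the quadratic toy loss are all assumptions made for this illustration.

```python
import math


def multi_rate_ogd(gradient, dim, rounds, etas, meta_rate=1.0):
    """Illustrative only: one gradient-descent copy per learning rate,
    combined by exponential weights. Not the paper's algorithm."""
    experts = [[0.0] * dim for _ in etas]   # one iterate per learning rate
    log_weights = [0.0] * len(etas)         # uniform prior, kept in log space
    w = [0.0] * dim
    probs = [1.0 / len(etas)] * len(etas)
    for _ in range(rounds):
        # Normalize the weights (subtract the max for numerical stability).
        top = max(log_weights)
        raw = [math.exp(lw - top) for lw in log_weights]
        total = sum(raw)
        probs = [r / total for r in raw]
        # Play the weighted average of the copies' iterates.
        w = [sum(p * x[j] for p, x in zip(probs, experts)) for j in range(dim)]
        g = gradient(w)  # gradient of the current loss at the played point
        for i, eta in enumerate(etas):
            # Linearized loss of copy i relative to the combined point.
            lin_loss = sum(gj * (xj - wj) for gj, xj, wj in zip(g, experts[i], w))
            log_weights[i] -= meta_rate * lin_loss
            # Plain gradient step with this copy's own learning rate.
            experts[i] = [xj - eta * gj for xj, gj in zip(experts[i], g)]
    return w, probs


# Toy usage (assumed example): minimize 0.5 * ||w - (1, -2)||^2.
w, probs = multi_rate_ogd(
    gradient=lambda w: [w[0] - 1.0, w[1] + 2.0],
    dim=2,
    rounds=500,
    etas=[0.001, 0.01, 0.1],
)
print(w, probs)
```

On this toy problem the weights shift toward the learning rates whose iterates incur lower linearized loss, which is the adaptivity the abstract describes in general terms. The method in the paper uses its own meta-algorithm and its own updates for each learning rate, so this sketch should be read only as intuition.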

Additional Metadata
Keywords	Online convex optimization, Adaptivity
Journal	Journal of Machine Learning Research
Organisation	Machine Learning
Citation	van Erven, T., Koolen-Wijkstra, W., & van der Hoeven, D. (2021). Metagrad: Adaptation using multiple learning rates in online learning. Journal of Machine Learning Research, 22, 1–61.

See Also
inProceedings MetaGrad: multiple learning rates in online learning T.A.L. van Erven (Tim) and W.M. Koolen-Wijkstra (Wouter)
