A unified convergence analysis for shuffling-type gradient methods

Nguyen, Lam M.; Tran-Dinh, Quoc; Phan, Dzung; Nguyen, Phuong Ha; van Dijk, Marten

L. M. Nguyen (Lam), Q. Tran-Dinh (Quoc), D.T. Phan (Dzung), P.H. Nguyen (Phuong Ha) and M.E. van Dijk (Marten)

2021-09-21

Journal of Machine Learning Research, Volume 22, p. 1-44

In this paper, we propose a unified convergence analysis for a class of generic shuffling-type gradient methods for solving finite-sum optimization problems. Our analysis works with any sampling without replacement strategy and covers many known variants such as randomized reshuffling, deterministic or randomized single permutation, and cyclic and incremental gradient schemes. We focus on two different settings: strongly convex and nonconvex problems, but also discuss the non-strongly convex case. Our main contribution consists of new non-asymptotic and asymptotic convergence rates for a wide class of shuffling-type gradient methods in both nonconvex and convex settings. We also study uniformly randomized shuffling variants with different learning rates and model assumptions. While our rate in the nonconvex case is new and significantly improved over existing works under standard assumptions, the rate on the strongly convex one matches the existing best-known rates prior to this paper up to a constant factor without imposing a bounded gradient condition. Finally, we empirically illustrate our theoretical results via two numerical examples: nonconvex logistic regression and neural network training examples. As byproducts, our results suggest some appropriate choices for diminishing learning rates in certain shuffling variants.
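To make the class of methods named in the abstract concrete, the following is a minimal sketch of a shuffling-type gradient loop, not the paper's own pseudocode. It assumes a scalar parameter, a per-step size of lr(t)/n within epoch t, and three illustrative scheme names ("random_reshuffling", "single_shuffling", "incremental"); the function name shuffling_gradient, its arguments, the toy least-squares data, and the diminishing schedule are all hypothetical choices made for illustration.

```python
import random


def shuffling_gradient(grad_i, w0, n, epochs, lr, scheme="random_reshuffling", seed=0):
    """Sketch of a shuffling-type gradient method on a sum of n components.

    grad_i(w, i): gradient of the i-th component at w (a float in this sketch).
    lr(t): learning rate for epoch t, with t starting at 1.
    scheme: how the visiting order of the components is chosen.
    """
    schemes = ("random_reshuffling", "single_shuffling", "incremental")
    if scheme not in schemes:
        raise ValueError(f"unknown scheme: {scheme!r}")

    rng = random.Random(seed)
    perm = list(range(n))          # incremental: fixed order 0, 1, ..., n-1
    if scheme == "single_shuffling":
        rng.shuffle(perm)          # one random permutation, reused every epoch

    w = w0
    for t in range(1, epochs + 1):
        if scheme == "random_reshuffling":
            rng.shuffle(perm)      # fresh permutation at every epoch
        step = lr(t) / n           # assumed scaling of the epoch learning rate
        for i in perm:             # sampling without replacement: each i once
            w = w - step * grad_i(w, i)
    return w


# Toy finite sum (hypothetical data): f_i(w) = 0.5 * (a[i] * w - b[i]) ** 2,
# whose common minimizer is w = 2.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 4.0, 6.0, 8.0]


def grad(w, i):
    return a[i] * (a[i] * w - b[i])


def diminishing(t):
    # Illustrative diminishing schedule, not a recommendation from the paper.
    return 0.1 / t ** (1.0 / 3.0)


for name in ("random_reshuffling", "single_shuffling", "incremental"):
    print(name, shuffling_gradient(grad, 0.0, len(a), 100, diminishing, scheme=name))
```

The only difference between the three schemes in this sketch is when the permutation is drawn, which is the sense in which a single analysis can cover all of them. On this toy problem every component shares the same minimizer, so all three orderings approach it; the differences studied in the paper concern rates on general problems.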

Additional Metadata
Keywords	Stochastic gradient algorithm, Shuffling-type gradient scheme, Sampling without replacement, Non-convex finite-sum minimization, Strongly convex minimization
Stakeholder	IBM Research, Thomas J. Watson Research Center, USA; eBay Inc., San Jose, CA, USA
Journal	Journal of Machine Learning Research
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Nguyen, L., Tran-Dinh, Q., Phan, D., Nguyen, P. H., & van Dijk, M. (2021). A unified convergence analysis for shuffling-type gradient methods. Journal of Machine Learning Research, 22, 1–44.

