2019-01-22
Finite-Sum Smooth Optimization with SARAH
Publication
The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w) = \frac{1}{n}\sum_{i=1}^{n} f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$ for $n \leq O(\epsilon^{-2})$, where $\epsilon$ denotes the attained accuracy $\mathbb{E}[\|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the output approximation $\tilde{w}$ (Fang et al., 2018). In this paper, we provide a convergence analysis for a slightly modified version of the SARAH algorithm (Nguyen et al., 2017a;b) and achieve a total complexity that matches the lower-bound worst-case complexity in (Fang et al., 2018) up to a constant factor when $n \leq O(\epsilon^{-2})$ for nonconvex problems. For convex optimization, we propose SARAH++ with sublinear convergence for general convex problems and linear convergence for strongly convex problems; we also provide a practical version for which numerical experiments on various datasets show improved performance.
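For context, the core of SARAH is a recursive stochastic gradient estimator: each outer loop restarts from a full gradient of $F$, and each inner step updates the estimator with a single-component gradient difference. The sketch below illustrates that baseline update rule in Python/NumPy; the function names, the toy least-squares objective, and the fixed step size are illustrative assumptions, not the modified variant or SARAH++ analyzed in the paper.

```python
import numpy as np

def sarah(grad_full, grad_i, w0, n, step, inner_steps, outer_loops, seed=0):
    """Minimal SARAH sketch: recursive gradient estimator with full-gradient restarts.

    grad_full(w)  -- gradient of the full finite-sum objective F at w
    grad_i(w, i)  -- gradient of the i-th component f_i at w
    (Assumed interfaces for illustration only.)
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(outer_loops):
        # Outer loop: restart the estimator from the exact full gradient.
        w_prev = w
        v = grad_full(w_prev)
        w = w_prev - step * v
        for _ in range(inner_steps):
            i = rng.integers(n)
            # Recursive estimator: v_t = grad f_i(w_t) - grad f_i(w_{t-1}) + v_{t-1}
            v = grad_i(w, i) - grad_i(w_prev, i) + v
            w_prev, w = w, w - step * v
    return w

# Toy usage on least squares, f_i(w) = 0.5 * (a_i @ w - b_i)**2  (illustrative only).
A = np.random.default_rng(1).standard_normal((100, 5))
b = np.zeros(100)
w_out = sarah(lambda w: A.T @ (A @ w - b) / len(b),
              lambda w, i: A[i] * (A[i] @ w - b[i]),
              np.ones(5), n=100, step=0.05, inner_steps=200, outer_loops=5)
```

In this baseline form, one outer loop costs $n$ gradient evaluations for the restart plus two per inner step, which is the quantity the paper's total-complexity bounds count.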
| Additional Metadata | |
|---|---|
| IBM Research, Thomas J. Watson Research Center, USA | |
| Organisation | Cryptology |
| Citation | Nguyen, L., van Dijk, M., Phan, D., Nguyen, P. H., Weng, T.-W., & Kalagnanam, J. (2019). Finite-Sum Smooth Optimization with SARAH. |