New perspective on the convergence to a global solution of finite-sum optimization
Deep neural networks have shown great success in many machine learning tasks. Training them is challenging because the loss surface of the network architecture is generally non-convex, or even non-smooth. We propose a reformulation of the minimization problem that allows for a new recursive algorithmic framework. Under bounded-style assumptions, we prove convergence to an \epsilon-(global) minimum using O(1/\epsilon^3) gradient computations. Our theoretical foundation motivates further study, implementation, and optimization of the new algorithmic framework, as well as further investigation of its non-standard bounded-style assumptions.
IBM Research, Thomas J. Watson Research Center, USA
Nguyen, L. M., Tran, T. H., & van Dijk, M. E. (2022). New perspective on the convergence to a global solution of finite-sum optimization. In INFORMS Annual Meeting.