Open Access
Adaptive Stochastic Variance Reduction for Subsampled Newton Method with Cubic Regularization
Author(s) -
Junyu Zhang,
Lin Xiao,
Shuzhong Zhang
Publication year - 2022
Publication title -
INFORMS Journal on Optimization
Language(s) - English
Resource type - Journals
eISSN - 2575-1492
pISSN - 2575-1484
DOI - 10.1287/ijoo.2021.0058
Subject(s) - hessian matrix, variance reduction, mathematics, sublinear function, regularization (linguistics), reduction (mathematics), newton's method, mathematical optimization, combinatorics, computer science, statistics, monte carlo method, nonlinear system, physics, geometry, artificial intelligence, quantum mechanics
The cubic regularized Newton method of Nesterov and Polyak has become increasingly popular for nonconvex optimization because of its ability to find an approximate local solution with a second-order guarantee and its low iteration complexity. Several recent works extend this method to the setting of minimizing the average of N smooth functions by replacing the exact gradients and Hessians with subsampled approximations. It has been shown that the total Hessian sample complexity per iteration can be made sublinear in N by leveraging stochastic variance reduction techniques. We present an adaptive variance reduction scheme for a subsampled Newton method with cubic regularization and show that the expected Hessian sample complexity is O(N + N^{2/3} ε^{-3/2}) for finding an (ε, √ε)-approximate local solution (in terms of first- and second-order guarantees, respectively). Moreover, we show that the same Hessian sample complexity is retained with fixed sample sizes if exact gradients are used. The techniques of our analysis differ from those of previous works in that we do not rely on high-probability bounds based on matrix concentration inequalities. Instead, we derive and utilize new bounds on the third and fourth order moments of the average of random matrices, which are of independent interest.
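To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' adaptive algorithm, of a cubic-regularized Newton iteration that uses an SVRG-style variance-reduced Hessian estimator (snapshot Hessian plus a subsampled correction) together with exact gradients, as in the fixed-sample-size variant mentioned above. The objective, the sample size, the snapshot schedule, the cubic parameter M, and helper names such as cubic_subproblem are all illustrative assumptions, not taken from the paper.

```python
# Sketch of a variance-reduced subsampled cubic-regularized Newton method.
# Assumed toy objective: f(x) = (1/N) * sum_i [ 0.5 x'A_i x + b_i'x + c_i * sum_j cos(x_j) ].
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 5
A = rng.standard_normal((N, d, d))
A = 0.5 * (A + A.transpose(0, 2, 1))      # symmetric quadratic components
b = rng.standard_normal((N, d))
c = 0.5 * rng.standard_normal(N)

def grad_i(i, x):                         # gradient of the i-th component
    return A[i] @ x + b[i] - c[i] * np.sin(x)

def hess_i(i, x):                         # Hessian of the i-th component (x-dependent)
    return A[i] - c[i] * np.diag(np.cos(x))

def full_grad(x):
    return np.mean([grad_i(i, x) for i in range(N)], axis=0)

def cubic_subproblem(g, H, M, iters=300, lr=0.05):
    # Approximately minimize m(s) = g's + 0.5 s'Hs + (M/6)||s||^3 by gradient descent.
    s = np.zeros_like(g)
    for _ in range(iters):
        s -= lr * (g + H @ s + 0.5 * M * np.linalg.norm(s) * s)
    return s

def vr_cubic_newton(x0, epochs=5, inner=10, M=10.0, batch=20):
    x = x0.copy()
    for _ in range(epochs):
        x_snap = x.copy()
        # Full Hessian at the snapshot point (recomputed once per epoch).
        H_snap = np.mean([hess_i(i, x_snap) for i in range(N)], axis=0)
        for _ in range(inner):
            g = full_grad(x)              # exact gradient, per the fixed-sample variant
            S = rng.choice(N, size=batch, replace=False)
            # Variance-reduced Hessian estimate: snapshot Hessian + subsampled correction.
            H = H_snap + np.mean([hess_i(i, x) - hess_i(i, x_snap) for i in S], axis=0)
            x = x + cubic_subproblem(g, H, M)
    return x

x_out = vr_cubic_newton(np.zeros(d))
print("final gradient norm:", np.linalg.norm(full_grad(x_out)))
```

The point of the correction term is that its variance scales with the distance from the snapshot point, which is what allows the Hessian sample size per iteration to stay sublinear in N; the adaptive sample-size rule analyzed in the paper is not reproduced here.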