Open Access
Adaptive Stochastic Variance Reduction for Subsampled Newton Method with Cubic Regularization
Author(s) -
Junyu Zhang,
Lin Xiao,
Shuzhong Zhang
Publication year - 2022
Publication title -
INFORMS Journal on Optimization
Language(s) - English
Resource type - Journals
eISSN - 2575-1492
pISSN - 2575-1484
DOI - 10.1287/ijoo.2021.0058
Subject(s) - hessian matrix, variance reduction, mathematics, sublinear function, regularization (linguistics), reduction (mathematics), newton's method, mathematical optimization, combinatorics, computer science, statistics, monte carlo method, nonlinear system, physics, geometry, artificial intelligence, quantum mechanics
The cubic regularized Newton method of Nesterov and Polyak has become increasingly popular for nonconvex optimization because of its ability to find an approximate local solution with a second-order guarantee and its low iteration complexity. Several recent works extend this method to the setting of minimizing the average of N smooth functions by replacing the exact gradients and Hessians with subsampled approximations. It has been shown that the total Hessian sample complexity per iteration can be made sublinear in N by leveraging stochastic variance reduction techniques. We present an adaptive variance reduction scheme for a subsampled Newton method with cubic regularization and show that the expected Hessian sample complexity is O(N + N^{2/3} ε^{-3/2}) for finding an (ε, √ε)-approximate local solution (in terms of first- and second-order guarantees, respectively). Moreover, we show that the same Hessian sample complexity is retained with fixed sample sizes if exact gradients are used. The techniques of our analysis differ from those of previous works in that we do not rely on high-probability bounds based on matrix concentration inequalities. Instead, we derive and utilize new bounds on the third and fourth order moments of the average of random matrices, which are of independent interest.
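To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' adaptive algorithm, of a cubic-regularized Newton iteration that uses an SVRG-style variance-reduced Hessian estimator (snapshot Hessian plus a subsampled correction) together with exact gradients, as in the fixed-sample-size variant mentioned above. The objective, the sample size, the snapshot schedule, the cubic parameter M, and helper names such as cubic_subproblem are all illustrative assumptions, not taken from the paper.

```python
# Sketch of a variance-reduced subsampled cubic-regularized Newton method.
# Assumed toy objective: f(x) = (1/N) * sum_i [ 0.5 x'A_i x + b_i'x + c_i * sum_j cos(x_j) ].
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 5
A = rng.standard_normal((N, d, d))
A = 0.5 * (A + A.transpose(0, 2, 1))      # symmetric quadratic components
b = rng.standard_normal((N, d))
c = 0.5 * rng.standard_normal(N)

def grad_i(i, x):                         # gradient of the i-th component
    return A[i] @ x + b[i] - c[i] * np.sin(x)

def hess_i(i, x):                         # Hessian of the i-th component (x-dependent)
    return A[i] - c[i] * np.diag(np.cos(x))

def full_grad(x):
    return np.mean([grad_i(i, x) for i in range(N)], axis=0)

def cubic_subproblem(g, H, M, iters=300, lr=0.05):
    # Approximately minimize m(s) = g's + 0.5 s'Hs + (M/6)||s||^3 by gradient descent.
    s = np.zeros_like(g)
    for _ in range(iters):
        s -= lr * (g + H @ s + 0.5 * M * np.linalg.norm(s) * s)
    return s

def vr_cubic_newton(x0, epochs=5, inner=10, M=10.0, batch=20):
    x = x0.copy()
    for _ in range(epochs):
        x_snap = x.copy()
        # Full Hessian at the snapshot point (recomputed once per epoch).
        H_snap = np.mean([hess_i(i, x_snap) for i in range(N)], axis=0)
        for _ in range(inner):
            g = full_grad(x)              # exact gradient, per the fixed-sample variant
            S = rng.choice(N, size=batch, replace=False)
            # Variance-reduced Hessian estimate: snapshot Hessian + subsampled correction.
            H = H_snap + np.mean([hess_i(i, x) - hess_i(i, x_snap) for i in S], axis=0)
            x = x + cubic_subproblem(g, H, M)
    return x

x_out = vr_cubic_newton(np.zeros(d))
print("final gradient norm:", np.linalg.norm(full_grad(x_out)))
```

The point of the correction term is that its variance scales with the distance from the snapshot point, which is what allows the Hessian sample size per iteration to stay sublinear in N; the adaptive sample-size rule analyzed in the paper is not reproduced here.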