Optimization in High Dimensions via Accelerated, Parallel, and Proximal Coordinate Descent
Author(s) -
Olivier Fercoq,
Peter Richtárik
Publication year - 2016
Publication title -
SIAM Review
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 4.683
H-Index - 120
eISSN - 1095-7200
pISSN - 0036-1445
DOI - 10.1137/16m1085905
Subject(s) - coordinate descent , gradient descent , convex function , lipschitz continuity , separable space , combinatorics , mathematical analysis , algorithm , mathematics , computer science
Abstract - We propose a new randomized coordinate descent method for minimizing the sum of convex functions, each of which depends on a small number of coordinates only. Our method (APPROX) is simultaneously Accelerated, Parallel, and PROXimal; this is the first time such a method has been proposed. In the special case when the number of processors is equal to the number of coordinates, the method converges at the rate $2\bar{\omega}\bar{L} R^2/(k+1)^2$, where $k$ is the iteration counter, $\bar{\omega}$ is a data-weighted average degree of separability of the loss function, $\bar{L}$ is the average of the Lipschitz constants associated with the coordinates and the individual functions in the sum, and $R$ is the distance of the initial point from the minimizer. We show that the method can be implemented without the need to perform full-dimensional vector operations, which is the major bottleneck of accelerated coordinate descent methods and renders them impractical. The fact that the method depends on the average degree of separability...
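To make the flavor of such an update concrete, below is a minimal sketch, assuming a lasso objective 0.5*||Ax - b||^2 + lam*||x||_1 and one coordinate updated per iteration (tau = 1). The function name approx_lasso_sketch, the data (A, b, lam), and the recomputation of the full residual at every step are illustrative assumptions, not the authors' reference implementation; indeed, a key point of the paper is that the full-dimensional work done here for clarity can be avoided.

```python
# Illustrative sketch of an accelerated proximal coordinate descent step
# in the style of APPROX, specialized to the lasso with tau = 1.
# All names below are hypothetical; this is not the paper's implementation.
import numpy as np

def soft_threshold(t, a):
    """Proximal operator of a*|.| applied to a scalar t."""
    return np.sign(t) * max(abs(t) - a, 0.0)

def approx_lasso_sketch(A, b, lam, iters=2000, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    v = (A ** 2).sum(axis=0)          # coordinate-wise Lipschitz constants of the smooth part
    x = np.zeros(n)
    z = np.zeros(n)
    theta = 1.0 / n                   # theta_0 = tau/n with tau = 1
    for _ in range(iters):
        y = (1.0 - theta) * x + theta * z
        i = rng.integers(n)           # sample one coordinate uniformly at random
        g_i = A[:, i] @ (A @ y - b)   # partial gradient; full residual recomputed for clarity only
        step = 1.0 / (n * theta * v[i])
        z_new_i = soft_threshold(z[i] - step * g_i, step * lam)
        x = y.copy()
        x[i] += n * theta * (z_new_i - z[i])   # accelerated extrapolation of the updated coordinate
        z[i] = z_new_i
        theta = 0.5 * (np.sqrt(theta ** 4 + 4 * theta ** 2) - theta ** 2)
    return x
```

As a usage sketch, approx_lasso_sketch(np.random.randn(50, 100), np.random.randn(50), lam=0.1) returns an approximate lasso solution after 2000 coordinate updates; the recursion for theta is what produces the O(1/k^2) accelerated rate quoted in the abstract.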