Online Learning of Noisy Functions via a Data-Regularized Gradient-Descent Approach
Author(s) -
Farzaneh Tatari,
Ramin Esmzad,
Hamidreza Modares
Publication year - 2025
Publication title -
IEEE Transactions on Systems, Man, and Cybernetics: Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.261
H-Index - 64
eISSN - 2168-2232
pISSN - 2168-2216
DOI - 10.1109/TSMC.2025.3614871
Subject(s) - signal processing and analysis, robotics and control systems, power, energy and industry applications, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, general topics for engineers
Online first-order algorithms for function identification and regression with noisy data often rely on replacing actual gradients with constructed noisy estimates. Stochastic gradient descent (SGD) and its mini-batch variant rely on a single sample or a set of randomly selected samples, respectively, to estimate the noisy gradient. For functions that are not strongly convex, SGD with a constant learning rate converges only sublinearly. In this article, we show that the strong convexity requirement is satisfied for time-varying regressors if a strong data richness condition, namely the persistence of excitation (PE) condition, holds on the collected data. To improve convergence under easy-to-verify data requirements, an online data-regularized concurrent learning-based SGD (CL-based SGD) with a fixed learning rate is then presented for function approximation with noisy data. First, instead of randomly selecting a mini-batch of data, as mini-batch SGD does, a fixed-size memory of past experiences is repeatedly used in the update law along with the current streaming data. Then, a data selection strategy is used to provide probabilistic convergence guarantees with a substantially improved convergence rate (i.e., linear instead of sublinear) to a narrow bound. We finally leverage Lyapunov theory to provide probabilistic guarantees that the parameters converge exponentially fast to a probabilistic ultimate bound, provided that a rank condition on the stored data is satisfied. It is shown that both the ultimate bound and the exponential convergence to a bounded error region with high probability depend on the condition number of the recorded data matrix. This analysis shows how the quality of the memory data affects the ultimate bound and can reduce the effects of the noise variance on the error bounds. Simulation examples verify the effectiveness of the presented learning approach.
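The abstract describes two ingredients: an update law that augments the current streaming sample's gradient with gradients computed on a fixed-size memory of recorded data (concurrent learning), and a data selection rule tied to the conditioning of the stored regressor matrix. The sketch below illustrates that idea for a linear-in-the-parameters model y = theta^T phi(x) + noise. It is a minimal interpretation of the abstract, not the paper's actual algorithm; the function names (cl_sgd_step, maybe_store), the learning rate, the memory size, and the condition-number heuristic are all illustrative assumptions.

```python
import numpy as np

def cl_sgd_step(theta, phi_t, y_t, memory, lr=0.01):
    """One concurrent-learning SGD step (illustrative sketch, not the paper's update law).

    theta  : current parameter estimate, shape (d,)
    phi_t  : current regressor vector, shape (d,)
    y_t    : current noisy measurement (scalar)
    memory : list of (phi_k, y_k) recorded past experiences (fixed size)
    """
    # Gradient of the squared prediction error on the current streaming sample.
    grad = (theta @ phi_t - y_t) * phi_t
    # Data-regularizing gradients from the recorded memory (reused at every step).
    for phi_k, y_k in memory:
        grad += (theta @ phi_k - y_k) * phi_k
    return theta - lr * grad

def maybe_store(memory, phi_t, y_t, max_size=20):
    """Assumed data-selection heuristic: keep the subset of samples whose
    regressor matrix has the smallest condition number."""
    candidate = memory + [(np.asarray(phi_t), float(y_t))]
    if len(candidate) <= max_size:
        return candidate
    # Drop the sample whose removal yields the best-conditioned regressor matrix.
    Phi = np.stack([p for p, _ in candidate])
    best, best_cond = candidate[:-1], np.inf
    for i in range(len(candidate)):
        cond = np.linalg.cond(np.delete(Phi, i, axis=0))
        if cond < best_cond:
            best_cond = cond
            best = [s for j, s in enumerate(candidate) if j != i]
    return best
```

A usage loop would call maybe_store on each incoming (phi_t, y_t) pair and then cl_sgd_step with the current memory; the abstract's analysis suggests that keeping the memory well conditioned (full rank, small condition number) is what drives the exponential convergence and tightens the ultimate error bound.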