Decentralised federated learning with adaptive partial gradient aggregation

Open Access

Author(s) - Jiang Jingyan, Hu Liang

Publication year - 2020

Publication title - CAAI Transactions on Intelligence Technology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.613

H-Index - 15

ISSN - 2468-2322

DOI - 10.1049/trit.2020.0082

Subject(s) - computer science, stochastic gradient descent, federated learning, gradient descent, distributed learning, node (physics), convergence (economics), latency (audio), adaptive learning, distributed computing, rate of convergence, bandwidth (computing), artificial intelligence, computer network, artificial neural network, engineering, telecommunications, psychology, pedagogy, channel (broadcasting), economics, economic growth, structural engineering

Federated learning aims to collaboratively train a machine learning model across possibly geo‐distributed workers, a setting that is inherently communication constrained. To achieve communication efficiency, conventional federated learning algorithms let each worker reduce its communication frequency by training the model locally for multiple iterations. However, the conventional federated learning architecture, inherited from the parameter server design, relies on highly centralised topologies and large node‐to‐server bandwidths, and its convergence depends on local stochastic gradient descent training; together, these usually cause large end‐to‐end training latency in real‐world federated learning scenarios. In this study, the authors therefore propose the adaptive partial gradient aggregation method (FedPGA), a decentralised federated learning approach that operates at the level of partial gradients. In FedPGA, a partial gradient exchange mechanism makes full use of node‐to‐node bandwidth to shorten communication time. In addition, an adaptive model updating method speeds up convergence by adaptively increasing the step size along the stable directions of gradient descent. Experimental results on various datasets show that training time is reduced by up to 14× compared with the baselines, without accuracy degradation.
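
The abstract does not give the FedPGA update rules, so the following is only a minimal Python sketch of the two ideas it names, not the authors' algorithm. It makes two assumptions. First, "partial gradient exchange" is illustrated by having peers average one contiguous slice of the gradient per round while the other coordinates stay local. Second, a "stable direction" is taken to be a coordinate whose gradient sign matches the previous step, which is a sign-agreement heuristic swapped in for illustration. All names (`split_indices`, `partial_exchange`, `adaptive_step`, `grow`, `shrink`) are hypothetical.

```python
from typing import List, Tuple

Vector = List[float]


def split_indices(dim: int, parts: int) -> List[range]:
    """Split coordinates 0..dim-1 into `parts` contiguous, near-equal slices."""
    base, extra = divmod(dim, parts)
    slices, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


def partial_exchange(grads: List[Vector], round_idx: int) -> List[Vector]:
    """One peer-to-peer round (assumed scheme, not taken from the paper).

    Only one slice of the gradient is averaged across all nodes; every
    other coordinate keeps its local value, so each node sends roughly
    dim / n values instead of the full gradient.
    """
    n = len(grads)
    dim = len(grads[0])
    chosen = split_indices(dim, n)[round_idx % n]
    out = [list(g) for g in grads]  # copy so the inputs are not mutated
    for j in chosen:
        mean = sum(g[j] for g in grads) / n
        for g in out:
            g[j] = mean
    return out


def adaptive_step(
    weights: Vector,
    grad: Vector,
    prev_grad: Vector,
    steps: Vector,
    grow: float = 1.2,    # hypothetical growth factor
    shrink: float = 0.5,  # hypothetical shrink factor
) -> Tuple[Vector, Vector]:
    """Per-coordinate update with a sign-agreement heuristic (an assumption).

    If the gradient sign agrees with the previous step, the direction is
    treated as stable and its step size grows; if the sign flips, the
    step size shrinks. Returns the new weights and the new step sizes.
    """
    new_weights, new_steps = [], []
    for w, g, pg, s in zip(weights, grad, prev_grad, steps):
        if g * pg > 0:
            s *= grow
        elif g * pg < 0:
            s *= shrink
        new_steps.append(s)
        new_weights.append(w - s * g)
    return new_weights, new_steps


if __name__ == "__main__":
    # Two nodes, four-dimensional gradients: round 0 averages the first slice.
    local_grads = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
    print(partial_exchange(local_grads, round_idx=0))
    print(adaptive_step([0.0, 0.0], [1.0, -1.0], [1.0, 1.0], [0.1, 0.1]))
```

In this sketch the slice rotates with `round_idx`, so every coordinate is averaged once every `n` rounds. A real decentralised deployment would also need a peer topology and a transport layer, which the abstract does not describe.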
