Reinforcement Learning for Adaptive Caching With Dynamic Storage Pricing
Author(s) -
Alireza Sadeghi,
Fatemeh Sheikholeslami,
Antonio G. Marqués,
Georgios B. Giannakis
Publication year - 2019
Publication title -
IEEE Journal on Selected Areas in Communications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.986
H-Index - 236
eISSN - 1558-0008
pISSN - 0733-8716
DOI - 10.1109/jsac.2019.2933780
Subject(s) - computer science , reinforcement learning , cache , solver , markov decision process , dynamic pricing , edge device , dynamic programming , distributed computing , cloud computing , mathematical optimization , computer network , algorithm , markov process , machine learning , statistics , mathematics , marketing , business , programming language , operating system
Small base stations (SBs) of fifth-generation (5G) cellular networks are envisioned to have storage devices to locally serve requests for reusable and popular contents by *caching* them at the edge of the network, close to the end users. The ultimate goal is to shift part of the predictable load on the back-haul links from on-peak to off-peak periods, contributing to better overall network performance and service experience. To enable the SBs with efficient *fetch-cache* decision-making schemes operating in dynamic settings, this paper introduces simple but flexible generic time-varying fetching and caching costs, which are then used to formulate a constrained minimization of the aggregate cost across files and time. Since caching decisions per time slot influence the content availability in future slots, the novel formulation for optimal fetch-cache decisions falls into the class of dynamic programming. Under this generic formulation, first by considering stationary distributions for the costs and file popularities, an efficient reinforcement learning-based solver known as the value iteration algorithm can be used to solve the emerging optimization problem. Later, it is shown that practical limitations on cache capacity can be handled using a particular instance of the generic dynamic pricing formulation. Under this setting, to provide a light-weight online solver for the corresponding optimization, the well-known reinforcement learning algorithm, $Q$-learning, is employed to find optimal fetch-cache decisions. Numerical tests corroborating the merits of the proposed approach wrap up the paper.
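To illustrate the kind of formulation the abstract describes, the following is a minimal, hypothetical sketch of tabular $Q$-learning for a single-file fetch-cache decision problem. It is not the paper's actual algorithm: the state is simply whether the file is cached, the action is whether to keep it cached for the next slot, and the per-slot costs (`c_fetch`, `c_store`) and request probability `p_req` are illustrative assumptions standing in for the paper's time-varying fetching/caching prices and popularity model.

```python
import random

def q_learning_cache(p_req=0.9, c_fetch=1.0, c_store=0.1,
                     alpha=0.1, gamma=0.9, slots=20000, eps=0.1, seed=0):
    """Toy tabular Q-learning for a one-file fetch-cache MDP (illustrative only).

    State s: 1 if the file is currently cached at the SB, else 0.
    Action a: 1 to have the file cached in the next slot, else 0.
    Costs are assumptions, not the paper's: reactive fetch when a request
    misses the cache, proactive fetch when caching an absent file, and a
    per-slot storage price while the file is kept.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]   # Q[s][a], minimizing discounted cost
    s = 0
    for _ in range(slots):
        # epsilon-greedy action selection (greedy = lowest estimated cost)
        if rng.random() < eps:
            a = rng.randrange(2)
        else:
            a = min((0, 1), key=lambda x: Q[s][x])
        requested = rng.random() < p_req
        cost = 0.0
        if requested and s == 0:
            cost += c_fetch          # reactive fetch over the back-haul
        if a == 1 and s == 0 and not requested:
            cost += c_fetch          # proactive fetch to fill the cache
        if a == 1:
            cost += c_store          # storage price for the coming slot
        s_next = a                   # caching decision sets the next state
        # Q-learning temporal-difference update (cost-minimization form)
        Q[s][a] += alpha * (cost + gamma * min(Q[s_next]) - Q[s][a])
        s = s_next
    return Q
```

With a popular file (`p_req=0.9`) and cheap storage, the learned $Q$-values favor keeping the file cached in both states, matching the intuition that caching shifts predictable load off the back-haul; raising `c_store` relative to `c_fetch * p_req` flips the decision.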