Imbalance-Aware Scheduling for PV-Battery Storage Systems Using Deep Reinforcement Learning
Author(s) - Yuki Osone, Daisuke Kodaira
Publication year - 2025
Publication title - IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3615960
Subject(s) - aerospace; bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; engineered materials, dielectrics and plasmas; engineering profession; fields, waves and electromagnetics; general topics for engineers; geoscience; nuclear engineering; photonics and electrooptics; power, energy and industry applications; robotics and control systems; signal processing and analysis; transportation
The growing penetration of distributed renewable energy resources necessitates more intelligent and adaptive energy management strategies. In this paper, we propose a novel imbalance-aware control framework for photovoltaic-battery storage systems (PV-BSS) participating in day-ahead electricity markets characterized by strict penalty mechanisms, such as those in Japan. The core of the framework is a Proximal Policy Optimization (PPO)-based deep reinforcement learning (DRL) agent, which is explicitly trained to minimize imbalance penalties by embedding forecast deviations into the reward function. To enhance operational feasibility under real-world constraints, the PPO agent is complemented by a Model Predictive Control (MPC) layer that refines actions in real time based on updated forecasts and system constraints. The proposed framework integrates probabilistic PV forecasting using Lower-Upper Bound Estimation (LUBE) and electricity price prediction via multi-layer perceptron (MLP) models within a unified control loop. Through extensive simulations using actual Japanese market data, the method demonstrates a 47% reduction in imbalance penalties compared to the rule-based strategy and a 26% reduction compared to the DRL model without imbalance awareness. These results highlight the proposed method’s potential for economically efficient and regulation-compliant scheduling in dynamic and penalty-intensive electricity markets.
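The abstract states that forecast deviations are embedded into the reward function but does not give the formula. The sketch below is one plausible reading of that idea, not the authors' formulation: the per-slot reward is market revenue minus a penalty proportional to the absolute gap between scheduled and delivered energy. The names `StepOutcome`, `imbalance_aware_reward`, `plain_reward`, and the linear penalty with a single `penalty_rate` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Outcome of one market time slot (all fields are illustrative assumptions)."""
    scheduled_kwh: float  # energy committed in the day-ahead schedule
    delivered_kwh: float  # energy actually delivered by the PV-battery system
    price: float          # market price per kWh
    penalty_rate: float   # assumed penalty per kWh of imbalance


def plain_reward(o: StepOutcome) -> float:
    """Revenue-only reward, i.e. a baseline without imbalance awareness."""
    return o.price * o.delivered_kwh


def imbalance_aware_reward(o: StepOutcome) -> float:
    """Revenue minus a linear penalty on the schedule deviation (assumed form)."""
    imbalance_kwh = abs(o.delivered_kwh - o.scheduled_kwh)
    return plain_reward(o) - o.penalty_rate * imbalance_kwh


if __name__ == "__main__":
    # Hypothetical numbers: 10 kWh scheduled, 8 kWh delivered.
    outcome = StepOutcome(scheduled_kwh=10.0, delivered_kwh=8.0,
                          price=20.0, penalty_rate=30.0)
    print(plain_reward(outcome), imbalance_aware_reward(outcome))
```

Under this assumed form, an agent trained on `imbalance_aware_reward` is pushed to keep delivery close to the schedule, whereas one trained on `plain_reward` has no such incentive; the actual penalty rules of the Japanese market and the reward used in the paper may differ.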