Tracking Control of CSTRs Based on Improved OU Noise and the TD3 Algorithm

Open Access

Author(s) - Hongyan Shi, Xiaofei Wu, Guogang Wang

Publication year - 2025

Publication title - IEEE Access

Language(s) - English

Resource type - Magazines

SCImago Journal Rank - 0.587

H-Index - 127

eISSN - 2169-3536

DOI - 10.1109/access.2025.3574730

Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation

Achieving satisfactory control accuracy in typical chemical processes is challenging, primarily because of their complex nonlinear dynamics. To address this challenge, this study proposes a novel deep reinforcement learning (DRL) approach that integrates an improved Ornstein-Uhlenbeck (IOU) noise into the twin delayed deep deterministic policy gradient (TD3) algorithm. The method is applied to the tracking control of continuous stirred tank reactors (CSTRs). First, a mechanistic model of the CSTR system is established to simulate its dynamic environment, enabling interaction between the TD3 agent and the system. To enhance exploration and speed up convergence, fractional-order characteristics and a reward feedback mechanism are introduced into the OU noise. Together they adjust the noise intensity dynamically, which improves adaptability to complex states and optimizes the exploration strategy of the TD3 algorithm. Furthermore, a well-designed reward function and optimized hyperparameters enable the agent to learn the optimal control policy efficiently, achieving high-precision tracking control of the CSTR system. Simulation results demonstrate that the proposed TD3 algorithm outperforms conventional control methods, such as PID control and nonlinear model predictive control, as well as other DRL algorithms, including SAC and PPO, in terms of control accuracy and convergence speed.
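
The abstract does not give the IOU formulation, so the following is only a minimal sketch of the general idea and not the authors' method. It implements a standard discretized Ornstein-Uhlenbeck process and scales its diffusion term with a hypothetical reward-feedback rule. The fractional-order component is omitted because its definition is not available here. The names OUNoise and reward_scale, the linear scaling rule, and all default parameter values are assumptions chosen for illustration.

```python
import math
import random


class OUNoise:
    """Standard Ornstein-Uhlenbeck noise (not the paper's IOU variant)."""

    def __init__(self, dim, theta=0.15, sigma=0.2, mu=0.0, dt=1.0, seed=0):
        self.dim = dim
        self.theta = theta    # mean-reversion rate
        self.sigma = sigma    # base noise intensity
        self.mu = mu          # long-run mean
        self.dt = dt          # discretization step
        self.rng = random.Random(seed)
        self.state = [mu] * dim

    def reset(self):
        """Return the process to its mean, e.g. at the start of an episode."""
        self.state = [self.mu] * self.dim

    def sample(self, scale=1.0):
        """One Euler-Maruyama step; `scale` multiplies the diffusion term."""
        step = scale * self.sigma * math.sqrt(self.dt)
        self.state = [
            x + self.theta * (self.mu - x) * self.dt + step * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def reward_scale(recent_reward, reward_floor, reward_ceiling, min_scale=0.1):
    """Hypothetical feedback rule: shrink the noise linearly from 1.0 to
    `min_scale` as the recent reward moves from the floor to the ceiling."""
    progress = (recent_reward - reward_floor) / (reward_ceiling - reward_floor)
    progress = min(max(progress, 0.0), 1.0)
    return 1.0 - (1.0 - min_scale) * progress


# Usage: perturb a deterministic policy output and clip to the action bounds.
noise = OUNoise(dim=1)
scale = reward_scale(recent_reward=-20.0, reward_floor=-100.0, reward_ceiling=0.0)
policy_action = 0.3  # placeholder for the actor network's output
exploratory_action = max(-1.0, min(1.0, policy_action + noise.sample(scale)[0]))
```

Under this assumed rule, exploration is widest when the agent performs poorly and narrows as tracking rewards improve. That is one plausible reading of "dynamically adjusting noise intensity", not a statement of the paper's actual mechanism.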
