
Learning an Optimal Operational Strategy for Service Life Extension of Gear Wheels with Double Deep Q Networks
Author(s) -
Tamer Tevetoglu,
Mark Henss,
Yvonne Gretzinger,
Bernd Bertsche
Publication year - 2021
Publication title -
Proceedings of the Annual Conference of the Prognostics and Health Management Society
Language(s) - English
Resource type - Journals
ISSN - 2325-0178
DOI - 10.36001/phmconf.2021.v13i1.2978
Subject(s) - torque , service life , reinforcement learning , artificial intelligence , computer science , mechanical engineering , structural engineering , engineering
One failure mechanism of gear wheels is pitting. If the gear wheel is case hardened, pitting degradation normally dominates at one tooth only. All the other teeth are still intact at the standardized end-of-life criterion of 4 % pitting area relative to the total tooth area.
Using an operational strategy developed at the Institute of Machine Components, the service life of gear wheels can be extended by a local stress reduction at the weakest tooth. This is accomplished by applying an adapted torque at the transmission input that places a torque minimum at the pre-damaged, and thus weakest, tooth. Consequently, all remaining teeth, which have a higher load-bearing capacity, are subjected to a higher torque. A prerequisite for this theoretical operational strategy is knowledge of the pitting size and position. Detecting these properties during operation is not yet state of the art.
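The following minimal sketch illustrates the idea of such an adapted torque profile: the torque is reduced at the pre-damaged tooth and redistributed to the remaining teeth so that the mean torque, and thus the transmitted power, stays the same. All numbers (tooth count, nominal torque, reduction factor, index of the weak tooth) are hypothetical illustration values, not the parameters used in the study.

```python
import numpy as np

N_TEETH = 24            # hypothetical number of teeth
T_NOMINAL = 200.0       # hypothetical nominal input torque in Nm
REDUCTION = 0.3         # hypothetical relative torque reduction at the weak tooth
WEAK_TOOTH = 7          # hypothetical index of the pre-damaged tooth

def adapted_torque_profile(n_teeth, t_nominal, weak_tooth, reduction):
    """One torque value per tooth mesh: reduced at the weak tooth,
    uniformly raised at the remaining teeth so the mean torque is preserved."""
    torque = np.full(n_teeth, t_nominal)
    torque[weak_tooth] = t_nominal * (1.0 - reduction)
    # redistribute the removed torque over the other teeth
    torque[np.arange(n_teeth) != weak_tooth] += t_nominal * reduction / (n_teeth - 1)
    return torque

profile = adapted_torque_profile(N_TEETH, T_NOMINAL, WEAK_TOOTH, REDUCTION)
assert np.isclose(profile.mean(), T_NOMINAL)   # same mean torque -> same total power
print(profile[WEAK_TOOTH], profile.mean())
```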
In this work, only the gearbox vibration signal is known, without explicit knowledge about the pitting inside. The challenge is therefore to determine the health state of each individual tooth and to choose an optimal adapted torque based on it. This is particularly difficult because pittings on a single gear wheel grow at different rates: different pittings dominate over the service life, so the torque control must be optimized continuously.
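The study lets the learning algorithm infer the tooth condition implicitly from the vibration state; the sketch below only illustrates, under assumed sampling parameters, how a revolution-synchronous vibration signal could be organized per tooth with a simple RMS indicator. It is not the authors' feature extraction.

```python
import numpy as np

N_TEETH = 24
SAMPLES_PER_REV = 2400                         # hypothetical samples per gear revolution
rng = np.random.default_rng(0)
vibration = rng.normal(size=SAMPLES_PER_REV)   # placeholder for a measured signal

def per_tooth_rms(signal, n_teeth):
    """Split one revolution into per-tooth segments and return an RMS value per tooth."""
    segments = np.array_split(signal, n_teeth)
    return np.array([np.sqrt(np.mean(seg**2)) for seg in segments])

rms = per_tooth_rms(vibration, N_TEETH)
suspected_weak_tooth = int(np.argmax(rms))     # naive cue: highest RMS segment
print(rms.round(3), suspected_weak_tooth)
```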
Algorithms of Reinforcement Learning (RL) are particularly suitable for this challenge. In this branch of Machine Learning (ML), an agent interacts with an environment and learns by receiving rewards for taking actions in given states. In this study, the environment is a gearbox simulation model, the state is the current vibration signal, and the action is the chosen adapted torque. Thus, the algorithm can learn the whole operational strategy, from online failure detection to an adapted torque at the transmission input.
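A minimal Double Deep Q Network sketch of this interaction loop is given below. The toy environment, network sizes, reward, and all hyperparameters are assumptions for illustration only: the state stands in for vibration features and the discrete actions stand in for candidate adapted torque levels at the transmission input.

```python
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 8, 5, 0.99       # assumed dimensions and discount factor

class ToyGearboxEnv:
    """Placeholder for the gearbox simulation model: random 'vibration' states
    and a dummy reward; not the simulation used in the study."""
    def reset(self):
        return np.random.randn(STATE_DIM).astype(np.float32)
    def step(self, action):
        next_state = np.random.randn(STATE_DIM).astype(np.float32)
        reward = -float(action)                 # dummy reward signal
        done = random.random() < 0.05
        return next_state, reward, done

def make_qnet():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

online, target = make_qnet(), make_qnet()
target.load_state_dict(online.state_dict())
optimizer = torch.optim.Adam(online.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)
env, epsilon = ToyGearboxEnv(), 0.1

def train_step(batch_size=32):
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2, d = map(np.array, zip(*batch))
    s, s2 = torch.tensor(s), torch.tensor(s2)
    a = torch.tensor(a, dtype=torch.int64)
    r = torch.tensor(r, dtype=torch.float32)
    d = torch.tensor(d, dtype=torch.float32)
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double DQN target: the online net selects the action, the target net evaluates it.
        best_a = online(s2).argmax(dim=1, keepdim=True)
        q_next = target(s2).gather(1, best_a).squeeze(1)
        y = r + GAMMA * (1.0 - d) * q_next
    loss = nn.functional.mse_loss(q, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

state = env.reset()
for step in range(2000):
    if random.random() < epsilon:               # epsilon-greedy torque choice
        action = random.randrange(N_ACTIONS)
    else:
        with torch.no_grad():
            action = int(online(torch.tensor(state)).argmax())
    next_state, reward, done = env.step(action)
    replay.append((state, action, reward, next_state, float(done)))
    train_step()
    state = env.reset() if done else next_state
    if step % 200 == 0:
        target.load_state_dict(online.state_dict())   # periodic target update
```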
The results of this study show the theoretical feasibility of the operational strategy using Double Deep Q Networks as the RL algorithm. The algorithm learns a suitable reaction to pittings that grow linearly or progressively at an early stage and thereby delays their growth within the defined limits. Thus, the lifetime of the gearbox is extended while maintaining the same total power of the gearbox. As an outlook, the sensitivity of the results to several influencing factors will be examined in a further study. The longer-term goal is to apply this simulation to a test rig and validate the results.