Reinforcement learning in discrete action space applied to inverse defect design

Troy D. Loeffler, Suvo Banik, Tarak K. Patra, Michael Sternberg, Subramanian K. R. S. Sankaranarayanan

Open Access

Publication year - 2021

Publication title - Journal of Physics Communications

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.407

H-Index - 17

ISSN - 2399-6528

DOI - 10.1088/2399-6528/abe591

Subject(s) - inverse, reinforcement learning, computer science, convergence (economics), action (physics), hyperparameter, space (punctuation), genetic algorithm, tree (set theory), mathematical optimization, algorithm, artificial intelligence, mathematics, machine learning, physics, mathematical analysis, geometry, quantum mechanics, economics, economic growth, operating system

Reinforcement learning (RL) algorithms that include Monte Carlo Tree Search (MCTS) have found tremendous success in computer games such as Go, Shogi and Chess. Such learning algorithms have demonstrated super-human capabilities in navigating an exhaustive discrete action search space. Motivated by their success in computer games, we demonstrate that RL can be applied to inverse materials design problems. We deploy RL for a representative case: the optimal atomic-scale inverse design of extended defects via rearrangement of chalcogen (e.g. S) vacancies in 2D transition metal dichalcogenides (e.g. MoS₂). These defect rearrangements and their dynamics are important for tunable phase transitions in 2D materials, i.e. 2H (semiconducting) to 1T (metallic) in MoS₂. We demonstrate the ability of MCTS interfaced with a reactive molecular dynamics simulator to efficiently sample the defect phase space and perform inverse design: starting from randomly distributed S vacancies, the optimal rearrangement of defects corresponds to a line defect of S vacancies. We compare MCTS performance with evolutionary optimization, i.e. genetic algorithms (GA), and show that MCTS converges to a better solution (lower objective) in fewer evaluations than GA. We also comprehensively evaluate and discuss the effect of MCTS hyperparameters on convergence to the solution. Overall, our study demonstrates the effectiveness of RL approaches that operate in discrete action space for inverse defect design problems.
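To make the search loop described in the abstract concrete, the following is a minimal sketch of MCTS over discrete vacancy moves. It is not the authors' implementation. The paper evaluates configurations with a reactive molecular dynamics simulator, whereas this sketch substitutes a hypothetical toy energy that simply rewards vacancies sitting side by side in the same row. The lattice size, the functions `energy`, `actions`, `step` and `mcts`, the `Node` class, and all hyperparameter values are assumptions made for illustration.

```python
import math
import random

SIZE = 4  # assumption: toy SIZE x SIZE lattice of chalcogen sites


def energy(state):
    """Hypothetical toy objective (NOT a reactive MD energy): each pair of
    vacancies adjacent in the same row lowers the energy by 1."""
    return -sum((r, c + 1) in state for (r, c) in state)


def actions(state):
    """Discrete action set: move one vacancy to an empty nearest-neighbor site."""
    moves = []
    for (r, c) in sorted(state):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            new = (r + dr, c + dc)
            if 0 <= new[0] < SIZE and 0 <= new[1] < SIZE and new not in state:
                moves.append(((r, c), new))
    return moves


def step(state, move):
    """Apply a move and return the new vacancy configuration."""
    old, new = move
    return frozenset((state - {old}) | {new})


class Node:
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.untried = actions(state)  # moves not yet expanded from this node
        self.visits = 0
        self.total = 0.0  # sum of rewards backed up through this node

    def uct_child(self, c):
        """Pick the child maximizing mean reward plus the UCT exploration bonus."""
        return max(
            self.children,
            key=lambda ch: ch.total / ch.visits
            + c * math.sqrt(math.log(self.visits) / ch.visits),
        )


def mcts(root_state, iterations=2000, depth=8, c=1.0, seed=0):
    """Return the lowest-energy configuration seen and its energy."""
    rng = random.Random(seed)
    root = Node(root_state)
    best = root_state
    for _ in range(iterations):
        node, d = root, 0
        # Selection: descend through fully expanded nodes using UCT.
        while not node.untried and node.children and d < depth:
            node = node.uct_child(c)
            d += 1
        # Expansion: add one child for a randomly chosen untried move.
        if node.untried and d < depth:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(step(node.state, move), parent=node)
            node.children.append(child)
            node, d = child, d + 1
        # Rollout: random moves up to the depth limit, tracking the lowest energy.
        state = low = node.state
        for _step in range(depth - d):
            state = step(state, rng.choice(actions(state)))
            if energy(state) < energy(low):
                low = state
        if energy(low) < energy(best):
            best = low
        # Backpropagation: reward is the negated energy, so higher is better.
        reward = -energy(low)
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best, energy(best)


if __name__ == "__main__":
    start = frozenset({(0, 0), (1, 2), (2, 1), (3, 3)})  # scattered vacancies
    best_state, best_energy = mcts(start)
    print(sorted(best_state), best_energy)
```

Under this toy objective, four vacancies reach their lowest possible energy when they fill a single row, which loosely mirrors the line defect reported in the abstract. The exploration constant `c`, the rollout `depth`, and the number of `iterations` stand in for the kind of MCTS hyperparameters whose effect on convergence the paper studies. The values used here are arbitrary.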