
Descriptors of atoms and structure information for predicting properties of crystalline materials
Author(s) - Jonggul Lee, Jung Hoon Shin, Tae-Wook Ko, SeungHee Lee, Hyunju Chang, YunKyong Hyon
Publication year - 2021
Publication title - Materials Research Express
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.383
H-Index - 35
ISSN - 2053-1591
DOI - 10.1088/2053-1591/abe2d5
Subject(s) - interpretability, computer science, representation (politics), exploit, graph, artificial intelligence, crystal structure prediction, machine learning, field (mathematics), convolutional neural network, set (abstract data type), data mining, theoretical computer science, crystal structure, mathematics, programming language, chemistry, computer security, politics, pure mathematics, political science, law, crystallography
Machine learning (ML) is attracting growing interest in the design of new materials. However, exploiting an ML model in this field remains challenging because its performance depends strongly on the representation of the materials, the target properties, and the amount of data. In this study, we systematically compare two state-of-the-art frameworks for predicting the properties of crystalline materials: Crystal Graph Convolutional Neural Networks (CGCNNs) and the Sure Independence Screening and Sparsifying Operator (SISSO). The key advantage common to both models is that neither requires painstakingly handcrafted descriptors built from simple material properties. The main differences are that (1) CGCNN uses structural information for compounds of arbitrary size, and (2) CGCNN offers limited interpretability, whereas SISSO yields simple, analytic descriptor-property relations. Using these two ML algorithms, we evaluate prediction performance on three target properties of crystalline compounds in the Materials Project (MP) database: band gap, formation energy, and elasticity. Moreover, to improve property prediction without human bias in the selection of initial atomic features for the CGCNNs, we use Atom2Vec, which provides atom representations learned in an unsupervised manner from the materials. We also perform the predictions with training sets of different sizes to investigate the data-size dependency of the predictive models. The algorithms showed different prediction performance depending on the amount of data, the use of structural information, and the ability to identify the best descriptor along with its interpretability. These results should help researchers in materials discovery make appropriate choices and gain insight when attempting to improve the prediction of crystalline materials’ properties.
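The data-size dependency study mentioned in the abstract can be illustrated with a minimal sketch: fit a model on nested training subsets of increasing size and score each fit on the same held-out test set. Everything below is an assumption made for illustration. The data are synthetic (a single descriptor with a noisy linear target), the model is a plain least-squares line standing in for CGCNN or SISSO, and the helper names `fit_line`, `mean_absolute_error`, `learning_curve`, and `make_data` are hypothetical. It does not reproduce the paper's models, data, or results.

```python
import random


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mean_absolute_error(model, xs, ys):
    """Mean absolute error of the fitted line on (xs, ys)."""
    a, b = model
    return sum(abs(a * x + b - y) for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train, test, sizes):
    """Return [(n, test MAE)] for models fitted on the first n training rows."""
    test_x, test_y = zip(*test)
    curve = []
    for n in sizes:
        xs, ys = zip(*train[:n])
        model = fit_line(xs, ys)
        curve.append((n, mean_absolute_error(model, test_x, test_y)))
    return curve


def make_data(rng, n):
    """Synthetic stand-in data: one descriptor x and a noisy linear target."""
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        rows.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.5)))
    return rows


rng = random.Random(0)          # fixed seed so the run is repeatable
train = make_data(rng, 1000)
test = make_data(rng, 200)      # held-out set, identical for every size
sizes = [10, 100, 1000]
curve = learning_curve(train, test, sizes)

for n, mae in curve:
    print(f"train size {n:5d}: test MAE = {mae:.3f}")
```

Keeping the test set fixed while only the training subset grows means any change in error can be attributed to the amount of training data rather than to a different evaluation sample.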