Improving the generative performance of chemical autoencoders through transfer learning

Open Access

Author(s) - Nicolae C. Iovanac, Brett M. Savoie

Publication year - 2020

Publication title - Machine Learning: Science and Technology

Language(s) - English

Resource type - Journals

ISSN - 2632-2153

DOI - 10.1088/2632-2153/abae75

Subject(s) - property (philosophy), computer science, machine learning, artificial intelligence, generative grammar, transfer of learning, generative model, task (project management), set (abstract data type), representation (politics), class (philosophy), philosophy, management, epistemology, politics, economics, political science, law, programming language

Generative models are a sub-class of machine learning models that are capable of generating new samples with a target set of properties. In chemical and materials applications, these new samples might be drug targets, novel semiconductors, or catalysts constrained to exhibit an application-specific set of properties. Given their potential to yield high-value targets from otherwise intractable design spaces, generative models are currently under intense study with respect to how predictions can be improved through changes in model architecture and data representation. Here we explore the potential of multi-task transfer learning as a complementary approach to improving the validity and property specificity of molecules generated by such models. We have compared baseline generative models trained on a single property prediction task against models trained on additional ancillary prediction tasks and observe a generic positive impact on the validity and specificity of the multi-task models. In particular, we observe that the validity of generated structures is strongly affected by whether or not the models have chemical property data, as opposed to only syntactic structural data, supplied during learning. We demonstrate this effect in both interpolative and extrapolative scenarios (i.e., where the generative targets are poorly represented in training data) for models trained to generate high-energy structures and models trained to generate structures with targeted bandgaps within certain ranges. In both instances, the inclusion of additional chemical property data improves the ability of models to generate valid, unique structures with increased property specificity. This approach requires only minor alterations to existing generative models, in many cases leveraging prediction frameworks already native to these models. Additionally, the transfer learning strategy is complementary to ongoing efforts to improve model architectures and data representation and can foreseeably be stacked on top of these developments.
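
The abstract does not give the model architecture or loss function, so the following is only a minimal sketch of how a multi-task objective for a chemical autoencoder might be assembled: a string reconstruction term plus a weighted property-prediction term, where the property vector can hold just the primary target (the single-task baseline) or the primary target together with ancillary properties. All names here (`reconstruction_loss`, `property_loss`, `multitask_loss`, `property_weight`) are hypothetical and chosen for illustration, as is the use of `None` to mark a missing property label.

```python
import math


def reconstruction_loss(token_probs, target_tokens):
    """Mean per-token cross-entropy for one decoded string (e.g. SMILES).

    token_probs[i] is the predicted distribution over the vocabulary at
    position i; target_tokens[i] is the index of the correct token.
    """
    losses = [-math.log(max(probs[t], 1e-12))
              for probs, t in zip(token_probs, target_tokens)]
    return sum(losses) / len(losses)


def property_loss(predicted, observed):
    """Mean squared error over the properties that have a label.

    Assumption: a missing label is encoded as None and is skipped, so
    molecules with partial property data can still contribute.
    """
    pairs = [(p, o) for p, o in zip(predicted, observed) if o is not None]
    if not pairs:
        return 0.0
    return sum((p - o) ** 2 for p, o in pairs) / len(pairs)


def multitask_loss(token_probs, target_tokens, pred_props, true_props,
                   property_weight=1.0):
    """Reconstruction term plus a weighted property-prediction term."""
    return (reconstruction_loss(token_probs, target_tokens)
            + property_weight * property_loss(pred_props, true_props))


# Toy example: a two-token string over a two-symbol vocabulary.
token_probs = [[0.5, 0.5], [0.25, 0.75]]
target_tokens = [0, 1]

# Entry 0 is the primary property; the remaining entries are ancillary.
pred_props = [1.0, 2.0, 3.0]
true_props = [1.5, None, 3.0]

# Single-task baseline: only the primary property is supervised.
baseline = multitask_loss(token_probs, target_tokens,
                          pred_props[:1], true_props[:1])

# Multi-task variant: ancillary properties are supervised as well.
multi = multitask_loss(token_probs, target_tokens, pred_props, true_props)
```

In this sketch the baseline and multi-task variants differ only in the length of the property vector passed to the same loss, which mirrors the abstract's remark that the approach needs only minor alterations to an existing generative model. How the property terms are weighted against reconstruction is a design choice the abstract does not specify.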
