Open Access
Extracting Predictive Representations from Hundreds of Millions of Molecules
Author(s) -
Dong Chen,
Jiaxin Zheng,
Guo-Wei Wei,
Feng Pan
Publication year - 2021
Publication title -
The Journal of Physical Chemistry Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.563
H-Index - 203
ISSN - 1948-7185
DOI - 10.1021/acs.jpclett.1c03058
Subject(s) - computer science , task (project management) , machine learning , artificial intelligence , process (computing) , virtual screening , labeled data , supervised learning , data mining , drug discovery , bioinformatics , artificial neural network , management , economics , biology , operating system
The construction of appropriate representations remains essential for molecular predictions due to the intricate complexity of molecules. Additionally, generating labeled data for supervised learning in molecular sciences is often expensive and ethically constrained, leading to challenging small and diverse data sets. In this work, we develop a self-supervised learning approach to pretrain models from over 700 million unlabeled molecules in multiple databases. The intrinsic chemical logic learned from this approach enables the extraction of predictive representations from task-specific molecular sequences in a fine-tuning process. To understand the importance of self-supervised learning from unlabeled molecules, we assemble three models with different combinations of databases. Moreover, we propose a protocol based on data traits to automatically select the optimal model for a specific task. To validate the proposed method, we consider 10 benchmarks and 38 virtual screening data sets. Extensive validation indicates that the proposed method delivers superb performance.
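
For readers unfamiliar with the workflow the abstract describes, the sketch below illustrates the general pattern of self-supervised pretraining on unlabeled molecular sequences followed by extraction of a pooled representation for downstream fine-tuning. It is a minimal illustration in PyTorch, not the authors' implementation: the character-level SMILES vocabulary, masked-token objective, model sizes, and mean pooling are all assumptions made for brevity.

```python
# Illustrative sketch only (not the paper's released code): masked-token
# self-supervised pretraining on SMILES strings with a small Transformer
# encoder, then extraction of a pooled representation for fine-tuning.
import torch
import torch.nn as nn

# Hypothetical character-level vocabulary; the paper pretrains on
# hundreds of millions of molecules drawn from multiple databases.
VOCAB = ["<pad>", "<mask>", "C", "c", "N", "n", "O", "o", "=", "#",
         "(", ")", "1", "2", "3", "F", "S", "Cl", "Br", "[", "]", "@", "H", "+", "-"]
STOI = {t: i for i, t in enumerate(VOCAB)}
PAD, MASK = STOI["<pad>"], STOI["<mask>"]

def encode(smiles, max_len=64):
    # Greedy two-character match so "Cl"/"Br" remain single tokens.
    ids, i = [], 0
    while i < len(smiles) and len(ids) < max_len:
        two = smiles[i:i + 2]
        if two in STOI:
            ids.append(STOI[two]); i += 2
        else:
            ids.append(STOI.get(smiles[i], MASK)); i += 1
    return ids + [PAD] * (max_len - len(ids))

class SmilesEncoder(nn.Module):
    def __init__(self, d_model=128, n_layers=4, n_heads=4):
        super().__init__()
        self.embed = nn.Embedding(len(VOCAB), d_model, padding_idx=PAD)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, len(VOCAB))  # pretraining head

    def forward(self, ids):
        h = self.encoder(self.embed(ids), src_key_padding_mask=(ids == PAD))
        return h, self.lm_head(h)

def mask_tokens(ids, p=0.15):
    # Masked-language-model corruption: hide ~15% of non-padding tokens.
    labels = ids.clone()
    mask = (torch.rand_like(ids, dtype=torch.float) < p) & (ids != PAD)
    labels[~mask] = -100              # ignored by cross_entropy below
    return ids.masked_fill(mask, MASK), labels

# One self-supervised pretraining step on an unlabeled SMILES batch.
model = SmilesEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
batch = torch.tensor([encode(s) for s in
                      ["CCO", "c1ccccc1", "CC(=O)Oc1ccccc1C(=O)O"]])
corrupted, labels = mask_tokens(batch)
_, logits = model(corrupted)
loss = nn.functional.cross_entropy(logits.reshape(-1, len(VOCAB)),
                                   labels.reshape(-1))
loss.backward(); opt.step()

# After pretraining, a task-specific representation can be taken as the
# mean of the encoder states and fed to a small head during fine-tuning.
with torch.no_grad():
    h, _ = model(batch)
    representation = h.mean(dim=1)    # shape: (batch, d_model)
```

In this pattern the pretraining objective needs no labels, so the encoder can absorb chemical regularities from very large unlabeled corpora; only the lightweight downstream head and, optionally, the encoder weights are adjusted on the small labeled task data.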
