Open Access
ChemFixer: Correcting Invalid Molecules to Unlock Previously Unseen Chemical Space
Author(s) - Jun-Hyoung Park, Ho-Jun Song, Seong-Whan Lee
Publication year - 2025
Publication title - IEEE Journal of Biomedical and Health Informatics
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 1.293
H-Index - 125
eISSN - 2168-2208
pISSN - 2168-2194
DOI - 10.1109/jbhi.2025.3593825
Subject(s) - bioengineering; communication, networking and broadcast technologies; components, circuits, devices and systems; computing and processing; signal processing and analysis
Deep learning-based molecular generation models have shown great potential in efficiently exploring vast chemical spaces by generating potential drug candidates with desired properties. However, these models often produce chemically invalid molecules, which limits the usable scope of the learned chemical space and poses significant challenges for practical applications. To address this issue, we propose ChemFixer, a framework designed to correct invalid molecules into valid ones. ChemFixer is built on a transformer architecture, pre-trained using masking techniques, and fine-tuned on a large-scale dataset of valid/invalid molecular pairs that we constructed. Through comprehensive evaluations across diverse generative models, ChemFixer improved molecular validity while effectively preserving the chemical and biological distributional properties of the original outputs. This indicates that ChemFixer can recover molecules that could not be previously generated, thereby expanding the diversity of potential drug candidates. Furthermore, ChemFixer was effectively applied to a drug-target interaction (DTI) prediction task using limited data, improving the validity of generated ligands and discovering promising ligand-protein pairs. These results suggest that ChemFixer is not only effective in data-limited scenarios, but also extensible to a wide range of downstream tasks. Taken together, ChemFixer shows promise as a practical tool for various stages of deep learning-based drug discovery, enhancing molecular validity and expanding accessible chemical space.
