Predictive Modeling of Multi-Label Enzyme Substrate using Machine Learning
Author(s) -
Sana Tariq,
Muhammad Umer Sarwar,
Muhammad Kashif Hanif,
Muhammad Irfan Khan
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3615796
Subject(s) - aerospace, bioengineering, communication, networking and broadcast technologies, components, circuits, devices and systems, computing and processing, engineered materials, dielectrics and plasmas, engineering profession, fields, waves and electromagnetics, general topics for engineers, geoscience, nuclear engineering, photonics and electrooptics, power, energy and industry applications, robotics and control systems, signal processing and analysis, transportation
Enzymes play a critical role in catalyzing biochemical reactions, and accurately predicting their substrate specificity is essential for applications in drug discovery, metabolic engineering, and biotechnology. Traditional experimental methods, however, are time-consuming and resource-intensive, and existing computational approaches often predict Enzyme Commission (EC) numbers as coarse proxies for substrate specificity, which fails to capture the many-to-many relationships between enzymes and their substrates. In this study, we present a hybrid deep learning framework for predicting enzyme-substrate interactions. The proposed model integrates ProtT5 protein language model embeddings with extended-connectivity fingerprints (ECFP4) and 3D molecular descriptors through a dedicated geometric attention mechanism that dynamically weights features based on interaction patterns. The key findings demonstrate that the proposed approach achieves state-of-the-art performance, with an overall accuracy of 88.2% and an F1-score of 0.823 on the BRENDA benchmark dataset. The model is particularly strong at predicting interactions for promiscuous enzymes (18.7% of the dataset), improving recall by 22% compared to EC-based prediction methods. The hybrid attention mechanism provides a 9.2% AUC improvement over static feature concatenation. The proposed methodology can be extended to other multi-label classification problems in biological systems, paving the way for future advances in computational biology.
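To make the contrast between attention-weighted fusion and static feature concatenation concrete, the following is a minimal NumPy sketch of the general idea. It is not the authors' geometric attention mechanism, whose details the abstract does not give. The feature sizes, the random placeholder weights, the single scoring vector `w_score`, and the function name `fuse_and_predict` are all illustrative assumptions; real inputs would come from a ProtT5 encoder and a cheminformatics toolkit rather than from random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes for illustration only; none are taken from the paper.
D_PROT, D_FP, D_3D = 1024, 2048, 10   # protein embedding, ECFP4 bits, 3D descriptors
D_MODEL, N_LABELS = 64, 5             # shared hidden size, number of substrate labels


def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Random placeholder parameters; a trained model would learn these.
W_proj = {
    "protein": rng.normal(0, 0.02, (D_PROT, D_MODEL)),
    "ecfp4": rng.normal(0, 0.02, (D_FP, D_MODEL)),
    "desc3d": rng.normal(0, 0.02, (D_3D, D_MODEL)),
}
w_score = rng.normal(0, 0.1, D_MODEL)            # hypothetical attention scoring vector
W_out = rng.normal(0, 0.1, (D_MODEL, N_LABELS))  # multi-label output layer


def fuse_and_predict(features):
    """Project each modality, weight it per sample, and score every label."""
    names = list(W_proj)
    # Stack projected modalities: shape (batch, n_modalities, D_MODEL).
    hidden = np.stack([np.tanh(features[n] @ W_proj[n]) for n in names], axis=1)
    # One score per modality per sample, normalized to sum to 1.
    weights = softmax(hidden @ w_score, axis=1)
    # Weighted sum replaces static concatenation: shape (batch, D_MODEL).
    fused = (weights[..., None] * hidden).sum(axis=1)
    # Independent sigmoid per label, since an enzyme may accept several substrates.
    probs = sigmoid(fused @ W_out)
    return weights, probs


batch = 4
features = {
    "protein": rng.normal(size=(batch, D_PROT)),
    "ecfp4": rng.integers(0, 2, size=(batch, D_FP)).astype(float),
    "desc3d": rng.normal(size=(batch, D_3D)),
}
weights, probs = fuse_and_predict(features)
```

Because the modality weights are computed from the inputs themselves, they differ from sample to sample, whereas concatenation gives every feature block the same fixed role. The per-label sigmoid, rather than a softmax over labels, is what makes the output multi-label.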