SUPERMAGOv2: Protein Function Prediction via Transformer Embeddings and Bitscore-Weighted Features
Author(s) -
Gabriel Bianchin De Oliveira,
Helio Pedrini,
Zanoni Dias
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3596851
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Sequencing technologies have advanced considerably in recent years, leading to the sequencing of a vast number of proteins through laboratory methods. However, the functional annotation of these proteins has not kept pace with sequencing efforts, creating a significant gap between sequenced proteins and those with known functions. To address this challenge, computational approaches based solely on amino acid sequence features have been developed to improve functional predictions. In this study, we introduce two novel approaches, one based on machine learning and another using an ensemble of machine learning with local alignment. Our machine learning-based model (SUPERMAGOv2) utilizes transformer-based backbones to extract features from multiple layers, which are then processed by six multilayer perceptrons that incorporate a novel bitscore-weighted input derived from DIAMOND alignments, and by an image classification model that converts the extracted feature vectors into images. Furthermore, we present SUPERMAGOv2+, an ensemble model that combines SUPERMAGOv2 with enhanced DIAMOND-based predictions. In addition, we introduce SUPERMAGOv2+Web, a lightweight web server version of SUPERMAGOv2+. Both proposed methods consistently outperform state-of-the-art approaches across various analyses, establishing themselves as leading methodologies for protein function classification based on amino acid sequences.
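The abstract does not spell out how bitscores are turned into weights or how the ensemble is combined, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes each query protein has a list of DIAMOND hits as (subject ID, bitscore) pairs, that each term is scored by the bitscore-normalized share of hits annotated with it, and that the ensemble is a simple weighted average. The function names, the `alpha` parameter, and the toy identifiers are all hypothetical.

```python
from collections import defaultdict


def bitscore_weighted_scores(hits, annotations):
    """Score each function term by the share of total bitscore that supports it.

    hits: list of (subject_id, bitscore) pairs for one query protein
          (assumed to be parsed from DIAMOND tabular output).
    annotations: dict mapping subject_id -> set of annotated terms.
    """
    total = sum(bitscore for _, bitscore in hits)
    if total <= 0:
        return {}  # no usable alignment evidence for this query
    scores = defaultdict(float)
    for subject_id, bitscore in hits:
        for term in annotations.get(subject_id, ()):
            scores[term] += bitscore / total
    return dict(scores)


def combine(model_scores, alignment_scores, alpha=0.5):
    """Weighted average of model and alignment scores (alpha is an assumed knob)."""
    terms = set(model_scores) | set(alignment_scores)
    return {
        term: alpha * model_scores.get(term, 0.0)
        + (1.0 - alpha) * alignment_scores.get(term, 0.0)
        for term in terms
    }


# Toy example with made-up subject IDs and bitscores.
hits = [("P1", 300.0), ("P2", 100.0)]
annotations = {
    "P1": {"GO:0003674", "GO:0005515"},
    "P2": {"GO:0003674"},
}
alignment_scores = bitscore_weighted_scores(hits, annotations)
ensemble_scores = combine({"GO:0005515": 0.25}, alignment_scores, alpha=0.5)
```

In this sketch a term supported by every hit receives a score of 1, while a term supported only by weaker hits receives proportionally less, so high-bitscore homologs dominate the alignment-based prediction.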
