Open Access
Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions
Author(s) - Grímur Hjörleifsson Eldjárn, Andrew Ramsay, Justin J. J. van der Hooft, Katherine Duncan, Sylvia Soldatou, Juho Rousu, Rónán Daly, Joe Wandy, Simon Rogers
Publication year - 2021
Publication title - PLOS Computational Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.628
H-Index - 182
eISSN - 1553-7358
pISSN - 1553-734X
DOI - 10.1371/journal.pcbi.1008920
Subject(s) - bottleneck, computer science, ranking (information retrieval), data mining, metabolomics, computational biology, genomics, task (project management), software, association rule learning, machine learning, bioinformatics, biology, genome, gene, genetics, engineering, programming language, systems engineering, embedded system
Specialised metabolites from microbial sources are well known for their wide range of biomedical applications, particularly as antibiotics. When mining paired genomic and metabolomic data sets, establishing links between Biosynthetic Gene Clusters (BGCs) and metabolites is a promising way of finding novel chemistry. However, because detailed biosynthetic knowledge is lacking for the majority of predicted BGCs and the number of possible BGC-metabolite combinations is large, this is not a simple task. The problem is becoming more pressing as paired omics data sets become increasingly available. Current tools are not effective at identifying valid links automatically, and manual verification is a considerable bottleneck in natural product research. We demonstrate that using multiple link-scoring functions together makes it easier to prioritise true links relative to others. By standardising a commonly used score, we obtain a new, more effective score, and we introduce a further novel score based on an Input-Output Kernel Regression approach. Finally, we present NPLinker, a software framework to link genomic and metabolomic data. Results are verified using publicly available data sets that include validated links.
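
The abstract does not say which score is standardised or how. As a purely illustrative sketch, the snippet below assumes the raw score is the number of strains that contain both a BGC and a metabolite, and standardises it as a z-score against a hypergeometric null model (strains containing the metabolite drawn at random). The function names, the choice of raw score, and the null model are all assumptions made for this example; they are not taken from the NPLinker code or API.

```python
from math import sqrt


def standardised_overlap(n_strains, n_bgc, n_spec, n_shared):
    """Z-score of a shared-strain count (hypothetical helper, not NPLinker API).

    n_strains: total number of strains in the data set
    n_bgc:     strains that contain the BGC
    n_spec:    strains in which the metabolite is observed
    n_shared:  strains that have both (the assumed raw score)
    """
    if n_strains < 2:
        raise ValueError("need at least two strains")
    p = n_bgc / n_strains
    # Mean and variance of a hypergeometric count: n_spec draws without
    # replacement from n_strains strains, of which n_bgc carry the BGC.
    mean = n_spec * p
    var = n_spec * p * (1 - p) * (n_strains - n_spec) / (n_strains - 1)
    if var == 0:
        return 0.0  # overlap is fully determined, so it carries no evidence
    return (n_shared - mean) / sqrt(var)


def rank_links(candidates, n_strains):
    """Sort (name, n_bgc, n_spec, n_shared) tuples by standardised score."""
    scored = [
        (name, standardised_overlap(n_strains, n_bgc, n_spec, n_shared))
        for name, n_bgc, n_spec, n_shared in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data: the "common" link has the larger raw overlap (8 vs 5), but
    # both of its objects occur in nearly every strain.
    toy = [("common", 9, 9, 8), ("specific", 5, 5, 5)]
    for name, z in rank_links(toy, n_strains=10):
        print(f"{name}: z = {z:.2f}")
```

In this toy case the raw overlap would favour the link between two objects that are present almost everywhere, whereas the standardised score favours the link whose co-occurrence is unlikely by chance. That is one way in which standardisation can make a score more useful for ranking, under the assumptions stated above.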
