A simple clustering of the BioModels database using semanticSBML
Author(s) - Falko Krause, Wolfram Liebermeister
Publication year - 2009
Publication title - Nature Precedings
Language(s) - English
Resource type - Journals
ISSN - 1756-0357
DOI - 10.1038/npre.2009.3444.1
Subject(s) - computer science , cluster analysis , sbml , python (programming language) , identifier , annotation , data mining , row , set (abstract data type) , artificial intelligence , programming language , xml , markup language , operating system
The BioModels database contains biochemical network models in SBML format, in which the biochemical meaning of elements is specified by MIRIAM-compliant RDF annotations. We used these annotations to define a similarity measure for models that scores the overlap of the biochemical systems described. Based on this score, we used two-way clustering to detect groups of similar models and groups of co-occurring model elements. To recognize and compare biochemical elements, we used routines from the software semanticSBML. A Python script extracts all MIRIAM annotations (regardless of their qualifiers) using the semanticSBML annotation classes. The result is a matrix in which the rows represent the models (e.g. BioModel 001), while the columns represent specific annotations (e.g. urn:miriam:reactome:REACT_15422). A matrix element is set to 1 if an identifier occurs in a model and to 0 otherwise. This matrix was used as input to a hierarchical clustering algorithm (implemented in Matlab), and the clustered matrix was visualized. Model clustering makes it possible to detect models describing similar biochemical processes (e.g. glycolysis) and their specific common elements. This may help to find candidate models for completing a given initial model, which could then be merged using semanticSBML.
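The matrix construction and clustering described above can be illustrated with a minimal sketch. It is not the authors' script: it does not call semanticSBML, and the model names and identifiers are made up. The abstract does not name the overlap score, so the Jaccard index is used here as an assumption, and a naive single-linkage agglomerative procedure in pure Python stands in for the Matlab hierarchical clustering. The function names `build_matrix`, `jaccard`, `linkage_score`, and `cluster_rows` are hypothetical.

```python
from itertools import combinations

# Hypothetical toy input: model name -> set of annotation identifiers.
# In the paper these would come from the MIRIAM annotations of each SBML model.
annotations = {
    "model_A": {"urn:miriam:example:glucose", "urn:miriam:example:atp",
                "urn:miriam:example:pyruvate"},
    "model_B": {"urn:miriam:example:glucose", "urn:miriam:example:atp"},
    "model_C": {"urn:miriam:example:calcium"},
}


def build_matrix(annotations):
    """Rows are models, columns are identifiers; an entry is 1 if the
    identifier occurs in the model and 0 otherwise."""
    models = sorted(annotations)
    identifiers = sorted(set().union(*annotations.values()))
    matrix = [[1 if ident in annotations[m] else 0 for ident in identifiers]
              for m in models]
    return models, identifiers, matrix


def jaccard(row_a, row_b):
    """Assumed overlap score: shared entries divided by entries set in either row."""
    both = sum(a & b for a, b in zip(row_a, row_b))
    either = sum(a | b for a, b in zip(row_a, row_b))
    return both / either if either else 0.0


def linkage_score(matrix, cluster_a, cluster_b):
    """Single linkage: similarity of the most similar pair across two clusters."""
    return max(jaccard(matrix[i], matrix[j]) for i in cluster_a for j in cluster_b)


def cluster_rows(matrix, threshold):
    """Merge the two most similar clusters repeatedly until the best
    remaining similarity falls below the threshold."""
    clusters = [[i] for i in range(len(matrix))]
    while len(clusters) > 1:
        a, b = max(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage_score(matrix, clusters[p[0]], clusters[p[1]]))
        if linkage_score(matrix, clusters[a], clusters[b]) < threshold:
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]  # b > a, so index a is unaffected
    return clusters


models, identifiers, matrix = build_matrix(annotations)
row_groups = cluster_rows(matrix, threshold=0.5)
# Two-way clustering: apply the same procedure to the transposed matrix.
column_groups = cluster_rows([list(col) for col in zip(*matrix)], threshold=0.5)

print([[models[i] for i in group] for group in row_groups])
print([[identifiers[j] for j in group] for group in column_groups])
```

Clustering the rows groups models that share annotations, and clustering the columns groups identifiers that tend to occur in the same models. The threshold of 0.5 is an arbitrary choice for this toy data. A real analysis would more likely inspect the full dendrogram than cut at a fixed similarity.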