Total ancestry measure: quantifying the similarity in tree-like classification, with genomic applications
Author(s) - Haiyuan Yu, Ronald Jansen, Gustavo Stolovitzky, Mark Gerstein
Publication year - 2007
Publication title - Computer Applications in the Biosciences
Language(s) - English
Resource type - Journals
eISSN - 1460-2059
pISSN - 0266-7061
DOI - 10.1093/bioinformatics/btm291
Subject(s) - measure (data warehouse), similarity (geometry), similarity measure, tree (set theory), artificial intelligence, pattern recognition (psychology), mathematics, computer science, computational biology, biology, data mining, combinatorics, image (mathematics)
Many classifications of protein function, such as Gene Ontology (GO), are organized as directed acyclic graphs (DAGs). In these classifications, the proteins are terminal leaf nodes, and the categories 'above' them are functional annotations at various levels of specialization. Computing a numerical measure of relatedness between two arbitrary proteins is an important proteomics problem. Analogous problems arise in other areas of large-scale information organization, for example the Wikipedia online encyclopedia and the Yahoo and DMOZ web page classification schemes.
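
To make the setting concrete, the following is a minimal Python sketch of the kind of structure the abstract describes: a small DAG in which proteins are leaf nodes and a category may have more than one parent. The node names, the `PARENTS` mapping, and the helper functions are all hypothetical. The Jaccard overlap of ancestor sets is used only as a simple stand-in for "a numerical measure of relatedness"; it is not the total ancestry measure defined in the paper.

```python
# Hypothetical toy classification: each node maps to its set of parents.
# Because this is a DAG rather than a tree, a node may have several parents.
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "transport": {"root"},
    "kinase activity": {"metabolism", "signaling"},
    "P1": {"kinase activity"},               # proteins are the leaf nodes
    "P2": {"kinase activity", "transport"},
    "P3": {"transport"},
}


def ancestors(node, parents=PARENTS):
    """Return every category reachable by following parent edges upward."""
    seen = set()
    stack = list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def shared_ancestors(a, b, parents=PARENTS):
    """Return the categories that lie above both a and b."""
    return ancestors(a, parents) & ancestors(b, parents)


def jaccard_relatedness(a, b, parents=PARENTS):
    """Stand-in score: shared ancestors divided by all ancestors of either node."""
    anc_a = ancestors(a, parents)
    anc_b = ancestors(b, parents)
    union = anc_a | anc_b
    return len(anc_a & anc_b) / len(union) if union else 0.0


if __name__ == "__main__":
    for a, b in [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]:
        print(a, b, sorted(shared_ancestors(a, b)), jaccard_relatedness(a, b))
```

In this toy graph, P1 and P2 share a specific category ("kinase activity") and so score higher than P1 and P3, which share only the root. A measure designed for real annotation DAGs would also need to account for how many proteins each shared category contains, which this stand-in ignores.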