Open Access
Pitfalls of ascertainment biases in genome annotations—computing comparable protein domain distributions in eukarya
Author(s) - Arli Aditya Parikesit, Lydia Steiner, Peter F. Stadler, Sonja J. Prohaska
Publication year - 2014
Publication title - Malaysian Journal of Fundamental and Applied Sciences
Language(s) - English
Resource type - Journals
ISSN - 2289-599X
DOI - 10.11113/mjfas.v10n2.57
Subject(s) - normalization (sociology), annotation, computer science, genome, domain (mathematical analysis), correlation, computational biology, gene, biology, genetics, mathematics, artificial intelligence, mathematical analysis, sociology, anthropology, geometry
Most investigations into the large-scale patterns of protein evolution are based on gene annotations that have been compiled in reference databases. The use of these resources for quantitative comparisons, however, is complicated by sometimes vast differences in coverage. More importantly, we also observe substantial ascertainment biases that cannot be removed by simple normalization procedures. A striking example is provided by the correlations between protein domains. We observe that statistics derived from different computational gene annotation procedures show dramatic discrepancies, and even qualitative changes from negative to positive correlation, when compared to statistics obtained from annotation databases.
