Open Access

Pitfalls of ascertainment biases in genome annotations—computing comparable protein domain distributions in eukarya

Author(s) - Arli Aditya Parikesit, Lydia Steiner, Peter F. Stadler, Sonja J. Prohaska

Publication year - 2014

Publication title - Malaysian Journal of Fundamental and Applied Sciences

Language(s) - English

Resource type - Journals

ISSN - 2289-599X

DOI - 10.11113/mjfas.v10n2.57

Subject(s) - normalization (sociology), annotation, computer science, genome, domain (mathematical analysis), correlation, computational biology, gene, biology, genetics, mathematics, artificial intelligence, mathematical analysis, sociology, anthropology, geometry

Most investigations into the large-scale patterns of protein evolution are based on gene annotations that have been compiled in reference databases. The use of these resources for quantitative comparisons, however, is complicated by sometimes vast differences in coverage. More importantly, we also observe substantial ascertainment biases that cannot be removed by simple normalization procedures. A striking example is provided by the correlations between protein domains. We observe that statistics derived from different computational gene annotation procedures show dramatic discrepancies, and even qualitative changes from negative to positive correlation, when compared to statistics obtained from annotation databases.
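To make the kind of statistic discussed in the abstract concrete, the following is a minimal sketch of how a correlation between two protein domains could be computed across genomes, once per annotation source. It is not the authors' pipeline: the function names, the choice of Pearson correlation, the normalization by each genome's total domain count, and all numbers in the toy tables are assumptions made purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def normalized_frequencies(counts, domain):
    """Per-genome frequency of `domain`: its count divided by the genome's
    total domain count (one simple normalization, assumed here).
    Assumes every genome has at least one annotated domain."""
    return [
        counts[genome].get(domain, 0) / sum(counts[genome].values())
        for genome in sorted(counts)
    ]


def domain_pair_correlation(counts, domain_x, domain_y):
    """Correlation of two domains' normalized frequencies across genomes."""
    return pearson(
        normalized_frequencies(counts, domain_x),
        normalized_frequencies(counts, domain_y),
    )


# Hypothetical toy tables: genome -> {domain: count}. The numbers are invented
# and do not come from the paper or from any real annotation resource.
toy_database = {
    "genome_1": {"A": 10, "B": 40, "C": 50},
    "genome_2": {"A": 20, "B": 30, "C": 50},
    "genome_3": {"A": 30, "B": 20, "C": 50},
}
toy_pipeline = {
    "genome_1": {"A": 10, "B": 20, "C": 70},
    "genome_2": {"A": 20, "B": 40, "C": 40},
    "genome_3": {"A": 30, "B": 60, "C": 10},
}

r_database = domain_pair_correlation(toy_database, "A", "B")
r_pipeline = domain_pair_correlation(toy_pipeline, "A", "B")
print("database annotation:", r_database)
print("computational annotation:", r_pipeline)
```

On these invented numbers the two sources give correlations of opposite sign for the same domain pair even after normalization. The toy tables are constructed to show that effect and say nothing about its size or cause in real annotation data.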
