Quality Issues with Public Domain Chemogenomics Data
Author(s) - Kalliokoski Tuomo, Kramer Christian, Vulpetti Anna
Publication year - 2013
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201300051
Subject(s) - computer science, public domain, key (lock), domain (mathematical analysis), data science, data quality, quality (philosophy), data mining, computer security, engineering, mathematical analysis, metric (unit), philosophy, operations management, theology, mathematics, epistemology
Abstract - The key concept in chemogenomics is the similarity principle, which states that similar ligands should bind similar targets. Chemogenomic analysis requires large amounts of data as well as powerful algorithms and computers. The data can either be compiled from open sources or produced in-house, as is often done in the pharmaceutical industry. The chemogenomic modeller often has to resort to mixing activity values from different laboratories, and even from different assay types, to make the analysis possible. The amount of chemogenomics data available in the public domain has increased dramatically in recent years, allowing fully traceable analysis on a continuously increasing scale. However, some warning flags about data quality have been raised, and because the primary data determine the accuracy of chemogenomic analysis, data quality is one of the key questions in chemogenomics. This mini-review discusses some of the most common issues with public domain biological data related to chemogenomic analysis. Errors in the data can originate from problems with the experiments themselves and their interpretation, or from more mundane issues such as data extraction and annotation. These issues are not unique to any one database; they are shared by all the public domain databases and can plague commercial and in-house bioactivity databases as well.
