Iterative Shannon Entropy – a Methodology to Quantify the Information Content of Value Range Dependent Data Distributions. Application to Descriptor and Compound Selectivity Profiling
Author(s) -
Wassermann Anne Mai,
Vogt Martin,
Bajorath Jürgen
Publication year - 2010
Publication title -
Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201000029
Subject(s) - entropy (arrow of time), selectivity, profiling (computer programming), information theory, joint entropy, computer science, data mining, biological system, mathematics, principle of maximum entropy, statistics, pattern recognition (psychology), artificial intelligence, chemistry, thermodynamics, biology, physics, biochemistry, operating system, catalysis
We introduce an entropy-based methodology, Iterative Shannon Entropy (ISE), to quantify the information contained in molecular descriptors and compound selectivity data sets while taking data spread directly into account. The method can be applied to determine the information content of any value-range-dependent data distribution. An analysis of descriptor information content has been carried out to explore alternative binning schemes for entropy calculation. Using this entropic measure, we have profiled 153 compound selectivity data sets for combinations of 68 target proteins belonging to 10 target families. With the ISE measure, we aim to assign high information content to compound data sets that span a wide range of selectivity values and different selectivity relationships, and hence correspond to more than one biological phenotype. Target families with high average entropy scores are identified; for members of these families, active compounds display highly differentiated selectivity profiles.
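The core idea the abstract describes, quantifying the information content of a value-range-dependent distribution via Shannon entropy over binned values, can be illustrated with a minimal sketch. Note this is a plain binned Shannon entropy, not a reproduction of the paper's iterative refinement of the binning scheme; the function names and the choice of equal-width binning with a fixed bin count are assumptions for illustration only.

```python
import math
from collections import Counter

def binned_shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of a value distribution after
    equal-width binning over the observed value range.

    Simplified illustration only: the paper's Iterative Shannon
    Entropy (ISE) additionally refines the binning iteratively,
    which is not reproduced here.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        return 0.0  # no value spread -> no information content
    width = (hi - lo) / n_bins
    # Assign each value to a bin; clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def information_content(values, n_bins=10):
    """Entropy normalized to [0, 1] by the maximum log2(n_bins),
    so distributions with different bin counts are comparable."""
    return binned_shannon_entropy(values, n_bins) / math.log2(n_bins)
```

A data set whose values spread evenly across the range scores near 1, while values concentrated in a few bins score low, matching the abstract's goal of rewarding data sets that span a wide range of selectivity values.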
