Comparison of sequences as a method for evaluation of the molecular similarity
Author(s) -
Jerman-Blažič B.,
Fabič I.,
Randić M.
Publication year - 1986
Publication title -
Journal of Computational Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.907
H-Index - 188
eISSN - 1096-987X
pISSN - 0192-8651
DOI - 10.1002/jcc.540070211
Subject(s) - similarity (geometry) , correlation , set (abstract data type) , sequence (biology) , group (periodic table) , path (computing) , mathematics , data set , base (topology) , similitude , data mining , computer science , statistics , chemistry , artificial intelligence , mathematical analysis , geometry , biochemistry , organic chemistry , image (mathematics) , programming language
String comparison techniques were developed and applied to measure the molecular similarity of chemical structures. Each molecular structure was encoded as a sequence of numbers giving the counts of paths of different lengths. The similarity index between two compounds was calculated as the difference between the gains of information derived by comparing the corresponding molecular path sequences. The compounds in the studied database were ranked according to this similarity, and the ranks served as the basic data for deriving correspondences between the elements of the set of compounds. The method was applied to a group of 41 barbiturates. Correlation equations were calculated for subsets of compounds grouped according to the similarity they displayed. The correlation equations and the corresponding statistics were obtained using standard computer programs. A special algorithm for computing the similarity index and the correlation matrix (outlined only very briefly) was developed and implemented on a VAX 11/750.
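The path-sequence encoding described in the abstract can be illustrated with a minimal Python sketch. It assumes a hydrogen-suppressed molecular graph given as a list of bonds, counts simple (non-self-intersecting) paths of each length, and takes the number of atoms as the length-0 entry. The abstract does not give the formula for the information-based similarity index, so `sequence_distance` below is a plainly labelled stand-in (Euclidean distance between zero-padded sequences), not the authors' index. The function names `path_sequence` and `sequence_distance` are hypothetical.

```python
import math
from collections import defaultdict


def path_sequence(edges):
    """Return [p0, p1, p2, ...], where p_k is the number of simple paths
    of length k (in bonds) in an undirected graph; p0 is the atom count."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    counts = defaultdict(int)  # directed path counts, keyed by length

    def extend(node, visited, length):
        # Grow the current path by one bond in every allowed direction.
        for nxt in adj[node]:
            if nxt in visited:
                continue
            counts[length + 1] += 1
            visited.add(nxt)
            extend(nxt, visited, length + 1)
            visited.remove(nxt)

    for start in adj:
        extend(start, {start}, 0)

    longest = max(counts, default=0)
    # Every undirected path was found twice, once from each end.
    return [len(adj)] + [counts[k] // 2 for k in range(1, longest + 1)]


def sequence_distance(p, q):
    """Stand-in dissimilarity (an assumption, not the paper's index):
    Euclidean distance between zero-padded path sequences."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Carbon skeletons of n-butane (chain) and isobutane (star).
butane = path_sequence([(0, 1), (1, 2), (2, 3)])
isobutane = path_sequence([(0, 1), (0, 2), (0, 3)])
print(butane, isobutane, sequence_distance(butane, isobutane))
```

Ranking a set of compounds against a reference then amounts to sorting them by the chosen dissimilarity. Exhaustive path enumeration is adequate for small molecules such as barbiturates but grows quickly for highly cyclic graphs.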