Unsupervised Classification of Chemical Compounds

Author(s) - Guttiérrez Toscano P., Marriott F. H. C.

Publication year - 1999

Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.205

H-Index - 72

eISSN - 1467-9876

pISSN - 0035-9254

DOI - 10.1111/1467-9876.00146

Subject(s) - multidimensional scaling, cluster analysis, computer science, fingerprint (computing), data mining, pattern recognition (psychology), scaling, binary data, binary number, data set, metric (unit), cluster (spacecraft), metric space, set (abstract data type), coding (social sciences), artificial intelligence, mathematics, machine learning, statistics, discrete mathematics, engineering, operations management, geometry, arithmetic, programming language

Clustering chemical compounds of similar structure is important in the pharmaceutical industry. One way of describing the structure is the chemical 'fingerprint'. The fingerprint is a string of binary digits, and typical data sets consist of very large numbers of fingerprints; a suitable clustering procedure must take account of the properties of this method of coding, and must be able to handle large data sets. This paper describes the analysis of a set of fingerprint data. The analysis was based on an appropriate distance measure derived from the fingerprints, followed by metric scaling into a low‐dimensional space. An approximation to metric scaling, suitable for very large data sets, was investigated. Cluster analysis using two programs, mclust and AutoClass‐C, was carried out on the scaled data.
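
The first two steps of the pipeline described above (a distance derived from binary fingerprints, then metric scaling into a low-dimensional space) can be illustrated with a minimal sketch. The abstract does not name the distance measure, so the Tanimoto (Jaccard) distance used here is an assumption, chosen because it is common for binary fingerprints. The toy fingerprints and the function names `tanimoto_distance` and `metric_scaling` are hypothetical. The paper's large-data approximation and the mclust and AutoClass-C clustering steps are not reproduced.

```python
import numpy as np


def tanimoto_distance(fp):
    """Pairwise Tanimoto (Jaccard) distance between rows of a 0/1 matrix.

    Assumption: the abstract does not say which distance the authors used.
    """
    fp = np.asarray(fp, dtype=float)
    common = fp @ fp.T                       # bits set in both fingerprints
    bits = fp.sum(axis=1)                    # bits set in each fingerprint
    union = bits[:, None] + bits[None, :] - common
    # Two all-zero fingerprints have an empty union; treat them as identical.
    sim = np.divide(common, union, out=np.ones_like(common), where=union > 0)
    return 1.0 - sim


def metric_scaling(dist, n_dims=2):
    """Classical metric scaling (principal coordinates) of a distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to get an inner-product matrix.
    b = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that arise when the distance is not
    # exactly Euclidean.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical toy data: two groups of 8-bit fingerprints with no shared bits.
fingerprints = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
])

dist = tanimoto_distance(fingerprints)
coords = metric_scaling(dist, n_dims=2)
print(np.round(coords, 3))
```

The rows of `coords` are the scaled points that a clustering program would then take as input. The full eigendecomposition costs on the order of n³ operations for n compounds, which is presumably why an approximation is needed for very large data sets.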
