A Non‐metric Procedure for Transforming Dissimilarities to Euclidean Distances Useful in Numerical Taxonomy and Ecology
Author(s) - Lefkovitch L. P.
Publication year - 1989
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.4710310505
Subject(s) - mathematics , euclidean geometry , euclidean distance , distance matrices in phylogeny , principal component analysis , graph , multidimensional scaling , combinatorics , curse of dimensionality , metric (unit) , hierarchical clustering , similarity (geometry) , set (abstract data type) , pattern recognition (psychology) , statistics , artificial intelligence , cluster analysis , computer science , geometry , image (mathematics) , operations management , economics , programming language
Because (a) in agglomerative cluster analysis, it is the smallest dissimilarities which are used, (b) in numerical ecology, little or no information is given about the resemblance between two localities if there are very few (or no) species in common, (c) for a given dissimilarity, the number of ways two objects can differ in the states shown by their attributes is small for small dissimilarities, and disproportionately larger for larger dissimilarities, (d) the major axes in principal coordinates analysis are determined by the largest dissimilarities, which from (a–c) are the least informative, and (e) the ‘horseshoe’ pattern often shown by two‐dimensional scalings of dissimilarities, it is concluded that the largest dissimilarities should be replaced by values determined from the smallest. The proposal is to retain the dissimilarities corresponding with the edges in the relative neighbourhood graph, and to replace the remaining by the shortest paths on the dissimilarity‐weighted graph. The new dissimilarities, which are linearly related with the retained subset, are shown to be Euclidean, and usually the effective dimensionality is reduced in comparison with the original set. The internal disposition of distinct subsets of objects is little affected, but any distinct subsets tend to be further separated. By contrast, if the smallest dissimilarities represent random differences, while the larger subset represents those which are systematic, it is the latter which should be retained. The notion of the relative external graph is introduced, and the procedures required to obtain this, and to replace the deleted empirical smallest dissimilarities, are briefly described. However, this results in an increase in the effective dimension, changes in the internal disposition of distinct subsets, and reduction in their separation. Two numerical examples, based on empirical data, illustrate some of the consequences of the transformations.
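
The two steps named in the abstract (retain the relative neighbourhood graph edges, then replace the remaining dissimilarities by shortest paths) can be illustrated with a minimal sketch. It is not the paper's own procedure, whose details are not given here. It assumes the usual definition of the relative neighbourhood graph, in which an edge (i, j) is kept unless some third object k is closer to both i and j than they are to each other, and it uses the Floyd-Warshall algorithm for the shortest paths. The function names and the toy matrix are hypothetical.

```python
import math

def relative_neighbourhood_edges(d):
    """Return the (i, j) pairs, i < j, kept in the relative neighbourhood graph.

    Assumed definition: (i, j) is dropped if some k has
    max(d[i][k], d[j][k]) < d[i][j].
    """
    n = len(d)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            blocked = any(
                max(d[i][k], d[j][k]) < d[i][j]
                for k in range(n) if k != i and k != j
            )
            if not blocked:
                edges.add((i, j))
    return edges

def rng_shortest_path_dissimilarities(d):
    """Keep the retained edges and replace every other dissimilarity by the
    shortest path length on the dissimilarity-weighted graph."""
    n = len(d)
    # Start with no edges, then add the retained ones with their weights.
    g = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i, j in relative_neighbourhood_edges(d):
        g[i][j] = g[j][i] = float(d[i][j])
    # Floyd-Warshall all-pairs shortest paths.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][k] + g[k][j] < g[i][j]:
                    g[i][j] = g[i][k] + g[k][j]
    return g

# Hypothetical toy data: the large dissimilarity between objects 0 and 2
# violates the triangle inequality and is not a retained edge.
d = [
    [0, 1, 5],
    [1, 0, 1],
    [5, 1, 0],
]
new_d = rng_shortest_path_dissimilarities(d)
print(new_d[0][2])  # 2.0, the path 0 -> 1 -> 2
```

In this toy case the largest dissimilarity (5) is replaced by the sum of the two retained small ones (2), while the retained values are unchanged. The sketch does not check whether the resulting matrix is Euclidean.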
