
Performance Analysis Similarity Matrix, Responsibility Matrix, Availability Matrix, Criterion Matrix of Affinity Propagation
Author(s) -
Hanna Willa Dhany
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1898/1/012043
Subject(s) - cluster analysis, similarity (geometry), matrix (chemical analysis), data mining, affinity propagation, cluster (spacecraft), computer science, hierarchical clustering, matlab, process (computing), function (biology), data matrix, algorithm, mathematics, fuzzy clustering, cure data clustering algorithm, artificial intelligence, clade, biochemistry, chemistry, materials science, image (mathematics), evolutionary biology, phylogenetic tree, gene, composite material, biology, programming language, operating system
Clustering is the process of partitioning data into groups (clusters) so that items within a cluster are maximally similar and items in different clusters are minimally similar. Clustering methods fall into two main approaches: partitioning and hierarchical [1]. The Water Quality Status dataset has 8 attributes, 4 classes, and 120 instances, with a balanced class distribution: good condition (30 instances), lightly polluted (30 instances), medium polluted (30 instances), and heavily polluted (30 instances). The data are split at random, with 70% used for training and 30% for testing. To simplify the performance calculations for the clustering model, the study was implemented using MATLAB functions. With the number of iterations set between 100 and 2,500, the Affinity Propagation (AP) method produced 10 clusters. At 5,000 iterations the result changed to 9 clusters. Re-testing with 10,000 to 50,000 iterations produced no further change in the number of clusters. The study concludes that the optimal number of clusters for the AP method is 10.
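The four matrices named in the title can be illustrated with a minimal sketch of the standard Affinity Propagation update rules. This is not the paper's MATLAB implementation: it is written in Python with NumPy, the function name `affinity_propagation`, the damping factor, the median preference, and the negative squared Euclidean similarity are all assumptions, and the data are synthetic points that only mimic the shape of the Water Quality Status dataset (120 instances, 8 attributes, 4 groups of 30).

```python
import numpy as np


def affinity_propagation(X, damping=0.5, max_iter=200, preference=None):
    """Illustrative AP sketch returning the S, R, A and criterion matrices."""
    n = X.shape[0]
    rows = np.arange(n)

    # Similarity matrix S: negative squared Euclidean distance (an assumption).
    diff = X[:, None, :] - X[None, :, :]
    S = -np.sum(diff ** 2, axis=2)

    # Diagonal of S holds the "preference"; the median of the off-diagonal
    # similarities is a common default, assumed here.
    if preference is None:
        preference = np.median(S[~np.eye(n, dtype=bool)])
    np.fill_diagonal(S, preference)

    R = np.zeros((n, n))  # responsibility matrix
    A = np.zeros((n, n))  # availability matrix

    for _ in range(max_iter):
        # Responsibility: R[i,k] = S[i,k] - max over k' != k of (A[i,k'] + S[i,k']).
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        max_other = np.repeat(first[:, None], n, axis=1)
        max_other[rows, idx] = second
        R = damping * R + (1 - damping) * (S - max_other)

        # Availability: A[i,k] = min(0, R[k,k] + sum of positive R[i',k], i' not in {i,k});
        # A[k,k] = sum of positive R[i',k] over i' != k.
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new.diagonal().copy()
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new

    # Criterion matrix: each point picks the exemplar k that maximises R + A.
    C = R + A
    labels = np.argmax(C, axis=1)
    return S, R, A, C, labels


# Synthetic stand-in data (NOT the Water Quality Status dataset):
# 4 groups of 30 points with 8 attributes each.
rng = np.random.default_rng(0)
centers = rng.normal(scale=10.0, size=(4, 8))
X = np.vstack([c + rng.normal(size=(30, 8)) for c in centers])

S, R, A, C, labels = affinity_propagation(X, max_iter=200)
n_clusters = len(np.unique(labels))
print("clusters found:", n_clusters)
```

In this sketch `max_iter` plays the role of the iteration count varied in the study (100 up to 50,000). The number of clusters it reports depends on the preference value, the damping factor, and the data, so it should not be expected to reproduce the 9 or 10 clusters reported above.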