
Comparison of Clustering Algorithms on Air Quality Substances in Peninsular Malaysia
Author(s) -
Sitti Sufiah Atirah Rosly,
Balkiah Moktar,
Muhamad Hasbullah Mohd Razali
Publication year - 2018
Publication title -
Journal of Computing Research and Innovation
Language(s) - English
Resource type - Journals
ISSN - 2600-8793
DOI - 10.24191/jcrinn.v2i1.28
Subject(s) - cluster analysis , air quality index , cluster (spacecraft) , computer science , data mining , algorithm , air pollution , quality (philosophy) , environmental science , meteorology , geography , machine learning , chemistry , philosophy , organic chemistry , epistemology , programming language
Air quality is one of the most widely discussed environmental problems in this era of globalization. Air pollution is air contaminated by car emissions, smog, open burning, chemicals from factories, and other particles and gases. This harmful air can have adverse effects on human health and the environment. To provide information on which areas are better for residents in Malaysia, cluster analysis is used to determine which areas can be clustered together based on their air quality, as measured by several air quality substances. Monthly data from 37 monitoring stations in Peninsular Malaysia from 2013 to 2015 were used in this study. The K-Means (KM), Expectation Maximization (EM), and Density Based (DB) clustering algorithms were chosen as the techniques for the cluster analysis, carried out with the Waikato Environment for Knowledge Analysis (WEKA) tools. Results show that the K-Means clustering algorithm is the best method among the three algorithms due to its simplicity and the time taken to build the model. The output of the K-Means clustering algorithm shows that it groups the areas into two clusters, namely cluster 0 and cluster 1. Cluster 0 consists of 16 monitoring stations and cluster 1 consists of 36 monitoring stations in Peninsular Malaysia.
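To illustrate the kind of two-cluster K-Means grouping and build-time measurement described in the abstract, the following is a minimal sketch in plain Python rather than WEKA. Everything in it is an assumption made for illustration: the station readings are synthetic random values (not the study's data), the two features stand in for unnamed pollutant measurements, the function name `kmeans_two_clusters` is hypothetical, and the deterministic farthest-first seeding is a simplification that may differ from how WEKA initialises its centroids.

```python
import random
import time


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_two_clusters(points, max_iters=100):
    """Lloyd's K-Means for k = 2; returns (centroids, labels)."""
    # Deterministic farthest-first seeding (an assumption for reproducibility):
    # the first point, and the point farthest from it.
    first = points[0]
    second = max(points, key=lambda p: sq_dist(p, first))
    centroids = [first, second]
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min((0, 1), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged: centroids stopped moving
            break
        centroids = new_centroids
    return centroids, labels


# Synthetic stand-in data: 37 hypothetical stations, two made-up pollutant
# readings each, drawn from two well-separated groups.
rng = random.Random(42)
group_a = [(rng.gauss(20, 3), rng.gauss(20, 3)) for _ in range(15)]
group_b = [(rng.gauss(80, 3), rng.gauss(80, 3)) for _ in range(22)]
stations = group_a + group_b

start = time.perf_counter()
centroids, labels = kmeans_two_clusters(stations)
elapsed = time.perf_counter() - start  # rough analogue of "time taken to build the model"

print(f"cluster 0: {labels.count(0)} stations, cluster 1: {labels.count(1)} stations")
print(f"model built in {elapsed:.6f} s")
```

Timing a single run on a small synthetic set like this says little about the relative cost of the algorithms; a fair comparison would repeat the measurement for each method on the same real data.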