
Improving penalized regression-based clustering model in big data
Author(s) - Sarah Ghanim Mahmood Al-kababchee, Omar Saber Qasim, Zakariya Yahya Algamal
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1897/1/012036
Subject(s) - cluster analysis, computer science, data mining, regression, correlation clustering, artificial intelligence, cure data clustering algorithm, regression analysis, consensus clustering, machine learning, pattern recognition (psychology), mathematics, statistics
Clustering is a main procedure in data mining with wide application, for example in gene analysis. It separates (groups) previously unclassified data on the basis of their features. As an unsupervised learning problem, it divides the data into groups so that observations in the same group are more similar to each other than to observations in other groups. Penalized regression-based clustering is an extension of the “Sum of Norms” clustering model. In this paper, a nature-inspired algorithm is employed to improve the estimation of penalized regression-based clustering. Results from an application to real gene expression data suggest that the proposed improvement can bring a significant gain relative to other methods.
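
To make the link between the two models concrete, here is a minimal Python sketch of the objective they are commonly written with. It is an illustration under stated assumptions, not the formulation or the code used in the paper. Each observation x_i gets its own centroid u_i, the fit term is half the squared distance between them, and the penalty sums the Euclidean distances between all pairs of centroids ("Sum of Norms"). The optional truncation threshold `tau`, which caps each pairwise distance, is assumed here as one common way penalized regression-based clustering extends that penalty. The function name `son_objective` and the parameters `lam` and `tau` are hypothetical. In such a setup, `lam` and `tau` are the kind of tuning values a search procedure might select, although the abstract does not say what the nature-inspired algorithm optimizes.

```python
import numpy as np


def son_objective(X, U, lam, tau=None):
    """Evaluate a Sum-of-Norms style clustering objective.

    X   : (n, p) array of observations.
    U   : (n, p) array of per-observation centroids.
    lam : non-negative weight on the pairwise fusion penalty.
    tau : optional cap on each pairwise distance (assumed truncated
          variant); None gives the plain Sum-of-Norms penalty.
    """
    X = np.asarray(X, dtype=float)
    U = np.asarray(U, dtype=float)

    # Fit term: half the squared distance between data and centroids.
    fit = 0.5 * np.sum((X - U) ** 2)

    # Penalty term: Euclidean distance between every pair i < j of centroids.
    i, j = np.triu_indices(X.shape[0], k=1)
    dists = np.linalg.norm(U[i] - U[j], axis=1)
    if tau is not None:
        # Truncation: pairs farther apart than tau contribute only tau.
        dists = np.minimum(dists, tau)

    return fit + lam * np.sum(dists)


# Toy data: two nearby points and one distant point.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0]])

# Centroids equal to the data: zero fit error, full pairwise penalty.
value_plain = son_objective(X_demo, X_demo, lam=1.0)

# Same centroids with truncation at tau = 2: the distant pairs are capped.
value_truncated = son_objective(X_demo, X_demo, lam=1.0, tau=2.0)
```

Observations whose centroids coincide after minimization are read as belonging to the same cluster. The truncation keeps well-separated pairs from being pulled together by the penalty.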