Approximate clustering in very large relational data
Author(s) -
Bezdek James C.,
Hathaway Richard J.,
Huband Jacalyn M.,
Leckie Christopher,
Kotagiri Ramamohanarao
Publication year - 2006
Publication title -
International Journal of Intelligent Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.20162
Subject(s) - cluster analysis , remainder , literal (mathematical logic) , extension (predicate logic) , mathematics , relational database , fuzzy clustering , relation (database) , data mining , correlation clustering , object (grammar) , computer science , sampling (signal processing) , set (abstract data type) , pattern recognition (psychology) , algorithm , artificial intelligence , arithmetic , computer vision , filter (signal processing) , programming language
Different extensions of fuzzy c‐means (FCM) clustering have been developed to approximate FCM clustering in very large (VL; unloadable) image (eFFCM) and object vector (geFFCM) data. Both extensions share three phases: (1) progressive sampling of the VL data, terminated when a sample passes a statistical goodness of fit test; (2) clustering with (literal or exact) FCM; and (3) noniterative extension of the literal clusters to the remainder of the data set. This article presents a comparable method for the remaining case of interest, namely, clustering in VL relational data. We will propose and discuss each of the four phases of eNERF, our algorithm for this last case: (1) finding distinguished features that monitor progressive sampling, (2) progressively sampling a square N × N relation matrix R_N until an n × n sample relation R_n passes a statistical test, (3) clustering R_n with literal non‐Euclidean relational fuzzy c‐means, and (4) extending the clusters in R_n to the remainder of the relational data. The extension phase in this third case is not as straightforward as it was in the image and object data cases, but our numerical examples suggest that eNERF has the same approximation qualities that eFFCM and geFFCM do. © 2006 Wiley Periodicals, Inc. Int J Int Syst 21: 817–841, 2006.
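The sample, cluster, and extend pattern described in the abstract can be illustrated with a minimal Python sketch. It is not the eNERF algorithm itself, and it makes several simplifying assumptions: a single uniform random sample stands in for the progressive sampling and statistical test of phases (1) and (2); plain relational fuzzy c-means on squared Euclidean dissimilarities stands in for the non-Euclidean variant of phase (3), with only a small positive floor on distances in place of any non-Euclidean correction; and the extension in phase (4) is just one further membership update computed from each object's dissimilarities to the sampled objects, which may differ from the article's extension. All function names and parameters here are hypothetical.

```python
import numpy as np

def memberships_from_distances(d, m=2.0, eps=1e-12):
    """Standard FCM membership update; d has shape (c, k)."""
    d = np.maximum(d, eps)  # floor only; NOT a non-Euclidean correction
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

def relational_distances(R_rows, R_n, V):
    """d[i, k] = (R_rows @ v_i)[k] - 0.5 * v_i^T R_n v_i.

    R_rows: (k, n) dissimilarities from k objects to the n sampled objects.
    R_n:    (n, n) dissimilarities among the sampled objects.
    V:      (c, n) normalized cluster weight vectors over the sample.
    """
    cross = V @ R_rows.T                                 # shape (c, k)
    within = 0.5 * np.einsum("in,nm,im->i", V, R_n, V)   # shape (c,)
    return cross - within[:, None]

def weights(U, m):
    """Normalize U**m row-wise to get the cluster weight vectors."""
    W = U ** m
    return W / W.sum(axis=1, keepdims=True)

def rfcm(R_n, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Relational fuzzy c-means on a sample relation (Euclidean case)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, R_n.shape[0]))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(iters):
        V = weights(U, m)
        U_new = memberships_from_distances(
            relational_distances(R_n, R_n, V), m)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, weights(U, m)

def sample_cluster_extend(R, n, c, m=2.0, seed=0):
    """Cluster an n-object sample of R, then extend to all N objects."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(R.shape[0], size=n, replace=False)  # assumed: plain random sample
    R_n = R[np.ix_(idx, idx)]
    _, V = rfcm(R_n, c, m=m, seed=seed)
    # Noniterative extension: needs only the N x n block R[:, idx].
    U = memberships_from_distances(
        relational_distances(R[:, idx], R_n, V), m)
    return U, idx
```

In this sketch the extension step reads only the N × n block of dissimilarities between all objects and the sampled ones, so the full N × N relation never has to be processed at once; the indexing `R[:, idx]` is used purely for brevity on an in-memory array.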
