TY - GEN
T1 - A clustering ensemble method for clustering mixed data
AU - Al-Shaqsi, Jamil
AU - Wang, Wenjia
PY - 2010
Y1 - 2010
N2 - This paper presents a clustering ensemble method based on our novel three-staged clustering algorithm. A clustering ensemble is a paradigm that seeks to best combine the outputs of several clustering algorithms with a decision fusion function to achieve a more accurate and stable final output. Our ensemble is constructed with our proposed clustering algorithm as a core modelling method that is used to generate a series of clustering results under different conditions for a given dataset. Then, a decision aggregation mechanism such as voting is employed to find a combined partition of the different clusters. The voting mechanism considers only experimental results that produce an intra-similarity value higher than the average intra-similarity value for a particular interval. The aim of this procedure is to find a clustering result that minimizes the number of disagreements between different clustering results. Our ensemble method has been tested on 11 benchmark datasets and compared with some individual methods, including TwoStep, k-means, squeezer, and k-prototype, and some ensemble-based methods, including k-ANMI, ccdByEnsemble, SIPR, and SICM. The experimental results showed its strengths over the compared clustering algorithms.
AB - This paper presents a clustering ensemble method based on our novel three-staged clustering algorithm. A clustering ensemble is a paradigm that seeks to best combine the outputs of several clustering algorithms with a decision fusion function to achieve a more accurate and stable final output. Our ensemble is constructed with our proposed clustering algorithm as a core modelling method that is used to generate a series of clustering results under different conditions for a given dataset. Then, a decision aggregation mechanism such as voting is employed to find a combined partition of the different clusters. The voting mechanism considers only experimental results that produce an intra-similarity value higher than the average intra-similarity value for a particular interval. The aim of this procedure is to find a clustering result that minimizes the number of disagreements between different clustering results. Our ensemble method has been tested on 11 benchmark datasets and compared with some individual methods, including TwoStep, k-means, squeezer, and k-prototype, and some ensemble-based methods, including k-ANMI, ccdByEnsemble, SIPR, and SICM. The experimental results showed its strengths over the compared clustering algorithms.
UR - http://www.scopus.com/inward/record.url?scp=79959461219&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959461219&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2010.5596684
DO - 10.1109/IJCNN.2010.5596684
M3 - Conference contribution
AN - SCOPUS:79959461219
SN - 9781424469178
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2010 IEEE World Congress on Computational Intelligence, WCCI 2010 - 2010 International Joint Conference on Neural Networks, IJCNN 2010
T2 - 2010 6th IEEE World Congress on Computational Intelligence, WCCI 2010 - 2010 International Joint Conference on Neural Networks, IJCNN 2010
Y2 - 18 July 2010 through 23 July 2010
ER -