Using R Packages for Comparison of Cluster Stability

Authors

  • Dorota Rozmus, University of Economics in Katowice, Faculty of Finance and Insurance, Department of Economic and Financial Analysis

DOI:

https://doi.org/10.18778/0208-6018.330.05

Keywords:

clustering, taxonomy, stability

Abstract

The stability of clustering methods has attracted considerable attention from researchers in recent years. The central question is to what extent the structure discovered by a particular method is actually present in the data. The literature proposes a number of ways of measuring stability, and these theoretical considerations have led to software tools that implement them in practice. Such tools are available in several R packages, for example clv, clValid, fpc, ClusterStability, and pvclust. Because cluster stability has been hypothesised to indicate the right number of groups in a clustering, the main aim of this article is to compare the results of stability studies conducted with three of these packages: clv, clValid, and fpc.
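The idea that stability can point to the number of groups can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not reproduce the procedures of clv, clValid, or fpc. Everything in it is an assumption made for illustration: the toy data (three well-separated groups), the farthest-point initialisation of k-means, the 80% subsampling scheme, and the pairwise co-membership Jaccard index used as the stability score. All function names (make_blobs, kmeans, comembership_jaccard, stability) are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def make_blobs(seed=1, per_blob=20):
    """Toy data (an assumption): three well-separated square blobs."""
    rng = random.Random(seed)
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
            for cx, cy in [(0, 0), (10, 0), (0, 10)]
            for _ in range(per_blob)]


def kmeans(points, k, rng, iters=25):
    """Lloyd's k-means with farthest-point initialisation; returns labels."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centers[c] = (sum(m[0] for m in members) / len(members),
                              sum(m[1] for m in members) / len(members))
    return labels


def comembership_jaccard(labels_a, labels_b):
    """Jaccard index over point pairs grouped together in either labelling."""
    both = either = 0
    n = len(labels_a)
    for i in range(n):
        for j in range(i + 1, n):
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            both += int(same_a and same_b)
            either += int(same_a or same_b)
    return both / either if either else 1.0


def stability(points, k, runs=20, frac=0.8, seed=0):
    """Mean agreement of clusterings fitted on pairs of random subsamples."""
    rng = random.Random(seed)
    n = len(points)
    m = int(frac * n)
    scores = []
    for _ in range(runs):
        idx_a = rng.sample(range(n), m)
        idx_b = rng.sample(range(n), m)
        lab_a = dict(zip(idx_a, kmeans([points[i] for i in idx_a], k, rng)))
        lab_b = dict(zip(idx_b, kmeans([points[i] for i in idx_b], k, rng)))
        # Compare the two clusterings only on points present in both samples.
        common = sorted(set(idx_a) & set(idx_b))
        scores.append(comembership_jaccard([lab_a[i] for i in common],
                                           [lab_b[i] for i in common]))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    data = make_blobs()
    for k in (2, 3, 4):
        print(k, round(stability(data, k), 3))
```

On this toy data the score for k = 3 should be 1.0, because every subsample recovers the same three groups, while other values of k would typically score lower. Real data rarely separate this cleanly, which is one reason why different stability measures, and different packages, need not agree.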

Published

2017-11-15

How to Cite

Rozmus, D. (2017). Using R Packages for Comparison of Cluster Stability. Acta Universitatis Lodziensis. Folia Oeconomica, 4(330), [77]–86. https://doi.org/10.18778/0208-6018.330.05

Section

Articles