Balanced clustering

Summary

Balanced clustering is a special case of clustering where, in the strictest sense, cluster sizes are constrained to or , where is the number of points and is the number of clusters.[1] A typical algorithm is balanced k-means, which minimizes mean square error (MSE). Another type of balanced clustering called balance-driven clustering has a two-objective cost function that minimizes both the imbalance and the MSE. Typical cost functions are ratio cut[2] and Ncut.[3] Balanced clustering can be used for example in scenarios where freight has to be delivered to locations with cars. It is then preferred that each car delivers to an equal number of locations.

Software edit

There exists implementations for balanced k-means[4] and Ncut[5]

References edit

  1. ^ M. I. Malinen and P. Fränti (August 2014). "Balanced K-Means for Clustering". Structural, Syntactic, and Statistical Pattern Recognition. Lecture Notes in Computer Science. Vol. 8621. pp. 32–41. doi:10.1007/978-3-662-44415-3_4. ISBN 978-3-662-44414-6.
  2. ^ L. Hagen and A. B. Kahng (1992). "New spectral methods for ratio cut partitioning and clustering". IEEE Transactions on Computer-Aided Design. 11 (9): 1074–1085. doi:10.1109/43.159993.
  3. ^ J. Shi and J. Malik (2000). "Normalized cuts and image segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 22 (8): 888–905. doi:10.1109/34.868688.
  4. ^ M. I. Malinen and P. Fränti. "Balanced k-Means implementation". University of Eastern Finland.
  5. ^ T. Cour, S. Yu and J. Shi. "Ncut implementation". University of Pennsylvania.

Levin, M. Sh. (2017). "On Balanced Clustering (Indices, Models, Examples)". Journal of Communications Technology and Electronics. 62 (12): 1506–1515. doi:10.1134/S1064226917120105. S2CID 255277095.