OFAI

Technical Reports - Query Results

OFAI-TR-2015-01 (398 kB PDF file)

The Unbalancing Effect of Hubs on K-medoids Clustering in High-Dimensional Spaces

Dominik Schnitzer, Arthur Flexer

Unbalanced cluster solutions are characterized by very different cluster sizes, with some clusters being very large while others contain almost no data. We demonstrate that this phenomenon is connected to 'hubness', a recently discovered general problem of machine learning in high-dimensional data spaces. Hub objects have a small distance to an exceptionally large number of data points, while anti-hubs are far from all other data points. In an empirical study of K-medoids clustering, we show that hubness gives rise to very unbalanced cluster sizes, resulting in impaired internal and external evaluation indices. We compare three methods which reduce hubness in the distance spaces and show that evaluation indices improve as the clusters become more balanced. This is done using artificial and real data sets from diverse domains.
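
The notion of hubs and anti-hubs described above can be illustrated with a small sketch. It is not taken from the report: it assumes the common convention of counting how often each point occurs among the k nearest neighbours of the other points (its k-occurrence) and summarizing hubness as the skewness of those counts. The function names `k_occurrence` and `skewness`, the choice of Euclidean distance, and the default `k=5` are all illustrative assumptions.

```python
import numpy as np


def k_occurrence(X, k=5):
    """Count how often each row of X is among the k nearest neighbours
    of the other rows (Euclidean distance; an assumption of this sketch)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances via the expansion of ||a - b||^2.
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbours
    return np.bincount(nn.ravel(), minlength=X.shape[0])


def skewness(v):
    """Third standardized moment; returns 0.0 for constant input."""
    v = np.asarray(v, dtype=float)
    c = v - v.mean()
    var = np.mean(c ** 2)
    if var == 0.0:
        return 0.0
    return float(np.mean(c ** 3) / var ** 1.5)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for dim in (3, 100):
        X = rng.standard_normal((500, dim))
        counts = k_occurrence(X, k=5)
        # Hubs have large counts; anti-hubs have a count of zero.
        print(dim, skewness(counts), int(counts.max()), int(np.sum(counts == 0)))
```

Under these assumptions, a strongly right-skewed k-occurrence distribution indicates a few hubs with very large counts alongside many anti-hubs that are nobody's nearest neighbour.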

Keywords: Clustering, Hubness, Curse of dimensionality

Citation: Schnitzer, Dominik and Flexer, Arthur. The Unbalancing Effect of Hubs on K-medoids Clustering in High-Dimensional Spaces. In Proceedings of the International Joint Conference on Neural Networks, 2015.