Technical Reports - Query Results

OFAI-TR-2012-15 (380 kB PDF file)

Local and Global Scaling Reduce Hubs in Space

Dominik Schnitzer, Arthur Flexer, Markus Schedl, Gerhard Widmer

"Hubness" has recently been identified as a general problem of high-dimensional data spaces, manifesting itself in the emergence of objects, so-called hubs, which tend to be among the k nearest neighbors of a large number of data items. As a consequence, many nearest neighbor relations in the distance space are asymmetric, that is, object y is among the nearest neighbors of x but not vice versa. The work presented here discusses two classes of methods that try to symmetrize nearest neighbor relations and investigates to what extent they can mitigate the negative effects of hubs. We evaluate local distance scaling and propose a global variant which has the advantage of being easy to approximate for large datasets and of having a probabilistic interpretation. Both local and global approaches are shown to be effective, especially for high-dimensional datasets, which are affected by high hubness. Both methods lead to a strong decrease of hubness in these datasets, while at the same time improving properties like classification accuracy. We evaluate the methods on a large number of public machine learning datasets and on synthetic data. Finally, we present a real-world application where we are able to achieve significantly higher retrieval quality.
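
To make the notions of hubs and asymmetric neighbor relations concrete, the following is a minimal sketch, not the authors' implementation. It assumes Euclidean distances and uses the skewness of the k-occurrence counts as a hubness indicator, which is a common choice but is not stated in the abstract. The function names (`k_occurrence`, `skewness`, `asymmetric_fraction`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def k_occurrence(X, k=5):
    """Return (counts, nn): counts[i] is how often point i appears among
    the k nearest neighbors of the other points; nn holds the neighbor indices.
    Assumption: Euclidean distance, brute-force computation (small data only)."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]    # indices of the k nearest neighbors
    counts = np.bincount(nn.ravel(), minlength=X.shape[0])
    return counts, nn

def skewness(counts):
    """Third standardized moment of the k-occurrence counts.
    Larger positive values suggest a few points (hubs) collect many neighbor slots."""
    c = counts - counts.mean()
    return (c ** 3).mean() / (c ** 2).mean() ** 1.5

def asymmetric_fraction(nn):
    """Fraction of neighbor relations x -> y where y is a neighbor of x
    but x is not a neighbor of y."""
    n, k = nn.shape
    is_nn = np.zeros((n, n), dtype=bool)
    is_nn[np.repeat(np.arange(n), k), nn.ravel()] = True
    return (is_nn & ~is_nn.T).sum() / is_nn.sum()

# Usage on synthetic Gaussian data (illustrative only, no results implied).
rng = np.random.default_rng(0)
for dim in (3, 50):
    X = rng.standard_normal((300, dim))
    counts, nn = k_occurrence(X, k=5)
    print(dim, skewness(counts), asymmetric_fraction(nn))
```

Comparing the printed values for the low- and high-dimensional samples gives a rough feel for how hubness and asymmetry can be measured before and after any distance rescaling is applied.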

Keywords: local and global scaling, shared near neighbors, hubness, classification, curse of dimensionality, nearest neighbor relation

Citation: Schnitzer D., Flexer A., Schedl M., Widmer G.: Local and Global Scaling Reduce Hubs in Space, Journal of Machine Learning Research, 13(Oct):2871-2902, 2012.