Technical Reports - Query Results
Your query term was 'number = 2016-10'
1 report found
- OFAI-TR-2016-10 (116kB PDF file)
An Empirical Analysis of Hubness in Unsupervised Distance-Based Outlier Detection
- Arthur Flexer
- Outlier detection is the task of automatic identification of unknown data not covered by training data (e.g. a previously unknown class in classification). We explore outlier detection in the presence of hubs and anti-hubs, i.e. data objects which appear to be either very close to or very far from most other data due to a problem of measuring distances in high dimensions. We compare a classic distance-based method to two new approaches, which have been designed to counter the negative effects of hubness, on six high-dimensional data sets. We show that mainly anti-hubs pose a problem for outlier detection and that this can be improved by using a hubness-aware approach based on re-scaling the distance space.
Keywords: Outlier detection, Hubness, Curse of dimensionality, Evaluation
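To make the terms in the abstract concrete, the following is a minimal sketch and not the report's implementation. It assumes Euclidean distance, measures hubness by the k-occurrence count (how often a point appears in the k-nearest-neighbor lists of other points; a count of zero marks an anti-hub), and uses the distance to the k-th nearest neighbor as one common distance-based outlier score. The function names and the choice of score are illustrative assumptions; the listing above does not specify which methods the report evaluates.

```python
import numpy as np


def pairwise_euclidean(X):
    """Return the full Euclidean distance matrix for the rows of X."""
    sq = np.sum(X * X, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clip tiny negatives caused by rounding
    D = np.sqrt(d2)
    np.fill_diagonal(D, 0.0)
    return D


def k_occurrence(D, k):
    """Count how often each point is among the k nearest neighbors of others.

    Large counts indicate hubs; a count of zero indicates an anti-hub.
    """
    D = D.copy()
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    nn = np.argsort(D, axis=1)[:, :k]
    return np.bincount(nn.ravel(), minlength=D.shape[0])


def kth_nn_distance(D, k):
    """Outlier score (assumed here): distance to the k-th nearest neighbor."""
    D = D.copy()
    np.fill_diagonal(D, np.inf)
    return np.sort(D, axis=1)[:, k - 1]


def demo(n=200, dim=50, k=5, seed=0):
    """Illustrative run on synthetic Gaussian data (not the report's data)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, dim))
    D = pairwise_euclidean(X)
    counts = k_occurrence(D, k)
    scores = kth_nn_distance(D, k)
    print("anti-hubs (k-occurrence == 0):", int(np.sum(counts == 0)))
    print("largest k-occurrence:", int(counts.max()))
    print("index of highest outlier score:", int(np.argmax(scores)))


if __name__ == "__main__":
    demo()
```

In this sketch an anti-hub never appears in another point's neighbor list, so a score built purely on nearest-neighbor distances can rank it as an outlier even when it belongs to a known class. The re-scaling of the distance space mentioned in the abstract is not reproduced here.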