Music Similarity and Recommendation
Over the course of several years, our group developed some of the best methods for estimating the perceived acoustic similarity of two music pieces in terms of instrumentation and rhythm. Furthermore, we have developed indexing methods to find the most similar pieces to a query song in multi-million track collections in fractions of a second.
What does it do?
A music similarity algorithm reads two audio files and outputs a number giving their acoustic distance: a distance of zero indicates two identical files, small distances indicate musically similar files, and large distances indicate dissimilar files. A common use case is to find a set of files with low distance (high similarity) to a query file, to be used as personal recommendations or to continue a playlist.
How does it work?
For each music piece, the algorithm computes an abstract model of its sound: a set of numbers capturing aspects of its timbral and rhythmic qualities. Comparing the abstract models of two music pieces gives their acoustic distance.
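The idea of comparing models to obtain a distance, and of ranking a collection by that distance, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three-number models, the track names and the Euclidean distance are stand-ins, not the timbre and rhythm models or the distance measure described in the publications.

```python
import math

def distance(model_a, model_b):
    # Stand-in distance (Euclidean); zero for identical models,
    # growing as the models differ. Not the actual measure.
    return math.dist(model_a, model_b)

def most_similar(query_id, models, k):
    # Rank all other tracks by their distance to the query track
    # and return the identifiers of the k closest ones.
    query = models[query_id]
    ranked = sorted(
        (distance(query, model), track_id)
        for track_id, model in models.items()
        if track_id != query_id
    )
    return [track_id for _, track_id in ranked[:k]]

# Hypothetical precomputed models, one short list of numbers per track.
models = {
    "track_a": [0.1, 0.9, 0.3],
    "track_b": [0.1, 0.8, 0.3],
    "track_c": [0.9, 0.1, 0.7],
}

recommendations = most_similar("track_a", models, k=1)
```

Here `recommendations` contains the single track whose toy model lies closest to that of `track_a`.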
As the models are very compact, they can easily be precomputed and stored with a music collection. To handle commercial-scale collections, a two-step process first finds candidates for highly similar items to a query using efficient approximations of the models, then refines the selection down to the number of requested results.
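A two-step process of this kind is often called filter-and-refine. The sketch below shows only the general pattern, under stated assumptions: the cheap approximation (comparing just the first two numbers of each model), the Euclidean full distance, and all names and values are hypothetical and do not reflect the approximations used in our indexing methods.

```python
import heapq
import math

def full_distance(a, b):
    # Stand-in for the exact (more expensive) model comparison.
    return math.dist(a, b)

def approx_distance(a, b, dims=2):
    # Stand-in for a cheap approximation: compare only the
    # first `dims` numbers of each model.
    return math.dist(a[:dims], b[:dims])

def filter_and_refine(query, collection, k, n_candidates):
    # Step 1 (filter): keep the n_candidates items that look
    # closest under the cheap approximation.
    candidates = heapq.nsmallest(
        n_candidates,
        collection.items(),
        key=lambda item: approx_distance(query, item[1]),
    )
    # Step 2 (refine): rank only those candidates with the full
    # distance and return the k best identifiers.
    refined = heapq.nsmallest(
        k, candidates, key=lambda item: full_distance(query, item[1])
    )
    return [track_id for track_id, _ in refined]

# Hypothetical toy collection of four-number models.
collection = {
    "t1": [0.0, 0.0, 0.0, 0.0],
    "t2": [0.1, 0.0, 0.9, 0.9],
    "t3": [0.2, 0.1, 0.0, 0.0],
    "t4": [0.9, 0.9, 0.0, 0.0],
}
query = [0.0, 0.0, 0.0, 0.0]
results = filter_and_refine(query, collection, k=2, n_candidates=3)
```

In this toy data, `t2` looks close under the approximation but is pushed out by the refinement step, which is the reason for requesting more candidates than final results.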
Our methods have been developed over the course of several doctoral theses and define the state of the art, achieving top ranks in the “Audio Music Similarity” task of the MIREX evaluation campaign every year from 2009 to 2014 (there were no contestants in 2015).
Please see the publications of Pohle et al. and Seyerlehner et al. for details on the model and distance computation, and Schnitzer et al. and Schlüter for details on large-scale applications.
What is it good for?
This technology can be used for sound-based music recommendation, intelligent playlist generation or catalogue browsing.
In contrast to approaches based on usage data (“other users listening to ABC also listen to XYZ”), our method can find songs in the tail end of a catalogue that nobody has ever purchased or listened to. Furthermore, as it does not require any metadata or annotations, it is applicable to any set of audio files including a user's personal music collection.
Examples: Our technology has been deployed as an automatic DJ in a high-end sound system by Bang & Olufsen, termed the MOTS feature (“more of the same”): When activated, the system continues to play music from a user's collection similar to what she started with. Furthermore, we have provided an interactive browser for the FM4 Soundpark, a living collection of songs of Austrian newcomer bands hosted by a public radio station based in Vienna. Please see our commercial showcases for additional information.
How can I get it?
In conjunction with his doctoral thesis at the OFAI, Dominik Schnitzer has published an open-source (MPL 2.0 licensed) library for audio music similarity computation called Musly. OFAI actively participates in the development process of the library.
The open-source edition of Musly comes with two basic similarity measures that work reasonably well for simple purposes or prototyping, but do not take rhythmic information into account and are not fast enough to handle millions of music tracks in practice. OFAI offers commercial plugins for Musly that address both limitations. As our plugins integrate smoothly with Musly, you can seamlessly upgrade from the basic similarity measures to one of our two plugins:
- PS implements our best similarity measure. Unlike many other measures, it includes both a timbre and a rhythm component. Its indexed variant is able to handle very large music collections: we use it on a standard PC with 2.5 million songs.
- BLF1024 is a very fast music similarity measure. Its features require only 1024 bits of storage per music piece. By using the Hamming distance to compare them, it offers extremely fast similarity computation while still delivering very good music similarity quality.
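To illustrate why a Hamming distance over 1024-bit features is so cheap to compute, here is a minimal sketch. It makes no claim about how BLF1024 derives its bits from the audio: the random signatures, the seed and all names are assumptions chosen only to show the comparison step.

```python
import random

N_BITS = 1024  # feature size per music piece: 1024 bits = 128 bytes

def hamming_distance(sig_a, sig_b):
    # XOR leaves a 1 wherever the two signatures differ;
    # counting those ones gives the Hamming distance.
    return bin(sig_a ^ sig_b).count("1")

# Hypothetical signatures stored as Python integers. Random bits
# stand in for real features, which are not reproduced here.
rng = random.Random(0)
sig_a = rng.getrandbits(N_BITS)
sig_b = sig_a ^ 0b1011  # copy of sig_a with three bits flipped

d_same = hamming_distance(sig_a, sig_a)
d_near = hamming_distance(sig_a, sig_b)
```

The comparison boils down to a bitwise XOR followed by a bit count, operations that most processors execute in very few instructions per machine word.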