Sparse Regression in Time-Frequency Representations of Complex Audio

Monika Doerfler, Gino Velasco, Arthur Flexer, Volkmar Klien

Time-frequency representations are commonly used tools for representing audio and, in particular, music signals. From a theoretical point of view, these representations are linked to Gabor frames. Frame theory yields a convenient reconstruction method, making post-processing unnecessary. Furthermore, using dual or tight frames in the reconstruction, we may resynthesize localized components from so-called sparse representation coefficients. Sparsity of the coefficients is promoted directly by applying an ℓ1-penalization term to the coefficients. We introduce an iterative algorithm leading to sparse coefficients and demonstrate the effect of using these coefficients in several examples. In particular, we are interested in the ability of a sparsity-promoting approach to separate components with overlapping analysis coefficients in the time-frequency domain. We also apply our approach to the problem of auditory scene description, i.e., source identification in a complex audio mixture.
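To make the idea of ℓ1-penalized coefficients concrete, the following is a minimal sketch of iterative soft thresholding (ISTA), a standard scheme for this kind of problem. It is not taken from the paper and is not claimed to be the authors' algorithm. As assumptions for illustration only, a random matrix `Phi` with unit-norm columns stands in for a Gabor synthesis operator, and the names `soft_threshold`, `ista`, `lam` and `objective` are hypothetical.

```python
import numpy as np

def soft_threshold(c, tau):
    """Shrink each coefficient's magnitude by tau; small ones become exactly 0."""
    mag = np.abs(c)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return scale * c

def objective(Phi, x, c, lam):
    """0.5 * ||x - Phi c||_2^2 + lam * ||c||_1"""
    return 0.5 * np.linalg.norm(x - Phi @ c) ** 2 + lam * np.sum(np.abs(c))

def ista(Phi, x, lam, n_iter=500):
    """Iterative soft thresholding for the l1-penalized least-squares problem."""
    # A step size of 1 / ||Phi||^2 (squared spectral norm) keeps the
    # objective from increasing between iterations.
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2
    c = np.zeros(Phi.shape[1], dtype=np.result_type(Phi, x))
    for _ in range(n_iter):
        residual = x - Phi @ c
        # Gradient step on the data-fit term, then shrinkage for the l1 term.
        c = soft_threshold(c + step * (Phi.conj().T @ residual), step * lam)
    return c

# Toy usage: a signal built from three atoms of a random, redundant dictionary.
# (Assumption: a real application would use a Gabor frame instead.)
rng = np.random.default_rng(0)
Phi = rng.standard_normal((64, 128))
Phi /= np.linalg.norm(Phi, axis=0)          # unit-norm atoms
c_true = np.zeros(128)
c_true[[5, 40, 99]] = [2.0, -1.5, 1.0]
x = Phi @ c_true

lam = 0.2
c_hat = ista(Phi, x, lam)
print("nonzero coefficients:", np.count_nonzero(c_hat), "of", c_hat.size)
```

Larger values of `lam` trade reconstruction accuracy for fewer nonzero coefficients. In a time-frequency setting, the matrix products would be replaced by the analysis and synthesis operators of the chosen frame.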

Keywords: Signal Processing, Audio, Sparsity, Annotation

Citation: Doerfler M., Velasco G., Flexer A., Klien V.: Sparse Regression in Time-Frequency Representations of Complex Audio. Proceedings of the 7th Sound and Music Computing Conference (SMC'10), Barcelona, Spain, 2010.