Sparse Coding for Drum Sound Classification and its Use as a Similarity Measure

Scholler, Simon; Purwins, H.

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.


Title	Sparse Coding for Drum Sound Classification and its Use as a Similarity Measure
Publication Type	Conference Paper
Year of Publication	2010
Conference Name	3rd International Workshop on Machine Learning and Music (MML’10) at ACM Multimedia 2010
Authors	Scholler, S., & Purwins, H.
Abstract	Although rare in the sound recognition literature, previous work using features derived from a sparse temporal representation has led to some success. A great advantage of deriving features from a temporal representation is that such an approach does not face the trade-off between time and frequency resolution. Here, we present a biologically inspired two-step process for audio classification. In the first step, efficient basis functions are learned in an unsupervised manner [13] on mixtures of percussion sounds (drum phrases). In the second step, features are extracted by using the learned basis functions to decompose percussion sounds (bass drum, snare drum, hi-hat) with matching pursuit (MP). The classification accuracy in a 3-class database transfer task is 91.5%, as opposed to 70.7% when using MFCC features. Further, we show that an MP-feature representation preserves sound similarity to a greater extent than MFCC features, i.e. an artificial mixture of two sounds of equal energy normally lies in the middle between the two single-sound distributions in feature space. An MP representation thus inherently contains a similarity measure between different sounds.
preprint/postprint document	http://www.mtg.upf.edu/files/publications/acm_ml_sparse.pdf

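As an illustration of the second step described in the abstract, the following is a minimal sketch of greedy matching pursuit over time-shifted kernels, written in plain Python. It is not the authors' implementation. The two hand-made kernels stand in for the basis functions that the paper learns in an unsupervised manner, and the function names (`normalize`, `matching_pursuit`, `kernel_usage_features`) as well as the per-kernel feature vector are assumptions made for this example only.

```python
import math


def normalize(kernel):
    """Scale a kernel to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in kernel))
    return [v / norm for v in kernel]


def matching_pursuit(signal, kernels, n_atoms):
    """Greedy matching pursuit over all kernels and all time shifts.

    Kernels are assumed to have unit norm. Returns a list of
    (kernel_index, shift, coefficient) tuples and the final residual.
    """
    residual = list(signal)
    atoms = []
    for _ in range(n_atoms):
        best = None  # (abs_coeff, kernel_index, shift, coeff)
        for k, kernel in enumerate(kernels):
            for shift in range(len(residual) - len(kernel) + 1):
                # Inner product of the shifted kernel with the residual.
                coeff = sum(kv * residual[shift + i] for i, kv in enumerate(kernel))
                if best is None or abs(coeff) > best[0]:
                    best = (abs(coeff), k, shift, coeff)
        if best is None or best[0] == 0.0:
            break  # nothing left to explain
        _, k, shift, coeff = best
        # Subtract the selected atom from the residual.
        for i, kv in enumerate(kernels[k]):
            residual[shift + i] -= coeff * kv
        atoms.append((k, shift, coeff))
    return atoms, residual


def kernel_usage_features(atoms, n_kernels):
    """Hypothetical feature vector: summed absolute coefficient per kernel."""
    features = [0.0] * n_kernels
    for k, _, coeff in atoms:
        features[k] += abs(coeff)
    return features


# Toy placeholder kernels (NOT learned): a short "bump" and a short "edge".
kernels = [normalize([1.0, 1.0]), normalize([1.0, -1.0])]
signal = [0.0, 0.0, 2.0, 2.0, 0.0, 0.0, 3.0, -3.0]

atoms, residual = matching_pursuit(signal, kernels, n_atoms=2)
features = kernel_usage_features(atoms, len(kernels))
```

In this toy signal the strongest atom is the "edge" kernel at the end of the signal, followed by the "bump" kernel in the middle, after which the residual is numerically zero. A feature vector of this kind could then be passed to any standard classifier; which features and classifier the paper actually uses is described in the full text, not in this record.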