Additional Evidence That Common Low-level Features Of Individual Audio Frames Are Not Representative Of Music Genre

Title: Additional Evidence That Common Low-level Features Of Individual Audio Frames Are Not Representative Of Music Genre
Publication Type: Conference Paper
Year of Publication: 2010
Conference Name: 7th Sound and Music Computing Conference, 2010, Barcelona, Spain
Authors: Marques, G., Lopes, M., Sordo, M., Langlois, T., & Gouyon, F.
Conference Location: Barcelona
Keywords: audio frames, codebook, hmm, low-level, music genre, svm
Abstract
The Bag-of-Frames (BoF) approach has been widely
used in music genre classification. In this approach, music
genres are represented by statistical models of low-level
features computed on short frames of the audio signal
(e.g. on the order of tens of milliseconds). In the design of such
models, a common procedure is to represent each music
genre by sets of instances (i.e. frame-based feature vectors)
inferred from training data. The common underlying
assumption is that the majority of such instances somehow
capture the (musical) specificities of each genre, and
that obtaining good classification performance is a matter
of the size of the training dataset and of fine-tuning the feature
extraction and learning algorithm parameters.
We report on extensive tests on two music databases that
contradict this assumption. We show that there is little or
no benefit in seeking a thorough representation of the feature
vectors for each class. In particular, we show that
genre classification performance is similar whether music
pieces from a number of different genres are represented
with a set of symbols derived from a single genre
or from all the genres. We conclude that our experiments
provide additional evidence for the hypothesis that common
low-level features of isolated audio frames are not representative
of music genres.
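
The comparison described in the abstract (one set of symbols derived from a single genre versus one derived from all genres) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the frame features are synthetic Gaussian vectors, the codebook is built with plain k-means, and the names `build_codebook`, `symbol_histogram` and `synthetic_piece` are hypothetical. The paper's actual features, codebook construction and classifiers may differ.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook, frame):
    """Index of the codebook entry closest to a frame feature vector."""
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], frame))


def build_codebook(frames, k, iters=10, seed=0):
    """Plain k-means over frame vectors; the centroids act as the symbols.

    Assumption: k-means stands in for whatever codebook method is used.
    """
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            clusters[nearest(codebook, f)].append(f)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                codebook[i] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def symbol_histogram(codebook, frames):
    """Normalized histogram of codebook symbols for one piece."""
    counts = Counter(nearest(codebook, f) for f in frames)
    return [counts[i] / len(frames) for i in range(len(codebook))]


def synthetic_piece(rng, center, n_frames=50):
    """Hypothetical stand-in for the frame-level features of one piece."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n_frames)]


rng = random.Random(1)
genre_a = [synthetic_piece(rng, [0.0] * 4) for _ in range(3)]
genre_b = [synthetic_piece(rng, [1.0] * 4) for _ in range(3)]

frames_a = [f for piece in genre_a for f in piece]
frames_all = frames_a + [f for piece in genre_b for f in piece]

# One codebook from a single genre, one from all genres.
codebook_single = build_codebook(frames_a, k=8)
codebook_all = build_codebook(frames_all, k=8)

# The same piece, described with each set of symbols.
hist_single = symbol_histogram(codebook_single, genre_b[0])
hist_all = symbol_histogram(codebook_all, genre_b[0])
```

In a full experiment, histograms like these would be computed for every piece and passed to a classifier, once per codebook, so that the two resulting accuracies can be compared.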
Preprint/postprint document: files/publications/20.pdf