End-to-end learning for music audio tagging at scale

Pons, Jordi; Oriol Nieto; Matthew Prockup; Erik M. Schmidt; Andreas F. Ehmann; Xavier Serra

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	End-to-end learning for music audio tagging at scale
Publication Type	Conference Paper
Year of Publication	2018
Conference Name	19th International Society for Music Information Retrieval Conference (ISMIR 2018)
Authors	Pons, J., Nieto, O., Prockup, M., Schmidt, E. M., Ehmann, A. F., & Serra, X.
Conference Location	Paris
Abstract	The lack of data tends to limit the outcomes of deep learning research, particularly when dealing with end-to-end learning stacks processing raw data such as waveforms. In this study, 1.2M tracks annotated with musical labels are available to train our end-to-end models. This large amount of data allows us to unrestrictedly explore two different design paradigms for music auto-tagging: assumption-free models - using waveforms as input with very small convolutional filters; and models that rely on domain knowledge - log-mel spectrograms with a convolutional neural network designed to learn timbral and temporal features. Our work focuses on studying how these two types of deep architectures perform when datasets of variable size are available for training: the MagnaTagATune (25k songs), the Million Song Dataset (240k songs), and a private dataset of 1.2M songs. Our experiments suggest that music domain assumptions are relevant when not enough training data are available, thus showing how waveform-based models outperform spectrogram-based ones in large-scale data scenarios.
preprint/postprint document	https://arxiv.org/abs/1711.02520

This work received the 'Best student paper award'.

Additional material:

- Code: https://github.com/jordipons/music-audio-tagging-at-scale-models
- Demo: http://www.jordipons.me/apps/music-audio-tagging-at-scale-demo/