Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group see the Publications list.

Designing Efficient Architectures for Modeling Temporal Features with Convolutional Neural Networks

Title Designing Efficient Architectures for Modeling Temporal Features with Convolutional Neural Networks
Publication Type Conference Paper
Year of Publication 2017
Conference Name 42nd International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2017)
Authors Pons, J., & Serra, X.
Conference Start Date 05/03/2017
Publisher IEEE
Conference Location New Orleans, USA.
Abstract

Many researchers use convolutional neural networks with small rectangular filters for music (spectrogram) classification. First, we discuss why there is no reason to use this filter setup by default; second, we point out that more efficient architectures could be implemented if the characteristics of the music features are considered during the design process. Specifically, we propose a novel design strategy that might promote more expressive and intuitive deep learning architectures by efficiently exploiting the representational capacity of the first layer, using different filter shapes adapted to fit musical concepts within the first layer. The proposed architectures are assessed by measuring their accuracy in predicting the classes of the Ballroom dataset. We also make the code available (together with the audio data) so that this research is fully reproducible.
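
As a rough illustration of the design idea summarized in the abstract (a first layer that combines filters of several shapes), the NumPy sketch below applies temporal filters of different lengths to a spectrogram and concatenates the pooled responses. The function names, the filter lengths, the random weights, and the input size are all assumptions made for illustration; they are not taken from the paper or its released code.

```python
import numpy as np

def temporal_conv(spec, kernel):
    """Valid cross-correlation of every frequency bin with a 1-D temporal kernel.

    spec:   array of shape (n_frames, n_bins)
    kernel: array of shape (length,)
    returns an array of shape (n_frames - length + 1, n_bins)
    """
    length = kernel.shape[0]
    # windows has shape (n_frames - length + 1, n_bins, length)
    windows = np.lib.stride_tricks.sliding_window_view(spec, length, axis=0)
    return windows @ kernel

def multi_shape_first_layer(spec, kernels):
    """Apply each temporal kernel, then ReLU and max-pool over time."""
    pooled = []
    for kernel in kernels:
        activation = np.maximum(temporal_conv(spec, kernel), 0.0)  # ReLU
        pooled.append(activation.max(axis=0))  # one value per frequency bin
    return np.concatenate(pooled)

# Hypothetical input: 200 frames x 40 frequency bins of random "spectrogram".
rng = np.random.default_rng(0)
spec = rng.random((200, 40))

# Hypothetical filter lengths (in frames); untrained random weights.
kernels = [rng.standard_normal(n) for n in (30, 60, 120)]

features = multi_shape_first_layer(spec, kernels)
print(features.shape)  # (120,)
```

Each kernel here spans only the time axis, so filters of different lengths respond to patterns at different time scales. In a trained network the weights would be learned rather than drawn at random.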

preprint/postprint document https://zenodo.org/record/437964
Final publication https://doi.org/10.1109/ICASSP.2017.7952601
Additional material: