MIRages: an account of music audio extractors, semantic description and context-awareness, in the three ages of MIR

Title: MIRages: an account of music audio extractors, semantic description and context-awareness, in the three ages of MIR
Publication Type: PhD Thesis
Year of Publication: 2018
University: Universitat Pompeu Fabra
Authors: Herrera-Boyer, P.
Advisor: Serra, X., & Gómez, E.
Academic Department: Information and Communication Technologies
Number of Pages: 241+XIII
Date Published: 12/2018
City: Barcelona, Spain
Keywords: audio analysis, Audio features, Automatic classification of music, music analysis, Music creation systems, music information retrieval, music recommendation, Semantic features, Timbre

This thesis reports on research carried out and published during the last twenty years on different problems of Music Information Retrieval (MIR). We organize the text as a personal account and critical reflection along three hypothesized ages that have shaped the evolution of MIR. In the age of feature extractors, we present work on features to describe sounds and music, especially timbre and tonal aspects. In the age of semantic descriptors, we report work on describing music with high-level concepts, such as mood, instruments, similarities, cover versions or genres, usually inferred with machine learning from annotated collections. In the age of context-aware systems, we report on user models for recommendation and for avatar generation, in addition to factors that influence music listening decisions. Finally, we discuss the possibility of a more recent age of creative systems, where MIR features, classifiers, models and evaluation methodologies help to enhance or expand music creation.

The thesis is a compendium of different papers published in peer-reviewed journals and conferences:

  • Herrera, P. & Bonada, J. (1998). Vibrato extraction and parameterization in the spectral modelling synthesis framework. Proceedings of the Digital Audio Effects Workshop (DAFX98), Barcelona, Spain.
  • Herrera, P., Yeterian, A., Gouyon, F. (2002). Automatic classification of drum sounds: a comparison of feature selection methods and classification techniques. In C. Anagnostopoulou et al. (Eds), "Music and Artificial Intelligence". Lecture Notes in Computer Science V. 2445. Berlin: Springer-Verlag.
  • Herrera, P., Peeters, G., Dubnov, S. (2003). Automatic Classification of Musical Instrument Sounds. Journal of New Music Research. 32(1), pp. 3-21.
  • Gómez, E. & Herrera, P. (2008). Comparative Analysis of Music Recordings from Western and Non-Western traditions by Automatic Tonal Feature Extraction. Empirical Musicology Review, 3(3), pp. 140-156.
  • Bogdanov, D., Wack, N., Gómez, E., Gulati S., Herrera, P., Mayor, O., Roma, G., Salamon, J., Zapata, J. & Serra, X. (2014). ESSENTIA: an open source library for audio analysis. ACM SIGMM Records. 6(1).
  • Bogdanov, D., Serrà J., Wack N., Herrera P., & Serra X. (2011). Unifying Low-level and High-level Music Similarity Measures. IEEE Transactions on Multimedia. 4, 687-701.
  • Cano, P., Koppenberger, M., Le Groux, S., Ricard, J., Wack, N., Herrera, P. (2005). Nearest-neighbor sound annotation with a Wordnet taxonomy. Journal of Intelligent Information Systems, 24 (2), pp. 99-111.
  • Serrà, J., Gómez, E., Herrera, P., Serra, X. (2008). Chroma binary similarity and local alignment applied to cover song identification. IEEE Transactions on Audio, Speech, and Language Processing, 16(6), pp. 1138-1151.
  • Laurier, C., Meyers, O., Serrà, J., Blech, M., Herrera, P., Serra, X. (2010). Indexing Music by Mood: Design and Integration of an Automatic Content-based Annotator. Multimedia Tools and Applications. 48(1), 161-184.
  • Koelsch, S., Skouras S., Fritz T., Herrera, P., Bonhage, C., Küssner, M. B. & Jacobs, A.M. (2013). The roles of superficial amygdala and auditory cortex in music-evoked fear and joy. NeuroImage. 81(1), 49-60.
  • Bogdanov, D., Haro, M., Fuhrmann, F., Xambó, A., Gómez, E. & Herrera, P. (2013). Semantic content-based music recommendation and visualization based on user preference examples. Information Processing and Management, 49(1), 13-33.
  • Herrera, P., Resa Z., & Sordo M. (2010). Rocking around the clock eight days a week: an exploration of temporal patterns of music listening. 1st Workshop on Music Recommendation and Discovery (WOMRAD), ACM RecSys, 2010, Barcelona, Spain.
  • Nuanáin, C. Ó., Herrera P., & Jordà S. (2017). Rhythmic Concatenative Synthesis for Electronic Music: Techniques, Implementation, and Evaluation. Computer Music Journal. 41(2), 21-37.
preprint/postprint document: https://zenodo.org/record/1882316