News and Events

Seminar by Clarence Barlow on his compositional approaches
2 Jul 2015

The composer Clarence Barlow will give a seminar entitled "Visualising Sound – Sonifying the Visual" on Thursday July 2nd at 15:30h in room 55.410 of the Communication Campus of the UPF.

The visualisation of musical sound can be pursued in a number of ways, two of which are rooted in technical considerations: the music may be intended for human performance, leading to a prescriptive performance score. If electroacoustic components are present in the music, they are often included as a graphic depiction (e.g. of the sound wave, as a sonogram, or as one reflecting the compositional methods), thus more properly fulfilling a second main function – a descriptive one, used mainly in documentation, lectures and/or study scores (e.g. Ligeti’s Artikulation or Stockhausen’s Study II), but sometimes also prescriptively as part of a sort of (re-)construction kit.
Two other approaches are rooted in the aesthetic (and possibly the synaesthetic): in sound visualisation it could simply be the sound-derived images that satisfy; in the converse, image sonification, it could instead be the pleasure of extracting convincing music from optical sources, with a comparison of source and result adding to the enjoyment. In multimedia such as film it could be the counterpoint of sound and image that pleases, especially if the two are clearly bonded to one another, as when sound visualisation or image sonification is involved.
In the above, the vectors prescription-description and visualisation-sonification can work both ways: a prescriptive score is also potentially descriptive, and one could (re-)imagine a visualised sound aurally or a sonified image visually.
In this presentation I would like to concentrate on these latter (syn-)aesthetic aspects as exemplified in my own work of several decades, having long been fascinated by the links between sound and image. These links mainly involve the concepts of Position and Motion as well as of Colour, all of which are not only important aspects of music but fundamentally spatial, ultimately visual concepts: in musical contexts one speaks of “high” and “low”, of “fast” and “slow” (all of which comprise spatial terms – for instance the tempo indication andante literally means “walking”) as well as of “bright” and “dark” sounds and of “sound-colour”. Though I started over thirty years ago, it is especially in recent times that I have been repeatedly drawn to enacting these parallels. The first five examples are of sound visualisation, the last five (plus a footnote) of image sonification.
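The sound–image mappings discussed in the abstract can be made concrete with a toy example. The sketch below is purely illustrative and is not Barlow's technique: it treats a grayscale image as a spectrogram, mapping rows to oscillator frequencies (Position becomes pitch), columns to time (Motion), and brightness to amplitude; all parameter values are arbitrary choices.

```python
import math

def sonify(image, sample_rate=8000, frame_s=0.05, f_lo=200.0, f_hi=2000.0):
    """Treat a grayscale image as a spectrogram and render it as audio.

    `image` is a list of rows with brightness values in [0, 1]; row 0
    maps to the highest frequency, as in a spectrogram display, and
    each column becomes one frame of sound.
    """
    n_rows = len(image)
    freqs = [f_hi - (f_hi - f_lo) * r / max(n_rows - 1, 1)
             for r in range(n_rows)]
    samples_per_frame = int(frame_s * sample_rate)
    out = []
    for col in range(len(image[0])):
        for _ in range(samples_per_frame):
            t = len(out) / sample_rate  # running time keeps phase continuous
            s = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(n_rows))
            out.append(s / n_rows)  # normalise by the number of partials
    return out

# A 2x2 "image": one bright pixel at top-left, second column silent.
audio = sonify([[1.0, 0.0],
                [0.0, 0.0]])
```

Reading the result back visually (plotting its spectrogram) would recover the original image, which is the two-way prescription/description relation the abstract describes.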

26 Jun 2015 - 13:53
SMC Master Thesis Defenses
25 Jun 2015 - 30 Jun 2015

The oral presentations of the SMC Master Theses will take place this Thursday 25th, Friday 26th and next Tuesday 30th of June, from 9:30h to 12:30h in room 55.309. The defenses are public and all MTG researchers are encouraged to attend.

Thesis titles (tentative), with supervisor(s) in parentheses, grouped by defense date:

June 25th
- Computational modeling of hearing loss in musicians (Enric Guaus)
- Computational assessment of noise environment in pop/rock band's layout (Enric Guaus)
- Reducing Bias in Annotation Estimates for Probabilistic Evaluation of Audio Music Similarity (Julian Urbano / Emilia Gutierrez)
- Electronic music genre classification (Perfecto Herrera)
- Composer identification (Joan Serrà)

June 26th
- Measurement and computational model of the maximum stable gain in acoustic feedback scenarios (Nadine Kroher / Enric Giné)
- Similarity Search for Sound Effects in Freesound: A unified model for Tags and Content (Xavier Serra / Frederic Font)
- Generating Singing Voice Expression Based On Machine Learning (Martí Umbert / Merlijn Blaauw)
- Source Localization for Enhancement of Orchestral Music from Multi-sensor Recordings (Julio Carabias / Jordi Janer)
- Source Separation-based music processing for assistive listening (Jordi Janer / Waldo Nogueira)
- Lemur Synthesis with context-dependent HMMs (Jordi Bonada / Marco Gamba)

June 30th
- Automatic drum sound classification (Perfecto Herrera)
- Content Based Electronic Music Session Reconstruction (Perfecto Herrera)
- A Platform to creatively explore Frequently Used Samples in Rap Music History (Perfecto Herrera)
- Hexaphonic Guitar Transcription and Visualization (Rafael Ramirez)
- Transcription of Percussion Patterns in Hindustani Classical Music (Xavier Serra / Ajay Srinivasamurthy)

22 Jun 2015 - 14:37
Seminar by engineers from SoundCloud on their approach to data products
22 Jun 2015

Several engineers from SoundCloud will give a seminar on "Building Data Products at SoundCloud" on Monday 22nd at 18:30h in room 55.410.

Presenters: Josh Devins, Rany Keddo, Dr. Özgür Demir, Dr. Alexey Rodriguez Yakushev, Dr. Christoph Sawade

Abstract: Serving over 150M listeners every month, SoundCloud is the world’s leading audio platform. We have a vibrant community of creators uploading new and unique content every second of the day. To capitalise on the volume and variety of content on the platform, we must rely on a number of tools, techniques and approaches to building products.

In this talk we will present the general framework of how we approach building data products at SoundCloud. We will review two case studies of recent work as examples of applying our approach to the topics of discovering new content with personalised recommendations and applying structured metadata using genre classification.
16 Jun 2015 - 10:34
Keynote by Emilia Gómez at MCM2015

Emilia Gómez has been invited to give a keynote speech at the Fifth Biennial International Conference on Mathematics and Computation in Music (MCM2015), which will be held on 22-25 June 2015 at Queen Mary University of London. The talk is on "Computational models of symphonic music: challenges and opportunities" and relates to the objectives and results achieved in the European project PHENICX, which she leads.

The MCM2015 conference brings together researchers from around the world who combine mathematics or computation with music theory, music analysis, composition and performance. MCM provides a dedicated platform for the communication and exchange of ideas amongst researchers in mathematics, informatics, music theory, composition, musicology, and related disciplines.

10 Jun 2015 - 21:09
MTG involvement at Sonar+D

As in previous years, the MTG is involved in a number of activities at the Sónar Festival, which takes place from June 18th to 20th, specifically in its professional area, Sonar+D.

Here is an overview of what will be happening during the coming week:

When? Where? What?

June 16th – Museu de la Música

“Barcelona, Capital Electrònica” round table and “Voices” concert, on the occasion of Phonos’ 40th anniversary.

The round table participants will be John Chowning, researcher at the Center for Computer Research in Music and Acoustics (CCRMA) at Stanford University; Andrés Lewin-Richter, composer and member of the Phonos Foundation; and Enric Palau, co-founder and co-director of Sónar, with Xavier Serra as moderator.

In the concert “Voices”, based on ancient Greek mythological texts, John Chowning will process the voice of soprano Maureen Chowning with a computer.

June 17th – Hangar

Pre-event meeting of the Music Hack Day. The pre-event includes workshops on sensing, design and digital audio tools, talks by key figures in the field, and performances by artists exploring the intersection of music and wearable computing. It is sponsored by the European projects #MusicBricks and RAPID-MIX.

June 18th-19th – Sonar+D

This year the Barcelona Music Hack Day will offer a special track on wearable and multimodal technology applied to music creation and performance, bringing together experts in bio- and motion sensing, interaction design and wearable interface prototyping.

100 selected hackers will conceptualise and develop new tools for music creation and performance using the tools provided by more than 20 international companies.

June 19th – Sonar+D

A talk on "De-westernalizing music through technology", in which Kalbata (Ariel Tagar), music producer and record label owner, Peter Kirn, founder and editor of Create Digital Music, and Xavier Serra will discuss how music technology can provide a greater understanding and appreciation of musical traditions with non-Western roots.


9 Jun 2015 - 11:06
Seminar by John Chowning on his computer music works
15 Jun 2015
John Chowning, father of FM synthesis and Computer Music Pioneer, will give a talk on "Composing music from the inside-out" on Monday June 15th at 15:00h in the Auditorium of the Communication Campus of the UPF. He will also participate in a round table on June 16th at 18:30h at the Barcelona Music Museum where his piece "Voices" will also be performed.
Lecture abstract: A lecture/demonstration showing how the capacity of computer systems 50 years ago limited composers/researchers to only one of the sound-generating processes that are available today: synthesis. But we learned much about the perception of sound as we wrapped our aural skills around the technology and discovered how to create music from fundamental units. Using sound-synchronous animated slides, I will demonstrate how my earliest work in spatialization led to the discovery of FM synthesis in 1967. Their development gave rise to perceptual insights that led to the synthesis of the singing voice, on which both Phonē (1981) and Voices (2011) depend. Beginning with Stria (1977), the scale (pitch space) and the inharmonic timbres (spectral space) are rooted in the Golden Ratio. On these underpinnings I composed my music—based on what was known and what we learned about perception—from the inside-out. With the participation of Maureen Chowning, the presentation will conclude with a demonstration of the workings of the MaxMSP patch complex that accompanies the solo soprano in Voices—“hands free.”
Biography: Chowning is professor emeritus at Stanford University and founding director of the Center for Computer Research in Music and Acoustics (CCRMA). One of the great pioneers of computer music, he was among the first to realize the tremendous musical possibilities of the digital computer. His discovery of the FM synthesis algorithm in 1967 was licensed to Yamaha and popularized in the most successful synthesis engine in the history of electronic instruments, enabling precise control over new realms of sonic possibilities. Chowning’s innovations also extend into sound spatialization and musical temperament.
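The FM technique mentioned above can be summarised by the classic formula y(t) = sin(2πf_c·t + I·sin(2πf_m·t)), where the modulation index I controls the spectral richness of the sidebands around the carrier f_c. The sketch below is a generic textbook illustration, not code from the talk; the chosen frequencies and sample rate are arbitrary.

```python
import math

def fm_tone(duration_s, carrier_hz, modulator_hz, index, sample_rate=44100):
    """Frequency modulation: sin(2*pi*fc*t + I*sin(2*pi*fm*t))."""
    n_samples = int(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        phase = (2 * math.pi * carrier_hz * t
                 + index * math.sin(2 * math.pi * modulator_hz * t))
        samples.append(math.sin(phase))
    return samples

# A 100 ms tone; with index 0 this reduces to a plain sine at the carrier.
tone = fm_tone(0.1, carrier_hz=440.0, modulator_hz=110.0, index=2.0)
```

A single pair of oscillators like this yields a whole family of harmonic or inharmonic spectra just by varying three parameters, which is what made the technique so economical on the hardware of the time.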
8 Jun 2015 - 09:43
Frederic Font defends his PhD thesis on June 11th
11 Jun 2015
Frederic Font defends his PhD thesis entitled "Tag Recommendation using Folksonomy Information for Online Sound Sharing Platforms" on Thursday June 11th 2015 at 11:00h in room 55.410 of the Communication Campus of the UPF.
The jury of the defense is: Mark Sandler (Queen Mary, London), Sergi Jordà (UPF), Iván Cantador (UAM, Madrid)
Thesis abstract: Online sharing platforms host a vast amount of multimedia content generated by their own users. Such content is typically not uniformly annotated and cannot be straightforwardly indexed. Therefore, making it accessible to other users poses a real challenge which is not specific to online sharing platforms: in general, content annotation is a common problem in all kinds of information systems. In this thesis, we focus on this problem and propose methods for helping users to annotate the resources they create in a more comprehensive and uniform way. Specifically, we work with tagging systems and propose methods for recommending tags to content creators during the annotation process. To this end, we exploit information gathered from previous resource annotations in the same sharing platform, the so-called folksonomy. Tag recommendation is evaluated using several methodologies, with and without the intervention of users, and in the context of large-scale tagging systems. We focus on the case of tag recommendation for sound sharing platforms and, besides studying the performance of several methods in this scenario, we analyse the impact of one of our proposed methods on the tagging system of a real-world, large-scale sound sharing site. As an outcome of this thesis, one of the proposed tag recommendation methods is now used daily by hundreds of users on this sound sharing site. Furthermore, we explore a new perspective for tag recommendation which, besides taking advantage of information from the folksonomy, employs a sound-specific ontology to guide users during the annotation process. Overall, this thesis contributes to the advancement of the state of the art in tagging systems and folksonomy-based tag recommendation and explores interesting directions for future research.
Even though our research is motivated by the particular challenges of sound sharing platforms and mainly carried out in that context, we believe our methodologies can be easily generalised and thus be of use to other information sharing platforms.
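To illustrate the general idea of folksonomy-based tag recommendation: a simple baseline suggests the tags that most often co-occur, across previous annotations, with the tags a user has already entered. This sketch is only that baseline, not the specific methods proposed in the thesis, and the tag data is invented.

```python
from collections import Counter
from itertools import combinations

# Invented folksonomy: the tag sets of previously annotated resources.
annotations = [
    {"field-recording", "birds", "nature"},
    {"birds", "forest", "nature"},
    {"synth", "bass", "loop"},
    {"field-recording", "rain", "nature"},
]

# Symmetric tag co-occurrence counts across all resources.
cooc = Counter()
for tags in annotations:
    for a, b in combinations(sorted(tags), 2):
        cooc[(a, b)] += 1
        cooc[(b, a)] += 1

def recommend(input_tags, k=3):
    """Rank candidate tags by total co-occurrence with the input tags."""
    scores = Counter()
    for t in input_tags:
        for (a, b), n in cooc.items():
            if a == t and b not in input_tags:
                scores[b] += n
    return [tag for tag, _ in scores.most_common(k)]
```

Typing "birds" would surface "nature" first here, since they co-occur in two resources; real systems refine this with weighting, user history and, as the thesis explores, ontological knowledge.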
4 Jun 2015 - 15:38
Xavier Serra gives a seminar on his research career
4 Jun 2015

Xavier Serra gives a seminar entitled "Research highlights in my journey within the field of Sound and Music Computing" on June 4th at 11:30h in the Auditorium of the Poblenou Campus of the UPF as part of the DTIC Integrative Research Seminars.


In this presentation I will go over some of the research I have been involved with in my thirty-year career within the field of Sound and Music Computing, emphasizing the goals I aimed for and identifying some of the results obtained.
My personal research, and the one I have directly supervised, has been mainly focused on the analysis, description and synthesis of sound and music signals. My initial efforts were dedicated to analyze and transform complex and musically relevant sounds; sounds that were not well captured by the audio processing techniques used at that time. My approach was to use spectral analysis and synthesis techniques to develop a deterministic plus stochastic model with which to obtain sonically and musically meaningful parameterizations and descriptions. That work had practical applications for synthesizing and transforming a wide variety of sounds, including the voice.
As a natural evolution of that initial research I became interested in going from single sounds to collections of sounds, thus being able to describe and model the relationships between sound entities. To tackle that I had to incorporate methodologies coming from disciplines such as machine learning and semantic technologies in order to complement the signal processing approaches I was using. A major bottleneck for carrying out that research was the availability of large and adequate audio collections. To solve that we developed a platform for collecting and sharing sounds and then using the collected sounds we carried out research on the automatic description of sound signals. We now have large publicly available sound collections and we have developed information retrieval tools of relevance for several sound and music applications.
In the last few years, and in the context of the music information retrieval field, it has become evident that to make sense of sound and music data we need to incorporate domain knowledge. In order to improve the distance measures used in exploration and retrieval tasks we need to incorporate the knowledge that exists around sound and music. To that aim, most of my current projects focus on the study of specific sound and music repertoires and on targeting well-defined tasks, trying to formalize and represent user knowledge with which to develop applications of relevance to those users. We are putting together coherent corpora; we are working on audio analysis techniques specific to these corpora and chosen tasks; we are gathering and analyzing user and contextual information to train our data models; and we are developing task-oriented tools to interact with particular sound and music collections. Our results show the benefit of this type of information processing research, in which bottom-up and top-down approaches are combined.
My research has always been motivated by music, by the interest of developing musical tools that can be socially and culturally relevant. In this talk I want to emphasize this aspect while talking about my thirty-year research journey.

Video of the talk:

26 May 2015 - 03:09
Gerard Roma defends his PhD thesis on June 5th
5 Jun 2015
Gerard Roma defends his PhD thesis entitled "Algorithms and representations for supporting online music creation with large-scale audio databases" on Friday June 5th 2015 at 11:00h in room 55.309 of the Communication Campus of the UPF.
The jury of the defense is: Sergi Jordà (UPF), Diemo Schwarz (IRCAM), Enric Guaus (ESMUC)
Thesis abstract:
The rapid adoption of Internet and web technologies has created an opportunity for making music collaboratively by sharing information online. However, current applications for online music making do not take advantage of the potential of shared information. The goal of this dissertation is to provide and evaluate algorithms and representations for interacting with large audio databases that facilitate music creation by online communities. This work has been developed in the context of Freesound, a large-scale, community-driven database of audio recordings shared under Creative Commons (CC) licenses. The diversity of sounds available through this kind of platform is unprecedented. At the same time, the unstructured nature of community-driven processes poses new challenges for indexing and retrieving information to support musical creativity. In this dissertation we propose and evaluate algorithms and representations for dealing with the main elements required by online music making applications based on large-scale audio databases: sound files, including time-varying and aggregate representations, taxonomies for retrieving sounds, music representations and community models. As a generic low-level representation for audio signals, we analyze the framework of cepstral coefficients, evaluating their performance on example classification tasks. We found that switching to more recent auditory filters, such as gammatone filters, improves at large scales on traditional representations based on the mel scale. We then consider common types of sounds for obtaining aggregated representations. We show that several time series analysis features computed from the cepstral coefficients complement traditional statistics for improved performance. For interacting with large databases of sounds, we propose a novel unsupervised algorithm that automatically generates taxonomical organizations based on the low-level signal representations.
Based on user studies, we show that our approach can be used in place of traditional supervised classification approaches for providing a lexicon of acoustic categories suitable for creative applications. Next, a computational representation is described for music based on audio samples. We demonstrate through a user experiment that it facilitates collaborative creation and supports computational analysis using the lexicons generated by sound taxonomies. Finally, we deal with representation and analysis of user communities. We propose a method for measuring collective creativity in audio sharing. By analyzing the activity of the Freesound community over a period of more than 5 years, we show that the proposed creativity measures can be significantly related to social structure characterized by network analysis.
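For readers unfamiliar with the representation the abstract compares: cepstral coefficients are conventionally obtained as the DCT-II of log filterbank energies, and the mel-based (MFCC) and gammatone-based variants differ only in the filterbank. Below is a minimal sketch of that final DCT step, as a generic textbook formulation rather than the thesis's exact pipeline.

```python
import math

def cepstral_coefficients(log_energies, n_coeffs):
    """DCT-II of log filterbank energies (mel- or gammatone-based)."""
    n = len(log_energies)
    return [
        sum(log_energies[m] * math.cos(math.pi * k * (m + 0.5) / n)
            for m in range(n))
        for k in range(n_coeffs)
    ]

# A flat log spectrum concentrates all energy in coefficient 0:
# higher coefficients capture increasingly fine spectral-envelope detail.
flat = cepstral_coefficients([1.0] * 20, 5)
```

Because the DCT step is fixed, swapping the mel filterbank for a gammatone one changes only the `log_energies` fed in, which is what makes the two families directly comparable in classification experiments.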


26 May 2015 - 03:00
Let's film a concert together! Phenicx Collaborative Video Festival
26 May 2015

*** Catalan version below ***

The Escola Superior de Música de Catalunya (ESMUC), in collaboration with the company Video Dock (Amsterdam) and the MTG, organizes a collaborative Video Festival in the context of the PHENICX project tomorrow, Tuesday May 26th at 19:00. It will take place at the outdoor stage of L’Auditori (Padilla 155, 08013 Barcelona).

During the event, the audience will record video of the most interesting moments of a jazz concert by the ESMUC big band with their mobile phones. All those videos will be uploaded to a common repository and synchronized in order to build a collaborative mashup video of the concert, based on the excerpts the audience found most relevant.

This data will be used in our research on user-generated content, video synchronization and quality description.
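A common baseline for synchronising such clips is to cross-correlate their audio tracks and take the lag with the highest correlation as the offset between recordings. The sketch below is a generic illustration of that idea, not necessarily the method used in PHENICX, and the "audio" is a synthetic signal.

```python
import random

def best_lag(ref, clip, max_lag):
    """Lag (in samples) that maximises cross-correlation of clip vs. ref."""
    def corr(lag):
        return sum(ref[i + lag] * clip[i]
                   for i in range(len(clip))
                   if 0 <= i + lag < len(ref))
    return max(range(-max_lag, max_lag + 1), key=corr)

# Synthetic "audio": a clip cut 300 samples into the reference track.
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(2000)]
clip = ref[300:1300]
offset = best_lag(ref, clip, max_lag=500)
```

In practice this is done on spectral fingerprints rather than raw samples, and with FFT-based correlation for speed, but the principle of picking the highest-correlation lag is the same.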

MTG researchers involved are Enric Guaus (PHENICX leader at ESMUC) and Iwona Sobieraj, with the support of the PHENICX Barcelona team.

Please join us and come to film!


L'Escola Superior de Música de Catalunya (ESMUC), en col·laboració amb l'empresa VideoDock d'Amsterdam i l'MTG, organitza un Festival de Vídeo Col·laboratiu en el marc del projecte PHENICX demà dimarts 26 de maig a les 19:00 a l'escenari exterior de L'Auditori (Padilla 155, 08013 Barcelona).

La dinàmica d'aquest esdeveniment consisteix en fomentar que l'audiència enregistri amb els seus telèfons mòbils els moments més interessants d'un concert de jazz de la mà de la big band de l'ESMUC. Tots els vídeos de l'audiència s'enviaran a un repositori comú i seran sincronitzats amb la finalitat de construir un vídeo col·laboratiu del concert en base a totes les contribucions de l'audiència.

Aquestes dades seran utilitzades en la nostra investigació sobre contingut generat per l'usuari, sincronització de vídeo i descripció automàtica del contingut.

Els investigadors que participen en aquesta activitat són Enric Guaus (líder PHENICX a l'ESMUC) i Iwona Sobieraj (estudiant PhD MTG), amb el suport de tot l'equip de PHENICX de Barcelona.

Animeu-vos a participar en el concert!

25 May 2015 - 09:38