News and Events

SMC students winners in the Bulgaria Music Hackathon

Sound and Music Computing Master's students Manaswi Mishra, Kushagra Sharma and Siddharth Bhardwaj won the Music Vision category in the Bulgaria Music Hackathon with the project "Samsara - a virtual interactive soundscape", earning a sponsored visit to the monthly Music Hackathon in New York City, USA.

The project is a virtual interactive soundscape where multiple performers can create and manipulate music through their smartphones/touch interfaces. The virtual world is a physics-based system where different types of virtual atoms interact with each other and the performer. Each performer has a role in the collaborative soundscape: a creator (sound generator atoms), a preserver (filter and fx atoms) or a destroyer (reducing the number of creator atoms in the system). Performers can create, delete and manipulate their type of atoms in the dynamic system of moving atoms and collisions to create a collaborative soundscape.

More information about the project can be found here: Presentation at Hackathon.

And a report about the event and interview on Live Bulgarian national TV: Eurocom TV interview.



13 Feb 2017 - 16:34
Tenure Track position in Machine Learning at UPF

The Unit of Engineering in Information and Communication Technologies (ETIC) at the Pompeu Fabra University in Barcelona is opening an academic position in Machine Learning.

ETIC has a highly interdisciplinary research environment, with faculty interests covering a wide range of topics that can be broadly grouped into four main areas, with data-driven research transversal to all of them:

Multimedia Technologies, covering topics in image processing and cinematography, sound and music computing, human-computer interaction, graphics and educational technologies, and cognition as it relates to multimedia.

Computational Biology and Biomedical Systems including topics such as computational neuroscience, analysis of biomedical data, nonlinear signal analysis in biological systems, instrumentation and biomedical electronics, computational simulation and biomechanics, medical imaging and modeling of complex biomedical systems.

Computation and Intelligent Systems covering foundations of computer science, several aspects of artificial intelligence such as planning, natural language processing, computer vision, machine learning, ubiquitous computing, information retrieval and data mining, and human cognition and its relation to robotics. 

Networking and Communications with topics related to both wired and wireless networks, network science, information theory and coding, as well as policy aspects and strategies related to networking technologies.

Currently ETIC is strengthening its research interests in these four areas, with special interest in Machine Learning. We are looking for a junior scientist with strong research interests in any of these areas but with a special focus on Machine Learning, and we seek applications for a Tenure-Track position. The appointed applicant will progressively lead a new research group, in close collaboration with the existing ones, and will be involved in the undergraduate and postgraduate teaching duties of the Department, in particular with the deployment of the new Degree in Mathematical Engineering of Data Science.

Candidates should hold a PhD degree and have an excellent scientific-technical background in Machine Learning, including (but not restricted to) topics such as Multimedia Technologies, Computational Biology and Biomedical Systems, Computation and Intelligent Systems or Networking and Communications. The candidate should also have interest in confronting and resolving scientific problems/challenges, and leading several research publications in major journals.

A motivation letter, curriculum vitae, list of publications, research plan and addresses of three putative referees should be sent as a single pdf to randp [dot] dtic [at] upf [dot] edu, before March 24th 2017.

9 Feb 2017 - 11:20
Escolab 2017: presentation of 28x28 project

We will participate in Escolab 2017 presenting the project "28x28" developed by the artist Xavier Bonfill and the MTG researcher Frederic Font with the support of Phonos.

28x28 makes use of Freesound and Essentia to provide a system that allows participants to create interactive real-time music compositions by playing a game of dominoes.

The activity will be held in the Sala Polivalent where, after an introduction to the technical and artistic work behind the project, the participants will be able to try the installation and create their own music work.

This project shows the type of interaction between research and artistic creation promoted by the MTG.

Information: Escolab UPF

7 Feb 2017 - 17:24
Gopala K. Koduri and Sertan Şentürk defend their PhD theses
22 Feb 2017

Wednesday, February 22nd 2017 at 11:00h in room 55.309 (Tanger Building, UPF Communication Campus)

Gopala K. Koduri: “Towards a multimodal knowledge base for Indian art music: A case study with melodic intonation”
Thesis director: Xavier Serra
Thesis Committee: Anja Volk (Utrecht University), Baris Bozkurt (Koç University) and George Fazekas (QMUL)
Abstract: This thesis is a result of our research efforts in building a multi-modal knowledge-base for the specific case of Carnatic music. Besides making use of metadata and symbolic notations, we process natural language text and audio data to extract culturally relevant and musically meaningful information and structure it with formal knowledge representations. This process broadly consists of two parts. In the first part, we analyze the audio recordings for intonation description of pitches used in the performances. We conduct a thorough survey and evaluation of the previously proposed pitch distribution based approaches on a common dataset, outlining their merits and limitations. We propose a new data model to describe pitches to overcome the shortcomings identified. This expands the perspective of the note model in vogue to cater to the conceptualization of melodic space in Carnatic music. We put forward three different approaches to retrieve compact description of pitches used in a given recording employing our data model. We qualitatively evaluate our approaches comparing the representations of pitches obtained from our approach with those from a manually labeled dataset, showing that our data model and approaches have resulted in representations that are very similar to the latter. Further, in a raaga classification task on the largest Carnatic music dataset so far, two of our approaches are shown to outperform the state-of-the-art by a statistically significant margin.
In the second part, we develop knowledge representations for various concepts in Carnatic music, with a particular emphasis on the melodic framework. We discuss the limitations of the current semantic web technologies in expressing the order in sequential data that curtails the application of logical inference. We present our use of rule languages to overcome this limitation to a certain extent. We then use open information extraction systems to retrieve concepts, entities and their relationships from natural language text concerning Carnatic music. We evaluate these systems using the concepts and relations from knowledge representations we have developed, and groundtruth curated using Wikipedia data. Thematic domains like Carnatic music have limited volume of data available online. Considering that these systems are built for web-scale data where repetitions are taken advantage of, we compare their performances qualitatively and quantitatively, emphasizing characteristics desired for cases such as this. The retrieved concepts and entities are mapped to those in the metadata. In the final step, using the knowledge representations developed, we publish and integrate the information obtained from different modalities to a knowledge-base. On this resource, we demonstrate how linking information from different modalities allows us to deduce conclusions which otherwise would not have been possible.

Wednesday, February 22nd 2017 at 16:00h in room 55.309 (Tanger Building, UPF Communication Campus)

Sertan Şentürk: “Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music”
Thesis director: Xavier Serra
Thesis Committee: Gerhard Widmer (Johannes Kepler University), Baris Bozkurt (Koç University) and Tillman Weyde (City, University of London)
Abstract: This thesis addresses several shortcomings of the current state-of-the-art methodologies in music information retrieval (MIR). In particular, it proposes several computational approaches to automatically analyze and describe music scores and audio recordings of Ottoman-Turkish makam music (OTMM). The main contributions of the thesis are the music corpus that has been created to carry out the research and the audio-score alignment methodology developed for the analysis of the corpus. In addition, several novel computational analysis methodologies are presented in the context of common MIR tasks of relevance for OTMM. Some example tasks are predominant melody extraction, tonic identification, tempo estimation, makam recognition, tuning analysis, structural analysis and melodic progression analysis. These methodologies become a part of a complete system called Dunya-makam for the exploration of large corpora of OTMM.
The thesis starts by presenting the created CompMusic Ottoman-Turkish makam music corpus. The corpus includes 2200 music scores, more than 6500 audio recordings, and accompanying metadata. The data has been collected, annotated and curated with the help of music experts. Using criteria such as completeness, coverage and quality, we validate the corpus and show its research potential. In fact, our corpus is the largest and most representative resource of OTMM that can be used for computational research. Several test datasets have also been created from the corpus to develop and evaluate the specific methodologies proposed for different computational tasks addressed in the thesis.
The part focusing on the analysis of music scores is centered on phrase and section level structural analysis. Phrase boundaries are automatically identified using an existing state-of-the-art segmentation methodology. Section boundaries are extracted using heuristics specific to the formatting of the music scores. Subsequently, a novel method based on graph analysis is used to establish similarities across these structural elements in terms of melody and lyrics, and to label the relations semiotically. 
The audio analysis section of the thesis reviews the state-of-the-art for analysing the melodic aspects of performances of OTMM. It proposes adaptations of existing predominant melody extraction methods tailored to OTMM. It also presents improvements over pitch-distribution-based tonic identification and makam recognition methodologies. 
The audio-score alignment methodology is the core of the thesis. It addresses the culture-specific challenges posed by the musical characteristics, music theory related representations and oral praxis of OTMM. Based on several techniques such as subsequence dynamic time warping, Hough transform and variable-length Markov models, the audio-score alignment methodology is designed to handle the structural differences between music scores and audio recordings. The method is robust to the presence of non-notated melodic expressions, tempo deviations within the music performances, and differences in tonic and tuning. The methodology utilizes the outputs of the score and audio analysis, and links the audio and the symbolic data. In addition, the alignment methodology is used to obtain score-informed description of audio recordings. The score-informed audio analysis not only simplifies the audio feature extraction steps that would require sophisticated audio processing approaches, but also substantially improves the performance compared with results obtained from the state-of-the-art methods solely relying on audio data.
The analysis methodologies presented in the thesis are applied to the CompMusic Ottoman-Turkish makam music corpus and integrated into a web application aimed at culture-aware music discovery. Some of the methodologies have already been applied to other music traditions such as Hindustani, Carnatic and Greek music. Following open research best practices, all the created data, software tools and analysis results are openly available. The methodologies, the tools and the corpus itself provide vast opportunities for future research in many fields such as music information retrieval, computational musicology and music education.
6 Feb 2017 - 12:43
CompMusic Seminar
23 Feb 2017

On Thursday, February 23rd 2017, from 9:30h to 14:00h in room 55.309 of the Communication Campus of the Universitat Pompeu Fabra in Barcelona, we will have a CompMusic seminar. This seminar accompanies the PhD thesis defenses of Gopala Krishna Koduri and Sertan Şentürk, which take place the previous day.

9:30 Gerhard Widmer (Johannes Kepler University, Linz, Austria)
"Con Espressione! - An Update from the Computational Performance Modelling Front"
Computational models of expressive music performance have been a target of considerable research efforts in the past 20 years. Motivated by the desire to gain a deeper understanding of the workings of this complex art, various research groups have proposed different classes of computational models (rule-based, case-based, machine-learning-based) for different parametric dimensions of expressive performance, and it has been demonstrated in various studies that such models can provide interesting new insights into this musical art. In this presentation, I will review recent work that has carried this research further. I will mostly focus on a general modelling framework known as the "Basis Mixer", and show various extensions of this model that have gradually increased the modelling power of the framework. However, it will also become apparent that there are still serious limitations and obstacles on the path to comprehensive models of musical expressivity, and I will briefly report on a new ERC project entitled "Con Espressione", which expressly addresses these challenges. Along the way, we will also hear about a recent musical "Turing Test" that is said to demonstrate that computational performance models have now reached a level where their interpretations of classical piano music are qualitatively indistinguishable from true human performances -- a story that I will quickly try to put into perspective ...
10:30 Tillman Weyde (City, University of London, UK)
"Digital Musicology with Large Datasets"
The increasing availability of music data as well as networks and computing resources has the potential to profoundly change the methodology of musicological research towards a more data-driven empirical approach. However, many questions are still unanswered regarding the technology, data collection and provision, metadata, analysis methods and legal aspects. This talk will report on an effort to address these questions in the Digital Music Lab project, and present achieved outcomes, lessons learnt and challenges that emerged in this process. 
11:30 Coffee break
12:00 Anja Volk (Utrecht University, Netherlands)
"The explication of musical knowledge through automatic pattern finding"
In this talk I will discuss the role of computational modeling for gaining insights into the specifics of a musical style for which there exists no long-standing music theory such as in Western classical music, Carnatic music or Ottoman-Turkish makam music. Specifically, I address the role of automatic pattern search in enabling us to scrutinize what it is that we really know about a specific music style, if we consider ourselves to be musical experts. I elaborate my hypothesis that musical knowledge is often implicit, while computation enables us to make part of this knowledge explicit and evaluate it on a data set. This talk will address the explication of musical knowledge for the question as to when we perceive two folk melodies to be variants of each other for the case of Dutch and Irish folk songs, and when we consider a piece to be a ragtime. With examples from research within my VIDI-project MUSIVA on patterns in these musical styles, I discuss how musical experts and non-experts working together on developing computational methods can gain important insights into the specifics of a musical style, and the implicit knowledge of musical experts. 
13:00 György Fazekas (Queen Mary, University of London, UK)
"Convergence of Technologies to Connect Audio with Meaning: from Semantic Web Ontologies to Semantic Audio Production"
Science and technology play an increasingly vital role in how we experience, how we compose, perform, share and enjoy musical audio. The invention of recording in the late 19th century is a profound example that, for the first time in human history, disconnected music performance from listening and gave rise to a new industry as well as new fields of scientific investigation. But musical experience is not just about listening. Human minds make sense of what we hear by categorising and by making associations, cognitive processes which give rise to meaning or influence our mood. Perhaps the next revolution akin to recording is therefore in audio semantics. Technologies that mimic our abilities and enable interaction with audio on human terms are already changing the way we experience it. The emerging field of Semantic Audio is in the confluence of several key fields, namely, signal processing, machine learning and Semantic Web ontologies that enable knowledge representation and logic-based inference. In my talk, I will put forward that synergies between these fields provide a fruitful, if not necessary, way to account for human interpretation of sound. I will outline music and audio related ontologies and ontology based systems that found applications on the Semantic Web, as well as intelligent audio production tools that enable linking musical concepts with signal processing parameters in audio systems. I will outline my recent work demonstrating how web technologies may be used to create interactive performance systems that enable mood-based audience-performer communication and how semantic audio technologies enable us to link social tags and audio features to better understand the relationship between music and emotions. I will hint at how some principles used in my research also contribute to enhancing scientific protocols, ease experimentation and facilitate reproducibility.
Finally, I will discuss challenges in fusing audio and semantic technologies and outline some future opportunities they may bring about.
1 Feb 2017 - 13:35
Tutorial - Natural Language Processing for Music Information Retrieval
30 Jan 2017

In this tutorial, we will focus on linguistic, semantic and statistical-based approaches to extract and formalize knowledge about music from naturally occurring text. We propose to provide the audience with a preliminary introduction to NLP, covering its main tasks along with the state-of-the-art and most recent developments. In addition, we will showcase the main challenges that the music domain poses to the different NLP tasks, and the already developed methodologies for leveraging them in MIR and musicological applications.

  • Date: January 30th 2017. 14:30h - 17:30h
  • Location: Poblenou Campus, UPF (Roc Boronat 138, Barcelona). Room 52.S27
  • Tutorial presenters: Sergio Oramas, Luis Espinosa (Music meets NLP MdM project)

Updated version of the tutorial presented at ISMIR2016.

Free registration here.

24 Jan 2017 - 10:22
Carolina Foundation: scholarship program (2017-2018) for Ibero-American students

The Carolina Foundation launches a new fellowship program (2017-2018) for Ibero-American students aiming to complete their education in Spain. This program will offer 521 scholarships.

If you are interested in applying, find more information on the ETIC website.

11 Jan 2017 - 16:10
Music Technology Group - report 2016

In 2016 the MTG has been involved in a significant number of projects and activities, and its members have been very active in promoting the research through outreach activities, publications and conferences. The following report presents some relevant indicators that reflect the overall activity of the group and resources during 2016. This report is in line with our open data and transparency policy.


MTG members

Faculty 4
Postdoc 17
PhD students 20
Master student internships 6
Developers 7
Administration 2
Others 3
Visitors 8

Total members 2016 (excluding visitors) = 59 people

MTG members 2012 to 2016


Revenue for competitive projects

Total revenue for public funded competitive projects 2016 = 1.543.747€

Revenue for competitive projects 2012 to 2016


Research and innovation projects

European projects (9): AudioCommons, CAMUT, CompMusic, Giant Steps, MusicBricks, MUSMAP, Phenicx, Rapid-Mix, TELMI
National projects (3): CASAS, Mingus, Timul
Private company projects (2): Korg, Yamaha

Total projects 2016 = 14

Projects by category 2012 to 2016



PhD theses:
10 thesis defenses during 2016 (cumulative 44 theses)

Publications:
85 publications during 2016 (cumulative 1.148)

Conferences:
Participation in 23 different conferences

Outreach activities:
Participation in more than 15 outreach activities open to students, professional audiences or the general public, including, amongst others, Festa de la Ciència, Setmana de la Ciència, Music Tech Fest, Pint of Science, Sonar festival, Mutek festival and the organization of several public events.

Awards:
Award from the Board of Trustees of the UPF in Knowledge Transfer category: S. Gulati, G. Koduri
Singing voice challenge, Interspeech: J. Bonada, M. Blauw
Best paper award, FMA: G. Dzhambazov, Y. Yang, R. Caro, X. Serra
Best paper award, NIME: C. O Nuanáin, S. Jordà, P. Herrera
Best paper award, CBMI: J. Pons, T. Lidy, X. Serra

22 Dec 2016 - 14:25
Post-doctoral opportunities at the MTG

There are a number of possibilities to do a post-doc at the MTG, in particular:

1. Ramon y Cajal 2016. Post-doctoral positions funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:


2. Juan de la Cierva 2016. Post-doctoral positions for young doctors funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:


3. Tenure-track position in Computer Science in the framework of the Maria de Maeztu Research Program of the Department of Information and Communication Technologies (DICT). For information and application:


4. Senior faculty position in Computer Science in the framework of the Maria de Maeztu Research Program of the Department of Information and Communication Technologies (DICT). For information and application:


7 Dec 2016 - 13:16
Application open for the Master in Sound and Music Computing 2017-2018
28 Nov 2016 - 1 Jun 2017

The application for the Master in Sound and Music Computing, program 2017-2018, is open online. There are 4 application periods (deadlines: January 16th, March 10th, April 28th, June 1st). For more information on the UPF master programs and on how to register for the SMC Master, check here. For other information on the SMC master check:

5 Dec 2016 - 11:01