News and Events

Music Technology Group - report 2016

In 2016 the MTG was involved in a significant number of projects and activities, and its members were very active in promoting its research through outreach activities, publications and conferences. The following report presents some relevant indicators that reflect the group's overall activity and resources during 2016. This report is in line with our open data and transparency policy.


MTG members

Faculty 4
Postdoc 17
PhD students 20
Master student internships 6
Developers 7
Administration 2
Others 3
Visitors 8

Total members 2016 (excluding visitors) = 59 people

MTG members 2012 to 2016


Revenue for competitive projects

Total revenue for public funded competitive projects 2016 = 1.543.747€

Revenue for competitive projects 2012 to 2016


Research and innovation projects

European projects (9): AudioCommons, CAMUT, CompMusic, Giant Steps, MusicBricks, MUSMAP, Phenicx, Rapid-Mix, TELMI

National projects (3): CASAS, Mingus, Timul

Private company projects (2): Korg, Yamaha







Total projects 2016 = 14

Projects by category 2012 to 2016



PhD theses:
10 thesis defenses during 2016
(cumulative 44 theses)
85 publications during 2016
(cumulative 1.148)
Participation in 23 different conferences:
Outreach activities:
Participation in more than 15 outreach activities open to students, professional audience or general society, including, amongst others, Festa de la Ciència, Setmana de la Ciència, Music Tech Fest, Pint of Science, Sonar festival, Mutek festival and organization of several public events.
Award from the Board of Trustees of the UPF in Knowledge Transfer category: S. Gulati, G. Koduri
Singing voice challenge, Interspeech: J. Bonada, M. Blauw
Best paper award, FMA: G. Dzhambazov, Y. Yang, R. Caro, X. Serra
Best paper award, NIME: C. Ó Nuanáin, S. Jordà, P. Herrera
Best paper award, CBMI: J. Pons, T. Lidy, X. Serra
22 Dec 2016 - 14:25
Post-doctoral opportunities at the MTG

There are a number of possibilities to do a post-doc at the MTG, in particular:

1. Ramón y Cajal 2016. Post-doctoral positions funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:


2. Juan de la Cierva 2016. Post-doctoral positions for young doctors funded by the Spanish government with which you can join a Spanish research group like the MTG. For information and application:


3. Tenure-track position in Computer Science in the framework of the Maria de Maeztu Research Program of the Department of Information and Communication Technologies (DICT). For information and application:


4. Senior faculty position in Computer Science in the framework of the Maria de Maeztu Research Program of the Department of Information and Communication Technologies (DICT). For information and application:


7 Dec 2016 - 13:16
Application open for the Master in Sound and Music Computing 2017-2018
28 Nov 2016 - 1 Jun 2017

Applications for the Master in Sound and Music Computing, program 2017-2018, are open online. There are 4 application periods (deadlines: January 16th, March 10th, April 28th, June 1st). For more information on the UPF master programs and on how to register for the SMC Master, check here. For other information on the SMC master check:

5 Dec 2016 - 11:01
Possibility of 3-year postdocs @MTG for researchers outside Spain

The Catalan government is opening a call, the Beatriu de Pinós program, for post-doc researchers to join Catalan universities.


  • Have a PhD between 01/01/2009 and 31/12/2014 (even later)
  • Minimum of 2 years of postdoctoral experience outside Spain.
  • Not having lived in Spain for more than 12 months in the last 3 years.


  • 2-year duration, extendable by 1 more year. Start date before January 1st 2018.
  • 32.800 EUR / year + 6.000 EUR for supporting research

Deadline: 01/12/2016

More info here.

21 Nov 2016 - 13:33
Master theses from SMC Master 2015-2016
15 Nov 2016 - 11:02
Talks by Dr. Eita Nakamura and Dr. Shinji Sako
15 Nov 2016

Dr. Eita Nakamura (Kyoto University, Japan) and Dr. Shinji Sako (Nagoya Institute of Technology, Japan) will be giving two talks:


"Rhythm Transcription of Piano Performances Based on Hierarchical Bayesian Modelling of Repetition and Modification of Musical Note Patterns" by Dr. Eita Nakamura. Kyoto University, Japan. (15th Nov, 17:00 h. Room 52.321)

We present a method of rhythm transcription (i.e., automatic recognition of note values in music performance signals) based on a Bayesian music language model that describes the repetitive structure of musical notes. Conventionally, music language models for music transcription are trained with a dataset of musical pieces. Because typical musical pieces have repetitions consisting of a limited number of note patterns, better models fitting individual pieces could be obtained by inducing compact grammars. The main challenges are inducing appropriate grammar for a score that is observed indirectly through a performance and capturing incomplete repetitions, which can be represented as repetitions with modifications. We propose a hierarchical Bayesian model in which the generation of a language model is described with a Dirichlet process and the production of musical notes is described with a hierarchical hidden Markov model (HMM) that incorporates the process of modifying note patterns. We derive an efficient algorithm based on Gibbs sampling for simultaneously inferring from a performance signal the score and the individual language model behind it. Evaluations showed that the proposed model outperformed previously studied HMM-based models.


"Real-time audio-to-score following and its applications" by Dr. Shinji Sako (and his students). Nagoya Institute of Technology, Japan. (15th Nov, 17:45 h. Room 52.321)

We present a robust on-line algorithm for real-time audio-to-score following based on a delayed decision and anticipation framework. We employ Segmental Conditional Random Fields and a Linear Dynamical System to model human musical performance. The combination of these models allows an efficient iterative decoding of score position and tempo. The combined advantages of our approach are the delayed-decision Viterbi algorithm, which utilizes future information to determine past score position with high reliability, thus improving alignment accuracy, and the fact that the future position can be anticipated using an adaptively estimated tempo. We also talk about interim progress of the research and some applications by using this

15 Nov 2016 - 10:09
Seminar on music knowledge extraction using machine learning
4 Dec 2016

Taking advantage of the researchers coming to Barcelona for the NIPS conference, on December 4th we are organizing a small, informal seminar to discuss various topics related to machine learning applied to music, with special emphasis on its knowledge extraction aspects.

Full program:

10 Nov 2016 - 19:02
Special ACM-TIST issue on Intelligent Music Systems

A recently published special issue of the ACM Transactions on Intelligent Systems and Technology presents some of the recent research on "Intelligent Music Systems", a research area that overlaps with some of the goals of the GiantSteps EU project. This special issue has been co-edited by Markus Schedl (Johannes Kepler University, Linz, Austria), Yi-Hsuan Yang (Academia Sinica, Taipei, Taiwan), and Perfecto Herrera-Boyer (Universitat Pompeu Fabra & ESMUC, Barcelona, Spain).

It includes, in addition to five technical papers covering topics such as tagging, recommendation, segmentation, music video analysis or score-alignment, two guest articles that reflect on the open issues and neglected topics that could help to improve the "intelligence" of future systems for music analysis and creation.

It also includes three papers co-authored by MTG members:

Schedl, M., Yang, Y., & Herrera, P. (2016). Introduction to Intelligent Music Systems and Applications

Oramas, S., Ostuni, V. C., Di Noia, T., Serra, X., & Di Sciascio, E. (2016). Sound and Music Recommendation with Knowledge Graphs

Rodriguez-Serrano, F., Carabias-Orti, J., Vera-Candeas, P., & Martinez-Muñoz, D. (2016). Tempo Driven Audio-to-Score Alignment Using Spectral Decomposition and Online Dynamic Time Warping



8 Nov 2016 - 11:15
Marie Curie PhD Fellowships for the MTG

Our Department is a hosting institution within the INPhINIT “la Caixa” Fellowships Programme (57 grants), and there are 4 proposals supervised by MTG researchers (see details below).


  • 3-year contract
  • €34,800 gross annual salary + €3,564 annual additional funding
  • Award of €7,500 for the PhD fellow if the thesis is presented within a period of 3.5 years
  • Additional training in transversal skills: technology transfer, entrepreneurship, professional development.
  • Research stays in academia and industry.
  • Participation in networking and outreach activities.
  • Deadline for incorporation of candidates: September/October 2017.


Fellowship eligibility:

  • Be in the first four years (full-time equivalent research experience) of their research careers and not yet have been awarded a doctoral degree.
  • Not have resided or carried out their main activity (work, studies, etc.) in Spain for more than 12 months in the 3 years immediately prior to the recruitment date. Short stays such as holidays will not be taken into account.
  • Have a demonstrable level of English (B2 or higher).

PhD program admission:

  • Accredited undergraduate degree (Bachelor degree or recognised equivalent degree from an accredited Higher Education Institution).
  • Accredited graduate/master's degree (equivalent to a Spanish Máster Universitario/Oficial, Master of Research, etc.) which enables them to access a PhD programme in their home country.
  • A total of 300 ECTS credits, of which 60 must correspond to an official, research-oriented master's programme.

Selection criteria


  • Academic records and CV (50%).
  • Motivation and goals declaration (30%): originality, innovation, impact and link with the selected Research Centre.
  • Recommendation letters (20%).


  • Potential (40%), motivation and impact (20%), CV (30%).

Important dates

  • Website open for applications: November 7th, 2016.
  • Application deadline: February 2nd, 2017.

How to apply
Applications are managed through the program website. More information about the procedure:

For additional information on the proposed topics please contact the PI!

7 Nov 2016 - 10:27
CompMusic Seminar
18 Nov 2016
On Friday, November 18th 2016, from 10:00h to 18:30h in room 55.410 of the Communication Campus of the Universitat Pompeu Fabra in Barcelona, we will have a CompMusic seminar. This seminar accompanies the PhD thesis defenses of Ajay Srinivasamurthy and Sankalp Gulati, carried out in the context of the CompMusic project.
10:00 Simon Dixon (QMUL, London)
"Music Similarity and Cover Song Identification: The Case of Jazz"
Similarity in music is an evasive and subjective concept, yet computational models of similarity are cited as important for addressing tasks such as music recommendation and the management of music collections. Cover song (or version) identification deals with a specific case of music similarity, where the underlying musical work is the same, but its realisation is different in each version, usually involving different performers and differing arrangements of the music, which may vary in instrumentation, form, tempo, key, lyrics or in other aspects of rhythm, melody, harmony and timbre. The new version retains some features of the original recording, and it is usually assumed that the sequential pitch content (corresponding to melody and harmony) is preserved with limited alterations from the original version.
In music information retrieval, a standard approach to version identification uses predominant melody extraction to represent melodic content and chroma features to represent harmonic content. These features are adapted to allow for variation in key or tempo between versions, and a pairwise sequence matching algorithm computes the pairwise similarity between tracks, which can be used to estimate groups of cover songs. Different versions of a jazz standard can be regarded as a set of cover songs, but the identification of such covers is more complicated than for many other styles of music, due to the improvisatory nature of jazz, which allows ornamentation and transformation of the melody as well as substitution of chords in the harmony. We report on experiments on a set of 300 jazz standards using discrete-valued and continuous-valued measures of pairwise predictability between sequences, based on work with a former PhD student, Peter Foster.
11:00 Geoffroy Peeters (IRCAM, Paris)
"Recent researches at IRCAM related to the recognition of rhythm, vocal imitations and music structure"
In this talk, I will present some recent research at IRCAM related to the description of rhythm (especially the use of the Fourier-Mellin transform or of the Modulation Scale Transform with Auditory statistics), the recognition of vocal imitations (using HMM decoding of SI-PLCA kernels over time), and the estimation of musical structure (using Convolutional Neural Networks).
12:00 Coffee break
12:30 Andre Holzapfel (KTH, Stockholm)
"Tracking time: State-of-the-art and open problems in meter inference"
In recent years, significant progress has been made in algorithmic approaches that aim at the recognition of metrical cycles and the tracking of their structure in music audio signals. The automatic adaptation to rhythmic patterns has made it possible to go beyond manually tailored tracking approaches, and deep learning based features increase the accuracy of the inference given an unknown audio signal. In principle, arbitrary time signatures can be recognized and tracked from a music recording, assuming the existence of a large enough representative dataset to learn from. In this talk a short summary of the state of the art will be provided, and open problems will be presented that represent potential subjects of future studies. These open problems comprise the tracking of metrical cycles of very long duration, the inclusion of modes beyond the acoustic signal, and a variety of subjects that arise within areas like performance studies, music theory, and ethnomusicology.
13:30 Lunch break
15:00 Barış Bozkurt (Koç University, Istanbul)
"Melodic analysis for Turkish makam music"
A makam generally implies a miscellany of rules for melodic composition, a design for melodic contour as a sequence of melodies (from specific categories) emphasising specific tones. This talk will start by presenting melody concepts in Turkish makam music and then continue discussing the methods, uses and automatisation of melodic analysis for that music tradition. A study on culture-specific automatic melodic segmentation (of scores) will be presented. Finally, we will discuss future perspectives for melodic analysis within the context of corpus-based study of makams.
16:00 Juan Pablo Bello (NYU, New York)
"Some Thoughts on the How, What and Why of Music Informatics Research"
The framework of music informatics research (MIR) can be thought of as a closed loop of data collection, algorithmic development and benchmarking. Much of what we do is heavily focused on the algorithmic aspects, or how to optimally combine various techniques from e.g., signal processing, data mining, and machine learning, to solve a variety of problems, from auto-tagging to automatic transcription, that captivate the interest of our community. We are very good at this, and in this talk I will describe some of the know-how that we have collectively accumulated over the years. On the other hand, I would argue that we are less proficient at clearly defining the “what” and “why” behind our work, that data collection and benchmarking have received far less attention and are often treated as afterthoughts, and that we sometimes tend to rely on widespread and limiting assumptions about music that affect the validity and usability of our research. On this, we can learn from other fields, such as music cognition, particularly with regards to the adoption of methods and practices that fully embrace the complexity and variability of human responses to music, while still clearly delineating the scope of the solutions or analyses being proposed.
17:00 Coffee break
17:30 Joan Serrà (Telefónica R+D, Barcelona)
"Facts and myths about deep learning"
Deep learning has revolutionized the traditional machine learning pipeline, with impressive results in domains such as computer vision, speech analysis, or natural language processing. The concept has gone beyond research/application environments, and permeated into the mass media, news blogs, job offers, startup investors, or big company executives' meetings. But what is behind deep learning? Why has it become so mainstream? What can we expect from it? In this talk, I will highlight a number of facts and myths that will provide a shallow answer to the previous questions. While doing that, I will also highlight a number of applications we have worked on at our lab. Overall, the talk aims to lay out a series of basic concepts, while giving ground for reflection or discussion on the topic.


18 Oct 2016 - 10:07