News and Events
OVAL by OVAL SOUND
9 Mar 2017
Alex Posada, co-founder of Oval Sound, will give a talk on their music tech startup and the Oval, the first digital Handpan ever developed. The talk will take place on Thursday, March 9th 2017, at 12:00h in room 55.410 of the Communication Campus.
Founded in 2014 by Ravid Goldschmidt and Alex Posada in Barcelona (Spain), OVAL SOUND is the music technology startup behind Oval, the first digital Handpan ever developed.
The Oval is a new electronic musical instrument designed from the ground up to revolutionize music creation, learning and live performance. This next-generation instrument has sensitive multi-gesture sensors that combine the expressiveness of an acoustic instrument with the flexibility of modern digital synthesis. To take full advantage of each subtle gesture, the Oval comes with its own multi-platform synthesis and effects app; paired together, they create a truly immersive playing experience.
The Oval app supports both iOS and Android, and comes with custom high-quality sound libraries ready to play. Or if you're feeling adventurous, you can connect the Oval to your favorite MIDI-enabled music apps, synths and even lighting rigs - the possibilities are endless.
The project was launched via a crowdfunding campaign on Kickstarter, which remains one of the most successful campaigns launched on the platform in 2015.
Seminar by Geraint Wiggins on Computational Creativity
2 Mar 2017
Geraint A. Wiggins, from Queen Mary University of London, gives a seminar on "Creativity, deep symbolic learning, and the information dynamics of thinking" on Thursday, March 2nd 2017, at 15:30h in room 55.309 of the Communication Campus of the UPF.
Abstract: I present a hypothetical theory of cognition which is based on the principle that mind/brains are information processors and compressors that are sensitive to certain measures of information content, as defined by Shannon (1948). The model is intended to help explicate processes of anticipatory and creative reasoning in humans and other higher animals. The model is motivated by the evolutionary value of prediction in information processing in an information-overloaded world.
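For readers unfamiliar with the measures the abstract refers to, the following is a minimal sketch of information content in the sense of Shannon (1948). It is a generic illustration, not the model presented in the seminar: the function names and the next-note probabilities are assumptions made up for this example.

```python
import math


def information_content(p):
    """Information content (surprisal), in bits, of an event with probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


def entropy(distribution):
    """Expected information content, in bits, of a discrete distribution."""
    return sum(p * information_content(p) for p in distribution if p > 0)


# Hypothetical probabilities that some predictive model assigns to the next note.
next_note = {"C": 0.5, "D": 0.25, "E": 0.125, "F": 0.125}

# Unlikely notes carry more information (are more surprising) than likely ones.
surprisal = {note: information_content(p) for note, p in next_note.items()}

# Entropy summarizes how uncertain the prediction is before the note is heard.
uncertainty = entropy(next_note.values())
```

In this toy distribution the most likely note carries 1 bit and the least likely notes carry 3 bits each, which is the kind of quantity a prediction-driven account of cognition can be sensitive to.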
YoMo: Presentation of the TELMI project at the Mobile World Congress
On February 27th at 1 PM we will present the TELMI project at the Youth Mobile Festival (YoMo), which is part of the GSMA Mobile World Congress. The festival is aimed at students from 10 to 16 years old, and 15,000 students are expected to visit.
The presentation will focus on how interactive technologies can help in instrument learning. We will show some demos, and participants will be able to try some prototypes.
Find here more details of the activity:
Journal paper on the QUARTET dataset and the Repovizz system
Over the past few years, there has been an increasingly active discussion about publishing and accessing datasets for reuse in academic research. Although sometimes driven by concrete needs concerning a particular dataset or project, this topic is far from incidental. In a data-driven research community like ours, it is very healthy to exchange ideas and perspectives on how to devise flexible means for making our data and results accessible, a valuable pursuit towards supporting research reproducibility.
The Music Technology Group of UPF hosts and provides free access to a number of datasets for music and audio research. As is normally the case with other published datasets, one needs to download the data files and procure means for exploring them locally. If limited to audio files or annotations, such a process does not generally bring significant difficulties other than data volume. However, as the number and nature of modalities, extracted descriptors, and annotations increase (think of motion capture, video, physiological signals, time series of different sample rates, etc.), difficulties arise not only in the design or adoption of formatting schemes, but also in the availability of platforms that enable and facilitate exchange by providing simple ways to remotely visualize or explore the data before downloading.
In the context of several recent projects focused on music performance analysis and multimodal interaction, we had to collect, process, and annotate music performance recordings that included dense data streams of different modalities. Envisioning the future release of our dataset for the research community, we realized the need for better means to explore and exchange data. Since then, at UPF we have been developing Repovizz, a remote hosting platform for multimodal data storage, visualization, annotation, and selective retrieval via a web interface and a dedicated API.
By way of the recently published article E. Maestre, P. Papiotis, M. Marchini, Q. Llimona, O. Mayor, A. Pérez, M. Wanderley, "Enriched Multimodal Representations of Music Performances: Online Access and Visualization", IEEE MultiMedia, Vol. 24:1, pp. 24-34, 2017, we introduce Repovizz to the MIR community and open access to the QUARTET dataset, a fully annotated collection of string quartet multimodal recordings released through Repovizz.
For a short, unadorned video demonstrating Repovizz, please go to http://www.youtube.com/watch?v=JcHbGtltuG4. Although still under development, Repovizz can be used by anyone in the academic community.
The QUARTET dataset comprises 96 recordings of string quartet exercises involving solo and ensemble conditions, containing multichannel audio (ambient microphones and piezoelectric pickups), video, motion capture (optical and magnetic) of instrumental gestures and of musician upper bodies, computed bowing gesture signals, extracted audio descriptors, and multitrack score-performance alignment. The dataset, processed and curated over the past years partly in the context of the PhD dissertation work of Panos Papiotis on ensemble interdependence, is now freely available for the research community.
MUTEK'17: Artist in residence at MTG
As in previous years, we have established a collaboration with the MUTEK festival to promote the use of our technologies within the artist community.
This year there is an open call for artists to create a work using technologies developed in the context of the RAPID-MIX project.
The selected artist will work at the MTG and will have the support of researchers during the residency. The final work will be presented as part of the program of the MUTEK festival on March 9th at Mazda Space.
Open call: artist-in-residence mutek8
Journal paper on orchestral music source separation along with a new dataset
We are glad to announce the publication of a journal paper on orchestral music source separation along with the PHENICX-Anechoic dataset. The methods were prototyped during the PHENICX project and were used for tasks such as orchestra focus/instrument enhancement. To our knowledge, this is the first time source separation is objectively evaluated in such a complex scenario.
M. Miron, J. Carabias-Orti, J. J. Bosch, E. Gómez and J. Janer, "Score-informed source separation for multi-channel orchestral recordings", Journal of Electrical and Computer Engineering (2016).
Abstract: This paper proposes a system for score-informed audio source separation for multichannel orchestral recordings. The orchestral music repertoire relies on the existence of scores. Thus, a reliable separation requires a good alignment of the score with the audio of the performance. To that extent, automatic score alignment methods are reliable when allowing a tolerance window around the actual onset and offset. Moreover, several factors increase the difficulty of our task: a high reverberant image, large ensembles having rich polyphony, and a large variety of instruments recorded within a distant-microphone setup. To solve these problems, we design context-specific methods such as the refinement of score-following output in order to obtain a more precise alignment. Moreover, we extend a close-microphone separation framework to deal with the distant-microphone orchestral recordings. Then, we propose the first open evaluation dataset in this musical context, including annotations of the notes played by multiple instruments from an orchestral ensemble. The evaluation aims at analyzing the interactions of important parts of the separation framework on the quality of separation. Results show that we are able to align the original score with the audio of the performance and separate the sources corresponding to the instrument sections.
The PHENICX-Anechoic dataset includes audio and annotations useful for tasks such as score-informed source separation, score following, multi-pitch estimation, transcription or instrument detection, in the context of symphonic music. This dataset is based on the anechoic recordings described in this paper:
Pätynen, J., Pulkki, V., and Lokki, T., "Anechoic recording system for symphony orchestra," Acta Acustica united with Acustica, vol. 94, nr. 6, pp. 856-865, November/December 2008.
For more information about the dataset and how to download it, see the PHENICX-Anechoic web page.
SMC students win at the Bulgaria Music Hackathon
Sound and Music Computing Master's students Manaswi Mishra, Kushagra Sharma and Siddharth Bhardwaj won the Music Vision category in the Bulgaria Music Hackathon with the project "Samsara - a virtual interactive soundscape", and earned a sponsored visit to the monthly Music Hackathon in New York City, USA.
The project is a virtual interactive soundscape where multiple performers can create and manipulate music through their smartphones or touch interfaces. The virtual world is a physics-based system where different types of virtual atoms interact with each other and with the performer. Each performer has a role in the collaborative soundscape: a creator (sound generator atoms), a preserver (filter and fx atoms) or a destroyer (reducing the number of generator, i.e. creator, atoms in the system). Performers can create, delete and manipulate their type of atoms in the dynamic system of moving atoms and collisions to create a collaborative soundscape.
More information about the project can be found here: Presentation at Hackathon.
A report about the event and an interview on live Bulgarian national TV: Eurocom TV interview.
Tenure Track position in Machine Learning at UPF
The Unit of Engineering in Information and Communication Technologies (ETIC) (https://www.upf.edu/etic/) at the Pompeu Fabra University in Barcelona is opening an academic position in Machine Learning.
ETIC has a highly interdisciplinary research environment, with faculty interests covering a broad range of topics that can be grouped into four main areas, with data-driven research cutting across all of them (http://www.upf.edu/web/etic/research):
Multimedia Technologies, covering topics in image processing and cinematography, sound and music computing, human-computer interaction, graphics and educational technologies, and cognition as it relates to multimedia.
Computational Biology and Biomedical Systems including topics such as computational neuroscience, analysis of biomedical data, nonlinear signal analysis in biological systems, instrumentation and biomedical electronics, computational simulation and biomechanics, medical imaging and modeling of complex biomedical systems.
Computation and Intelligent Systems covering foundations of computer science, several aspects of artificial intelligence such as planning, natural language processing, computer vision, machine learning, ubiquitous computing, information retrieval and data mining, and human cognition and its relation to robotics.
Networking and Communications with topics related to both wired and wireless networks, network science, information theory and coding, as well as policy aspects and strategies related to networking technologies.
Currently ETIC is strengthening its research interests in these four areas, with special interest in Machine Learning. We are looking for a junior scientist with strong research interests in any of these areas but with a special focus on Machine Learning, and we seek applications for a Tenure-Track position. The appointed applicant will progressively lead a new research group, in close collaboration with the existing ones, and will be involved in the undergraduate and postgraduate teaching duties of the Department, in particular with the deployment of the new Degree in Mathematical Engineering of Data Science.
Candidates should hold a PhD degree and have an excellent scientific-technical background in Machine Learning, including (but not restricted to) topics such as Multimedia Technologies, Computational Biology and Biomedical Systems, Computation and Intelligent Systems or Networking and Communications. The candidate should also have an interest in confronting and resolving scientific problems and challenges, and in leading research publications in major journals.
A motivation letter, curriculum vitae, list of publications, research plan and addresses of three potential referees should be sent as a single PDF to randp [dot] dtic [at] upf [dot] edu, before March 24th 2017.
Escolab 2017: presentation of 28x28 project
We will participate in Escolab 2017 presenting the project "28x28" developed by the artist Xavier Bonfill and the MTG researcher Frederic Font with the support of Phonos.
28x28 makes use of Freesound and Essentia in a system that allows interactive real-time music compositions to be created by playing a game of dominoes.
The activity will be held in the Sala Polivalent where, after an introduction to the technical and artistic work behind the project, the participants will be able to try the installation and create their own music work.
This project shows the type of interaction between research and artistic creation promoted by the MTG.
Information: Escolab UPF
Gopala K. Koduri and Sertan Şentürk defend their PhD theses
22 Feb 2017
Wednesday, February 22nd 2017 at 11:00h in room 55.309 (Tanger Building, UPF Communication Campus)
Gopala K. Koduri: “Towards a multimodal knowledge base for Indian art music: A case study with melodic intonation”
Thesis director: Xavier Serra
Thesis Committee: Anja Volk (Utrecht University), Baris Bozkurt (Koç University) and George Fazekas (QMUL)
Abstract: This thesis is a result of our research efforts in building a multi-modal knowledge-base for the specific case of Carnatic music. Besides making use of metadata and symbolic notations, we process natural language text and audio data to extract culturally relevant and musically meaningful information and structure it with formal knowledge representations. This process broadly consists of two parts. In the first part, we analyze the audio recordings for intonation description of pitches used in the performances. We conduct a thorough survey and evaluation of the previously proposed pitch distribution based approaches on a common dataset, outlining their merits and limitations. We propose a new data model to describe pitches to overcome the shortcomings identified. This expands the perspective of the note model in vogue to cater to the conceptualization of melodic space in Carnatic music. We put forward three different approaches to retrieve a compact description of pitches used in a given recording employing our data model. We qualitatively evaluate our approaches comparing the representations of pitches obtained from our approaches with those from a manually labeled dataset, showing that our data model and approaches have resulted in representations that are very similar to the latter. Further, in a raaga classification task on the largest Carnatic music dataset so far, two of our approaches are shown to outperform the state-of-the-art by a statistically significant margin.
In the second part, we develop knowledge representations for various concepts in Carnatic music, with a particular emphasis on the melodic framework. We discuss the limitations of the current semantic web technologies in expressing the order in sequential data that curtails the application of logical inference. We present our use of rule languages to overcome this limitation to a certain extent. We then use open information extraction systems to retrieve concepts, entities and their relationships from natural language text concerning Carnatic music. We evaluate these systems using the concepts and relations from knowledge representations we have developed, and ground truth curated using Wikipedia data. Thematic domains like Carnatic music have a limited volume of data available online. Considering that these systems are built for web-scale data where repetitions are taken advantage of, we compare their performances qualitatively and quantitatively, emphasizing characteristics desired for cases such as this. The retrieved concepts and entities are mapped to those in the metadata. In the final step, using the knowledge representations developed, we publish and integrate the information obtained from different modalities to a knowledge-base. On this resource, we demonstrate how linking information from different modalities allows us to deduce conclusions which otherwise would not have been possible.
Wednesday, February 22nd 2017 at 16:00h in room 55.309 (Tanger Building, UPF Communication Campus)
Sertan Şentürk: “Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music”
Thesis director: Xavier Serra
Thesis Committee: Gerhard Widmer (Johannes Kepler University), Baris Bozkurt (Koç University) and Tillman Weyde (City, University of London)
Abstract: This thesis addresses several shortcomings of the current state-of-the-art methodologies in music information retrieval (MIR). In particular, it proposes several computational approaches to automatically analyze and describe music scores and audio recordings of Ottoman-Turkish makam music (OTMM). The main contributions of the thesis are the music corpus that has been created to carry out the research and the audio-score alignment methodology developed for the analysis of the corpus. In addition, several novel computational analysis methodologies are presented in the context of common MIR tasks of relevance for OTMM. Some example tasks are predominant melody extraction, tonic identification, tempo estimation, makam recognition, tuning analysis, structural analysis and melodic progression analysis. These methodologies become a part of a complete system called Dunya-makam for the exploration of large corpora of OTMM.
The thesis starts by presenting the created CompMusic Ottoman-Turkish makam music corpus. The corpus includes 2200 music scores, more than 6500 audio recordings, and accompanying metadata. The data has been collected, annotated and curated with the help of music experts. Using criteria such as completeness, coverage and quality, we validate the corpus and show its research potential. In fact, our corpus is the largest and most representative resource of OTMM that can be used for computational research. Several test datasets have also been created from the corpus to develop and evaluate the specific methodologies proposed for different computational tasks addressed in the thesis.
The part focusing on the analysis of music scores is centered on phrase and section level structural analysis. Phrase boundaries are automatically identified using an existing state-of-the-art segmentation methodology. Section boundaries are extracted using heuristics specific to the formatting of the music scores. Subsequently, a novel method based on graph analysis is used to establish similarities across these structural elements in terms of melody and lyrics, and to label the relations semiotically.
The audio analysis section of the thesis reviews the state-of-the-art for analysing the melodic aspects of performances of OTMM. It proposes adaptations of existing predominant melody extraction methods tailored to OTMM. It also presents improvements over pitch-distribution-based tonic identification and makam recognition methodologies.
The audio-score alignment methodology is the core of the thesis. It addresses the culture-specific challenges posed by the musical characteristics, music theory related representations and oral praxis of OTMM. Based on several techniques such as subsequence dynamic time warping, Hough transform and variable-length Markov models, the audio-score alignment methodology is designed to handle the structural differences between music scores and audio recordings. The method is robust to the presence of non-notated melodic expressions, tempo deviations within the music performances, and differences in tonic and tuning. The methodology utilizes the outputs of the score and audio analysis, and links the audio and the symbolic data. In addition, the alignment methodology is used to obtain score-informed description of audio recordings. The score-informed audio analysis not only simplifies the audio feature extraction steps that would require sophisticated audio processing approaches, but also substantially improves the performance compared with results obtained from the state-of-the-art methods solely relying on audio data.
The analysis methodologies presented in the thesis are applied to the CompMusic Ottoman-Turkish makam music corpus and integrated into a web application aimed at culture-aware music discovery. Some of the methodologies have already been applied to other music traditions such as Hindustani, Carnatic and Greek music. Following open research best practices, all the created data, software tools and analysis results are openly available. The methodologies, the tools and the corpus itself provide vast opportunities for future research in many fields such as music information retrieval, computational musicology and music education.
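As background on one of the techniques the abstract names, the following is a minimal, generic sketch of subsequence dynamic time warping: finding where a short query sequence best matches inside a longer reference. It is a textbook formulation and not the alignment method of the thesis. The function name, the absolute-difference cost and the toy pitch values are all assumptions for illustration.

```python
def subsequence_dtw(query, reference):
    """Find the span of `reference` that best matches `query`.

    Returns (cost, start, end), where reference[start:end + 1] is the
    matched span and cost is the accumulated absolute difference.
    """
    n, m = len(query), len(reference)
    cost = [[0.0] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]

    # First row: the match may begin at any reference position at no extra cost.
    for j in range(m):
        cost[0][j] = abs(query[0] - reference[j])
        start[0][j] = j

    # First column: all query elements so far are mapped to reference[0].
    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + abs(query[i] - reference[0])
        start[i][0] = 0

    for i in range(1, n):
        for j in range(1, m):
            # Allowed steps: diagonal, vertical and horizontal.
            candidates = [
                (cost[i - 1][j - 1], start[i - 1][j - 1]),
                (cost[i - 1][j], start[i - 1][j]),
                (cost[i][j - 1], start[i][j - 1]),
            ]
            best_cost, best_start = min(candidates, key=lambda c: c[0])
            cost[i][j] = abs(query[i] - reference[j]) + best_cost
            start[i][j] = best_start

    # The match may end at any reference position: take the cheapest one.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


# Toy example: a three-value "score fragment" located inside a longer "recording".
fragment = [1, 2, 3]
recording = [0, 0, 1, 2, 3, 0, 0]
result = subsequence_dtw(fragment, recording)
```

Unlike plain dynamic time warping, the free start and free end let the query align to only part of the reference, which is why the technique suits matching a score section against a full recording.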