Creating Corpora for Computational Research in Arab-Andalusian Music

Sordo, M.; Amin Chaachoo; Xavier Serra

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	Creating Corpora for Computational Research in Arab-Andalusian Music
Publication Type	Conference Paper
Year of Publication	2014
Conference Name	1st International Workshop on Digital Libraries for Musicology
Authors	Sordo, M., Chaachoo, A., & Serra, X.
Pagination	1-3
Conference Start Date	12/09/2014
Conference Location	London
Abstract	Research corpora are fundamental for the computational study of music. Defining the design criteria with which to create them is a research task in itself. These corpora need to be well suited to the specific research problems to be addressed. Since these research problems are also shaped by musical, cultural and other specific aspects of the music traditions to be studied, the research corpora should take these specificities into account. In this paper we address the problems of creating corpora for computational research on Arab-Andalusian music, considering several relevant criteria for creating such corpora. We focus on the problems raised during the annotation process of the corpora, specifically the language issues surrounding this art music tradition. Following the criteria, we created a research corpus consisting of audio recordings with their corresponding metadata, lyrics and music scores. So far we have gathered 338 recordings from 3 different Arab-Andalusian music schools of Morocco, covering most of the musical modes, rhythms and forms of this art music tradition. The Arab-Andalusian corpus is accessible to the research community from a central online repository. Moreover, the audio recordings of this corpus are freely available through the Internet Archive repository. The Arab-Andalusian corpus can be used to generate test datasets, which can serve as ground truth for evaluating several computational research tasks.
preprint/postprint document	http://hdl.handle.net/10230/35357
Final publication	http://doi.org/10.1145/2660168.2660182