Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music

Title: Computational Analysis of Audio Recordings and Music Scores for the Description and Discovery of Ottoman-Turkish Makam Music
Publication Type: PhD Thesis
Year of Publication: 2016
University: Universitat Pompeu Fabra
Authors: Şentürk, S.
Advisor: Serra, X.
Academic Department: Department of Information and Communication Technologies
Number of Pages: 357
Date Published: 12/2016
Keywords: Arel-Ezgi-Uzdilek theory, audio recording, Audio Signal Processing, audio-score alignment, automatic description, automatic phrase segmentation, carnatic music, clique, composition identification, Cretan music, directed acyclic graphs, dynamic time warping, graph analysis, hindustani music, intonation, k nearest neighbor classification, k-means clustering, karar, kriti, Machine learning, makam, makam recognition, melodic progression, metadata, mode, music discovery, music information retrieval, music score, MusicBrainz, note alignment, ontology, open research, Ottoman-Turkish makam music, pitch-class distributions, predominant melody extraction, raag recognition, raga recognition, Reproducibility, research corpus, section linking, semiotic labeling, seyir, similarity, subsequence matching, tempo estimation, test dataset, Tonic, Tonic identification, Toolbox, transposition, tuning, variable-length Markov models, varnam
This thesis addresses several shortcomings of the current state-of-the-art methodologies in music information retrieval (MIR). In particular, it proposes several computational approaches to automatically analyze and describe music scores and audio recordings of Ottoman-Turkish makam music (OTMM). The main contributions of the thesis are the music corpus that has been created to carry out the research and the audio-score alignment methodology developed for the analysis of the corpus. In addition, several novel computational analysis methodologies are presented in the context of common MIR tasks of relevance for OTMM. Example tasks include predominant melody extraction, tonic identification, tempo estimation, makam recognition, tuning analysis, structural analysis and melodic progression analysis. These methodologies form part of a complete system called Dunya-makam for the exploration of large corpora of OTMM.
The thesis starts by presenting the created CompMusic Ottoman-Turkish makam music corpus. The corpus includes 2200 music scores, more than 6500 audio recordings, and accompanying metadata. The data has been collected, annotated and curated with the help of music experts. Using criteria such as completeness, coverage and quality, we validate the corpus and show its research potential. In fact, our corpus is the largest and most representative resource of OTMM that can be used for computational research. Several test datasets have also been created from the corpus to develop and evaluate the specific methodologies proposed for different computational tasks addressed in the thesis.
The part focusing on the analysis of music scores is centered on phrase and section level structural analysis. Phrase boundaries are automatically identified using an existing state-of-the-art segmentation methodology. Section boundaries are extracted using heuristics specific to the formatting of the music scores. Subsequently, a novel method based on graph analysis is used to establish similarities across these structural elements in terms of melody and lyrics, and to label the relations semiotically. 
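As a rough illustration of the labeling idea described above, the following minimal sketch groups structural units (phrases or sections) whose melodies are similar and gives each group a letter. It is not the method of the thesis: the similarity measure (`difflib.SequenceMatcher`), the threshold, and the use of connected components in place of the clique-based graph analysis named in the keywords are all simplifying assumptions, and `semiotic_labels` is a hypothetical name.

```python
from difflib import SequenceMatcher


def semiotic_labels(units, threshold=0.7):
    """Label similar units with the same letter (A, B, C, ...).

    `units` is a list of note sequences, one per phrase or section.
    Two units are linked when their similarity ratio reaches `threshold`;
    linked units are merged with a union-find structure (assumption:
    connected components stand in for the clique analysis of the thesis).
    """
    n = len(units)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the group representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            ratio = SequenceMatcher(None, units[i], units[j]).ratio()
            if ratio >= threshold:
                parent[find(j)] = find(i)

    # Name groups in order of first appearance
    # (letters only make sense for up to 26 groups).
    names, labels = {}, []
    for i in range(n):
        root = find(i)
        if root not in names:
            names[root] = chr(ord("A") + len(names))
        labels.append(names[root])
    return labels


# Toy note sequences (made-up numbers, not taken from the corpus).
example_units = [[1, 2, 3], [4, 5, 6], [1, 2, 3], [4, 5, 7]]
example_labels = semiotic_labels(example_units, threshold=0.6)
```

With these toy sequences, the first and third units are identical and the second and fourth share two of three notes, so the units fall into two groups.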
The audio analysis section of the thesis reviews the state-of-the-art for analyzing the melodic aspects of performances of OTMM. It proposes adaptations of existing predominant melody extraction methods tailored to OTMM. It also presents improvements over pitch-distribution-based tonic identification and makam recognition methodologies. 
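To make the notion of a pitch-distribution-based feature concrete, here is a minimal sketch of a pitch-class distribution computed from a predominant melody track, followed by a nearest-template lookup. The bin width, the convention that unvoiced frames are marked with 0, the L1 distance, and the names `pitch_class_distribution` and `nearest_label` are assumptions made for illustration; the thesis's own parameters and classifier may differ.

```python
import math


def pitch_class_distribution(pitch_hz, tonic_hz, bin_cents=25):
    """Normalized histogram of pitch samples folded into one octave.

    Each frequency is converted to cents relative to the tonic and
    wrapped modulo 1200, so octave-equivalent pitches share a bin.
    """
    n_bins = 1200 // bin_cents
    hist = [0.0] * n_bins
    for f in pitch_hz:
        if f <= 0:  # assumption: unvoiced frames are marked with 0
            continue
        cents = 1200.0 * math.log2(f / tonic_hz)
        # Bins are centered on multiples of bin_cents.
        idx = int(round(cents / bin_cents)) % n_bins
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def nearest_label(query, templates):
    """Return the label whose template has the smallest L1 distance."""
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))
    return min(templates, key=lambda label: l1(query, templates[label]))


# Toy input: tonic, its octave, a fifth above the tonic, one unvoiced frame.
pcd = pitch_class_distribution([220.0, 440.0, 330.0, 0.0], 220.0, bin_cents=100)

# Hypothetical templates; real ones would be learned from annotated recordings.
templates = {"makam_A": pcd, "makam_B": [1.0 / 12] * 12}
best = nearest_label(pcd, templates)
```

Folding to a single octave makes the feature insensitive to the register of the performance, while expressing pitch relative to the tonic makes it insensitive to transposition.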
The audio-score alignment methodology is the core of the thesis. It addresses the culture-specific challenges posed by the musical characteristics, music theory related representations and oral praxis of OTMM. Based on several techniques such as subsequence dynamic time warping, Hough transform and variable-length Markov models, the audio-score alignment methodology is designed to handle the structural differences between music scores and audio recordings. The method is robust to the presence of non-notated melodic expressions, tempo deviations within the music performances, and differences in tonic and tuning. The methodology utilizes the outputs of the score and audio analysis, and links the audio and the symbolic data. In addition, the alignment methodology is used to obtain a score-informed description of audio recordings. The score-informed audio analysis not only simplifies the audio feature extraction steps that would require sophisticated audio processing approaches but also substantially improves the performance compared with results obtained from the state-of-the-art methods solely relying on audio data.
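Subsequence dynamic time warping, one of the techniques named above, can be sketched in a few lines. The version below is the generic textbook formulation on one-dimensional sequences with an absolute-difference local cost. It omits the Hough transform and variable-length Markov model stages, and the function name and return convention are assumptions, not the thesis's implementation.

```python
def subsequence_dtw(query, target):
    """Find the span of `target` that best matches all of `query`.

    Returns (cost, start, end), with `end` inclusive. The match may
    begin and end anywhere in `target`, and steps may stretch or
    compress time, which tolerates local tempo deviations.
    """
    n, m = len(query), len(target)
    inf = float("inf")
    # cost[i][j]: best accumulated cost aligning query[:i + 1] so that
    # query[i] is matched to target[j].
    cost = [[inf] * m for _ in range(n)]
    # start[i][j]: index in target where that best path begins.
    start = [[0] * m for _ in range(n)]

    for j in range(m):
        cost[0][j] = abs(query[0] - target[j])  # free starting point
        start[0][j] = j

    for i in range(1, n):
        cost[i][0] = cost[i - 1][0] + abs(query[i] - target[0])
        for j in range(1, m):
            prev_cost, prev_start = min(
                (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal
                (cost[i - 1][j], start[i - 1][j]),          # query advances
                (cost[i][j - 1], start[i][j - 1]),          # target advances
            )
            cost[i][j] = prev_cost + abs(query[i] - target[j])
            start[i][j] = prev_start

    end = min(range(m), key=lambda j: cost[n - 1][j])  # free end point
    return cost[n - 1][end], start[n - 1][end], end


# Toy example: a three-note "score" fragment located inside a longer "recording".
exact = subsequence_dtw([1, 2, 3], [9, 9, 1, 2, 3, 9])
# The same fragment performed at half tempo (each value repeated).
stretched = subsequence_dtw([1, 2, 3], [9, 1, 1, 2, 2, 3, 3, 9])
```

In the first example the fragment is found at positions 2 to 4 with zero cost; in the second, the warping path absorbs the repeated values, so the cost is still zero.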
The analysis methodologies presented in the thesis are applied to the CompMusic Ottoman-Turkish makam music corpus and integrated into a web application aimed at culture-aware music discovery. Some of the methodologies have already been applied to other music traditions such as Hindustani, Carnatic and Greek music. Following open research best practices, all the created data, software tools, and analysis results are openly available. The methodologies, the tools and the corpus itself provide vast opportunities for future research in many fields such as music information retrieval, computational musicology, and music education.
Final publication
Additional material: 

The companion webpage, which contains data, code, results and other related resources, is hosted on the CompMusic project website at:

A mirror of the companion page may also be accessed on Sertan Şentürk's personal website at: