On the Use of Note Onsets for Improved Lyrics-to-audio Alignment in Turkish Makam Music

Title: On the Use of Note Onsets for Improved Lyrics-to-audio Alignment in Turkish Makam Music
Publication Type: Conference Paper
Year of Publication: 2016
Conference Name: 17th International Society for Music Information Retrieval Conference (ISMIR 2016)
Authors: Dzhambazov, G., Srinivasamurthy A., Şentürk S., & Serra X.
Conference Start Date: 05/08/2016
Conference Location: New York (USA)
Abstract: Lyrics-to-audio alignment aims to automatically match given lyrics and musical audio. In this work we extend a state-of-the-art approach for lyrics-to-audio alignment with information about note onsets. In particular, we consider the fact that a transition to the next lyrics syllable usually implies a transition to a new musical note. To this end we formulate rules that guide the transition between consecutive phonemes when a note onset is present. These rules are incorporated into the transition matrix of a variable-time hidden Markov model (VTHMM) phonetic recognizer based on MFCCs. An estimated melodic contour is input to an automatic note transcription algorithm, from which the note onsets are derived. The proposed approach is evaluated on 12 a cappella audio recordings of Turkish Makam music using a phrase-level accuracy measure. Evaluation of the alignment is also presented on a polyphonic version of the dataset in order to assess how degradation in the extracted onsets affects performance. Results show that the proposed model outperforms a baseline approach unaware of onset transition rules. To the best of our knowledge, this is one of the first approaches to lyrics tracking that combines timbral features with a melodic feature in the alignment process itself.
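The idea of letting note onsets guide phoneme transitions can be illustrated with a minimal sketch. It is not the rule set from the paper or from the linked code: the function name, the `syllable_final` flags, and the `boost` parameter are all hypothetical, and the single rule used here (advance more readily at a syllable boundary when an onset is detected) is an assumed simplification. It only shows how transition probabilities in a left-to-right phoneme chain can become frame-dependent.

```python
def onset_aware_transitions(stay_probs, syllable_final, onset_frames,
                            num_frames, boost=0.5):
    """Per-frame (stay, advance) probabilities for a left-to-right phoneme chain.

    stay_probs     -- baseline self-transition probability of each phoneme state
    syllable_final -- True where leaving the state enters a new syllable
    onset_frames   -- frame indices at which a note onset was detected
    boost          -- hypothetical fraction of the stay probability that is
                      moved to the advance transition at an onset
    """
    onsets = set(onset_frames)
    table = []
    for t in range(num_frames):
        row = []
        for p_stay, is_final in zip(stay_probs, syllable_final):
            # Assumed rule: a note onset at a syllable boundary makes
            # moving on to the next phoneme more likely.
            if t in onsets and is_final:
                p_stay = p_stay * (1.0 - boost)
            # Each row stays normalized: stay + advance == 1.
            row.append((p_stay, 1.0 - p_stay))
        table.append(row)
    return table


# Two phoneme states; only the second one ends a syllable.
# A note onset is detected at frame 1 of 2.
transitions = onset_aware_transitions(
    stay_probs=[0.8, 0.8],
    syllable_final=[False, True],
    onset_frames=[1],
    num_frames=2,
)
```

In this sketch the frames without an onset keep the baseline probabilities, so the model falls back to an onset-unaware aligner wherever the transcription finds no notes.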
Additional material: 

code: https://github.com/georgid/AlignmentDuration/tree/noteOnsets

a cappella dataset:
http://compmusic.upf.edu/turkish-makam-acapella-sections-dataset

Onsets are annotated for part of the dataset. The annotations are in the branch vocal-only:
http://compmusic.upf.edu/node/233

(oracle on 4 recordings: see commit https://github.com/MTG/otmm_audio_score_alignment_dataset/commit/4eff4e9...)

FUTURE: annotate onsets on more recordings


poster:
https://drive.google.com/file/d/0B4bIMgQlCAuqNTlYYWZBNldsdTQ/view?usp=sh...


FUTURE: Concatenate the TextGrid annotation files of the segmented a cappella files for easier evaluation. So far this has been done for the first 6 recordings.
https://github.com/georgid/AlignmentDuration/issues/32
