On the Use of Note Onsets for Improved Lyrics-to-audio Alignment in Turkish Makam Music

Georgi Dzhambazov; Ajay Srinivasamurthy; Sertan Şentürk; Xavier Serra

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	On the Use of Note Onsets for Improved Lyrics-to-audio Alignment in Turkish Makam Music
Publication Type	Conference Paper
Year of Publication	2016
Conference Name	17th International Society for Music Information Retrieval Conference (ISMIR 2016)
Authors	Dzhambazov, G., Srinivasamurthy, A., Şentürk, S., & Serra, X.
Pagination	716-722
Conference Start Date	05/08/2016
Conference Location	New York (USA)
Abstract	Lyrics-to-audio alignment aims to automatically match given lyrics and musical audio. In this work we extend a state-of-the-art approach for lyrics-to-audio alignment with information about note onsets. In particular, we consider the fact that a transition to the next lyrics syllable usually implies a transition to a new musical note. To this end we formulate rules that guide the transition between consecutive phonemes when a note onset is present. These rules are incorporated into the transition matrix of a variable-time hidden Markov model (VTHMM) phonetic recognizer based on MFCCs. An estimated melodic contour is input to an automatic note transcription algorithm, from which the note onsets are derived. The proposed approach is evaluated on 12 a cappella audio recordings of Turkish Makam music using a phrase-level accuracy measure. Evaluation of the alignment is also presented on a polyphonic version of the dataset in order to assess how degradation in the extracted onsets affects performance. Results show that the proposed model outperforms a baseline approach unaware of onset transition rules. To the best of our knowledge, this is one of the first approaches tackling lyrics tracking that combines timbral features with a melodic feature in the alignment process itself.
preprint/postprint document	http://hdl.handle.net/10230/33116
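
Illustration: the abstract's central idea (a detected note onset should make a transition into a new syllable more likely) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' implementation and not the code in the repository listed below. It uses a plain left-to-right HMM with frame-dependent transitions rather than the VTHMM of the paper, and the function names, the `onset_weight` parameter and the boosting rule are all hypothetical.

```python
import math


def transition_probs(p_advance, onset_here, crosses_syllable, onset_weight=0.8):
    """Return (stay, advance) probabilities for one state at one frame.

    Hypothetical rule: at a frame with a note onset, a transition that
    starts a new syllable is pulled toward probability 1 by onset_weight.
    """
    if onset_here and crosses_syllable:
        p_advance = p_advance + onset_weight * (1.0 - p_advance)
    return 1.0 - p_advance, p_advance


def safe_log(x):
    """Logarithm that maps 0 to -inf instead of raising."""
    return math.log(x) if x > 0.0 else float("-inf")


def align(log_obs, crosses_syllable, onsets, p_advance=0.1, onset_weight=0.8):
    """Viterbi forced alignment through a left-to-right chain of states.

    log_obs[t][s]: log-likelihood of frame t under state s
        (in a real system this would come from MFCC-based phoneme models).
    crosses_syllable[s]: True if moving from state s to s + 1 starts a new
        syllable (length: number of states minus one).
    onsets[t]: True if a note onset was detected at frame t.
    Returns the state index assigned to each frame.
    """
    n_frames, n_states = len(log_obs), len(log_obs[0])
    neg_inf = float("-inf")
    delta = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    delta[0][0] = log_obs[0][0]  # the path must start in the first state

    for t in range(1, n_frames):
        for s in range(n_states):
            # Self-loop; the last state has nowhere to advance to.
            if s < n_states - 1:
                stay, _ = transition_probs(
                    p_advance, onsets[t], crosses_syllable[s], onset_weight)
            else:
                stay = 1.0
            best, arg = delta[t - 1][s] + safe_log(stay), s
            # Advance from the previous state in the chain.
            if s > 0:
                _, adv = transition_probs(
                    p_advance, onsets[t], crosses_syllable[s - 1], onset_weight)
                cand = delta[t - 1][s - 1] + safe_log(adv)
                if cand > best:
                    best, arg = cand, s - 1
            delta[t][s] = best + log_obs[t][s]
            back[t][s] = arg

    # Backtrace from the last state, which the path is required to reach.
    state = n_states - 1
    path = [state]
    for t in range(n_frames - 1, 0, -1):
        state = back[t][state]
        path.append(state)
    return path[::-1]
```

With uninformative observations (identical likelihoods for every state), the position of the syllable boundary in this sketch is decided by the transition probabilities alone, so marking a frame in `onsets` moves the boundary to that frame. In practice the onset evidence would be combined with informative timbral likelihoods.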

Additional material:

code: https://github.com/georgid/AlignmentDuration/tree/noteOnsets

a cappella dataset:
http://compmusic.upf.edu/turkish-makam-acapella-sections-dataset

Onsets are annotated for part of the dataset. The annotations are in the branch vocal-only:
http://compmusic.upf.edu/node/233

(oracle on 4 recordings: see commit https://github.com/MTG/otmm_audio_score_alignment_dataset/commit/4eff4e9... )

FUTURE: annotate onsets on more recordings

poster:
https://drive.google.com/file/d/0B4bIMgQlCAuqNTlYYWZBNldsdTQ/view?usp=sh...

FUTURE: Concatenate the TextGrid annotation files of the segmented a cappella files for easier evaluation. Currently this is done for the first 6 recordings.
https://github.com/georgid/AlignmentDuration/issues/32