Score-Informed Syllable Segmentation for Jingju a Cappella Singing Voice with Mel-Frequency Intensity Profiles

Rong Gong; Nicolas Obin; Georgi Dzhambazov; Xavier Serra

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	Score-Informed Syllable Segmentation for Jingju a Cappella Singing Voice with Mel-Frequency Intensity Profiles
Publication Type	Conference Paper
Year of Publication	2017
Conference Name	International Workshop on Folk Music Analysis
Authors	Gong, R., Obin, N., Dzhambazov, G., & Serra, X.
Pagination	107-113
Conference Start Date	14/06/2017
Conference Location	Malaga, Spain
Abstract	This paper introduces a new unsupervised, score-informed method for segmenting singing voice into syllables. The main idea is to detect syllable onsets on a probability density function by incorporating a priori syllable durations derived from the score. First, intensity profiles are used to exploit the characteristics of the singing voice in different Mel-frequency regions. Then, the syllable onset probability density function is obtained by selecting candidates over the intensity profiles and is weighted to emphasize the onset regions. Finally, the syllable duration distribution shaped by the score is incorporated into Viterbi decoding to determine the optimal sequence of onset time positions. The proposed method outperforms conventional syllable segmentation methods on a jingju (also known as Peking or Beijing opera) a cappella dataset. An analysis of the precision errors is conducted to provide directions for future improvement.
Preprint/Postprint Document	https://doi.org/10.5281/zenodo.556820
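
Illustrative sketch (not the authors' implementation): the final step described in the abstract, choosing onset positions by Viterbi decoding with a score-derived duration prior, could look roughly like the Python below. The Gaussian form of the duration prior, the `sigma` value, the function names, and the toy candidate values are all assumptions made for illustration; the record does not specify them.

```python
import math


def log_gauss(x, mu, sigma):
    """Log-density of a Gaussian; an ASSUMED form for the duration prior."""
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def decode_onsets(cand_times, cand_probs, expected_gaps, sigma=0.2):
    """Pick len(expected_gaps) + 1 onsets from time-sorted candidates.

    cand_times    -- candidate onset times in seconds, ascending
    cand_probs    -- onset probability of each candidate
    expected_gaps -- score-derived durations between consecutive onsets
    Returns the indices of the chosen candidates (hypothetical helper).
    """
    n_cand = len(cand_times)
    n_onsets = len(expected_gaps) + 1
    if n_cand < n_onsets:
        raise ValueError("need at least as many candidates as onsets")

    log_p = [math.log(max(p, 1e-12)) for p in cand_probs]
    neg_inf = float("-inf")
    # delta[k][j]: best log-score with the k-th onset placed at candidate j
    delta = [[neg_inf] * n_cand for _ in range(n_onsets)]
    back = [[-1] * n_cand for _ in range(n_onsets)]
    delta[0] = list(log_p)

    for k in range(1, n_onsets):
        for j in range(k, n_cand):
            for i in range(k - 1, j):
                gap = cand_times[j] - cand_times[i]
                score = delta[k - 1][i] + log_gauss(gap, expected_gaps[k - 1], sigma)
                if score > delta[k][j]:
                    delta[k][j] = score
                    back[k][j] = i
            delta[k][j] += log_p[j]

    # Backtrack from the best final candidate
    j = max(range(n_cand), key=lambda idx: delta[-1][idx])
    path = [j]
    for k in range(n_onsets - 1, 0, -1):
        j = back[k][j]
        path.append(j)
    return path[::-1]


# Toy data (made up): six candidates, three syllables of about 1 s each
times = [0.0, 0.3, 1.0, 1.1, 2.0, 2.6]
probs = [0.9, 0.8, 0.6, 0.5, 0.7, 0.4]
onsets = decode_onsets(times, probs, expected_gaps=[1.0, 1.0])
print([times[i] for i in onsets])
```

In this sketch the duration prior penalizes candidate pairs whose spacing departs from the score, so a strong but badly timed candidate (here the one at 0.3 s) is passed over in favor of a sequence that matches the expected durations.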

Additional material:

Jingju a cappella singing dataset: http://doi.org/10.5281/zenodo.345490

Presentation slides: https://doi.org/10.5281/zenodo.556820