Pitch Contour Segmentation for Computer-aided Jingju Singing Training

Title: Pitch Contour Segmentation for Computer-aided Jingju Singing Training
Publication Type: Conference Paper
Year of Publication: 2016
Conference Name: 13th Sound and Music Computing Conference (SMC 2016)
Authors: Gong, R., Yang, Y., & Serra, X.
Conference Start Date: 31/08/2016
Conference Location: Hamburg, Germany
Keywords: Beijing opera, jingju, pitch contour, segmentation, singing training
Abstract: Imitation has been the main approach to jingju (also known as Beijing opera) singing training throughout its nearly 200-year lineage: students learn to sing by receiving auditory and gestural feedback cues. The aim of computer-aided training is to reveal a student's intonation problems visually by representing the pitch contour at the segment level. In this paper, we propose a technique for this purpose. The pitch contour of each musical note is segmented automatically by a melodic transcription algorithm that incorporates a genre-specific musicological model of jingju singing: bigram note transition probabilities, which define the probability of a transition from one note to another. A finer segmentation, which accounts for the high variability of steady segments in the jingju context, enables us to analyze subtle details of intonation by subdividing each note's pitch contour into a chain of three basic vocal expression segments: steady, transitory and vibrato. The evaluation suggests that this technique outperforms state-of-the-art methods for jingju singing. A web prototype implementing these techniques offers great potential for both in-class learning and self-learning.
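The finer segmentation described in the abstract labels each sub-segment of a note's pitch contour as steady, transitory or vibrato. A minimal sketch of one way such a classifier could work is shown below; the threshold values and the FFT-based vibrato-rate detection are illustrative assumptions, not the paper's actual method (which uses an optimized StdCdLe threshold):

```python
import numpy as np

def classify_segment(pitch_cents, frame_rate=100.0,
                     std_thresh=30.0, vib_rate=(4.0, 9.0)):
    """Label a pitch sub-segment as 'steady', 'vibrato' or 'transitory'.

    pitch_cents: 1-D array of pitch values in cents for one sub-segment.
    frame_rate:  pitch frames per second.
    All thresholds here are hypothetical placeholders.
    """
    segment = np.asarray(pitch_cents, dtype=float)
    detrended = segment - segment.mean()
    # A near-flat contour is labeled steady.
    if detrended.std() < std_thresh:
        return "steady"
    # Find the dominant modulation rate of the contour via the FFT.
    spectrum = np.abs(np.fft.rfft(detrended))
    freqs = np.fft.rfftfreq(len(detrended), d=1.0 / frame_rate)
    peak = freqs[1:][np.argmax(spectrum[1:])] if len(freqs) > 1 else 0.0
    # Modulation in the typical vibrato-rate range counts as vibrato;
    # everything else (e.g. a pitch glide between notes) as transitory.
    if vib_rate[0] <= peak <= vib_rate[1]:
        return "vibrato"
    return "transitory"
```

For example, a constant contour is labeled "steady", a 6 Hz sinusoidal modulation of 100 cents depth is labeled "vibrato", and a monotonic 400-cent glide is labeled "transitory".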
Additional material: 


The code used in this paper is in the "code" folder, where you can find:

  1. melodic_transcription: code for estimating the bigram note transition probabilities from the jingju singing scores (dataset) and for evaluating the performance of the melodic transcription.
  2. pitch_contour_segmentation: Python code for pitch contour segmentation without the "preliminary segmentation" step (which is already performed by pyin_noteTransition).
  3. pyin_noteTransition: the modified pYIN algorithm that incorporates the jingju bigram note transition probabilities, plus the binary for Mac OS X.
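The bigram note transition probabilities estimated by melodic_transcription are maximum-likelihood estimates of how often one note follows another in the score corpus. A minimal sketch of this estimation (function and variable names are illustrative, not the repository's actual API):

```python
from collections import Counter, defaultdict

def bigram_transition_probs(note_sequences):
    """Estimate P(next_note | current_note) from score note sequences.

    note_sequences: iterable of note lists (e.g. MIDI numbers), one per score.
    Returns a dict mapping each note to a dict of next-note probabilities.
    A plain maximum-likelihood sketch; the paper plugs such probabilities
    into the pYIN note-tracking model.
    """
    counts = defaultdict(Counter)
    for seq in note_sequences:
        # Count every adjacent note pair in the sequence.
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    probs = {}
    for cur, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[cur] = {n: c / total for n, c in nxt_counts.items()}
    return probs
```

For instance, `bigram_transition_probs([[60, 62, 60, 62, 64]])` yields probability 1.0 for the transition 60→62 and 0.5 each for 62→60 and 62→64.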

The steps to reproduce the results are: 1) install the pyinBOBigram Vamp plugin from pyin_noteTransition, 2) evaluate the performance of the melodic transcription, 3) evaluate the pitch contour segmentation.


The dataset used in this paper is in the "dataset" folder. The a cappella singing audio recordings are not included in this folder due to their large size; please contact the paper's authors to request them (rong [dot] gong [at] upf [dot] edu). In the folder you can find:

  1. groundtruth: the ground truth annotations for

    • melodic transcription (male_12_pos_1 missing),
    • parameter optimization,
    • evaluating the StdCdLe thresholding and the overall segmentation performance.
  2. jingju singing scores in .xml format, used for estimating the bigram note transition probabilities.

Supplementary Information

In the folder "supplementary_information", you can find:

the complete grid search results for optimizing the "StdCdLe threshold" and other parameters.
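A grid search of this kind simply evaluates every candidate parameter value against the ground truth and keeps the best one. A generic sketch, where `score_fn` is a hypothetical stand-in for the segmentation evaluation metric used in the paper:

```python
def grid_search_threshold(candidates, score_fn):
    """Return the (threshold, score) pair with the highest score.

    candidates: iterable of threshold values to try.
    score_fn:   callable mapping a threshold to an evaluation score
                (e.g. segmentation performance against ground truth);
                a placeholder here, not the repository's actual function.
    """
    return max(((t, score_fn(t)) for t in candidates),
               key=lambda pair: pair[1])
```

With several parameters, the same idea extends to nested loops over all candidate combinations, which is what the complete grid search results in this folder tabulate.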