Singing Phoneme Class Detection In Polyphonic Music Recordings

Vagia, O.

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	Singing Phoneme Class Detection In Polyphonic Music Recordings
Publication Type	Master Thesis
Year of Publication	2008
Authors	Vagia, O.
preprint/postprint document	files/publications/Ourania-Vaggia-Master-Thesis.pdf
Abstract	Automatic singing detection and singing phoneme recognition are two MIR research topics that have gained a lot of attention in recent years. The first approaches borrowed successful techniques widely used in Automatic Speech Recognition (ASR), as speech and singing share similar acoustical features since they are produced by the same apparatus. Moving from monophonic to polyphonic audio signals, the problem becomes more complex, as the background instrumental accompaniment is regarded as a noise source that has to be attenuated. This thesis presents research into the problem of singing phoneme detection in polyphonic audio in which the lyrics are in English. Specifically, we are interested in building statistical classification models that are able to automatically distinguish sung consonants and vowels from pure instrumental music in polyphonic music recordings. The approach begins with the creation of a database used for training, testing and evaluating the models. Several sets of extracted low-level features are used in the classification process. Different classifiers, such as SVM, MLP and logistic regression, are compared, as well as different classification schemes (3-class classifiers, binary classifiers in series and in parallel). The best classification model found reaches an overall accuracy of 78% in distinguishing between the 3 different classes.
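
The abstract names two ways of combining binary classifiers into a 3-class decision (in series and in parallel) without detailing them. The sketch below is one plausible reading, not the thesis's implementation: "in series" is taken to mean a cascade (vocal vs. instrumental first, then vowel vs. consonant), and "in parallel" is taken to mean one scorer per class with the highest score winning. All function names, the two toy features, and the thresholds are hypothetical stand-ins for trained models.

```python
from typing import Callable, Dict, Sequence

Features = Sequence[float]


def classify_in_series(
    x: Features,
    is_vocal: Callable[[Features], bool],
    is_vowel: Callable[[Features], bool],
) -> str:
    """Cascade: decide vocal vs. instrumental, then vowel vs. consonant."""
    if not is_vocal(x):
        return "instrumental"
    return "vowel" if is_vowel(x) else "consonant"


def classify_in_parallel(
    x: Features,
    scorers: Dict[str, Callable[[Features], float]],
) -> str:
    """Run one scorer per class on x and return the class with the top score."""
    return max(scorers, key=lambda label: scorers[label](x))


# Toy stand-ins for trained binary models (assumptions for illustration only).
# x[0] plays the role of a "voicing" feature, x[1] of a "vowel-likeness" feature.
def toy_is_vocal(x: Features) -> bool:
    return x[0] > 0.5


def toy_is_vowel(x: Features) -> bool:
    return x[1] > 0.5


toy_scorers: Dict[str, Callable[[Features], float]] = {
    "instrumental": lambda x: 1.0 - x[0],
    "consonant": lambda x: x[0] * (1.0 - x[1]),
    "vowel": lambda x: x[0] * x[1],
}

if __name__ == "__main__":
    frame = [0.9, 0.1]  # hypothetical feature vector for one audio frame
    print(classify_in_series(frame, toy_is_vocal, toy_is_vowel))
    print(classify_in_parallel(frame, toy_scorers))
```

In a cascade, an error in the first stage cannot be corrected by the second, whereas the parallel scheme lets every class compete on each frame; which behaves better on real recordings is an empirical question of the kind the thesis evaluates.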