Pitch Estimation of Choir Music using Deep Learning Strategies: from Solo to Unison Recordings

Cuesta, H.

Note: This bibliographic page is archived and will no longer be updated. For an up-to-date list of publications from the Music Technology Group, see the Publications list.

Title	Pitch Estimation of Choir Music using Deep Learning Strategies: from Solo to Unison Recordings
Publication Type	Master Thesis
Year of Publication	2017
Authors	Cuesta, H.
Abstract	The goals of this thesis are to create new datasets for studying aspects of choir singing, focusing on unison performances, and to investigate data-driven methods for the automatic pitch estimation of a cappella choir singing performances. Choral music is polyphonic and involves multiple singers, typically grouped into four main voices (soprano, alto, tenor and bass). The task of multi-pitch estimation is challenging due to the variety of acoustic scenarios (from solo singers to big choirs) and the lack of annotated datasets for training and evaluation, especially for the polyphonic case. In particular, we focus on building models of pitch from monophonic and unison recordings. To do so, we first build a dataset of choir singing that contains different types of performances: solo singers, unison, and four-part choir. Then, we train several deep learning architectures to extract pitch information from monophonic singing voice signals, and afterwards adapt them to model unison performances. The models for monophonic pitch estimation perform on par with state-of-the-art methods, and in some cases outperform them, especially in the mid-frequency range. The model for unison choir is capable of predicting the average pitch of a unison performance and its dispersion with an average accuracy of 70%, although its accuracy and generalization capabilities are limited by the size of the dataset. The presented models provide a first step towards the automatic transcription of choir singing recordings, and the unison model is a useful resource for choir singing synthesis.
Final publication	https://doi.org/10.5281/zenodo.1108524