Improving accompanied Flamenco singing voice transcription by combining vocal detection and predominant melody extraction.

Publication Type: Conference Paper
Year of Publication: 2014
Conference Name: International Computer Music Conference / Sound and Music Computing Conference
Authors: Kroher, N., & Gómez, E.
Conference Start Date: 14/09/2014
Conference Location: Athens, Greece
Abstract: While recent approaches to automatic voice melody transcription of accompanied flamenco singing give promising results regarding pitch accuracy, mistakenly transcribed guitar sections represent a major limitation for overall precision. With the aim of reducing the number of false positives in voicing detection, we propose a fundamental frequency contour estimation method which extends pitch-salience-based predominant melody extraction [3] with a vocal detection classifier based on timbre and pitch contour characteristics. Pitch contour segments estimated by the predominant melody extraction algorithm that contain a high percentage of frames classified as non-vocal are rejected. After estimating the tuning frequency, the remaining pitch contour is segmented into single note events in an iterative approach. The resulting symbolic representations are evaluated against manually corrected transcriptions on a frame-by-frame level. For two small flamenco datasets covering a variety of singers and audio qualities, we observe a significant reduction of the voicing false alarm rate and an improved voicing F-Measure, as well as an increased overall transcription accuracy. We furthermore demonstrate the advantage of a vocal detection model trained on genre-specific material. The presented case study is limited to the transcription of flamenco singing, but the general framework can be extended to other styles with genre-specific instrumentation.
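The segment-rejection step described in the abstract (discarding contour segments in which a high share of frames is classified as non-vocal) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the segment representation, the frame-level vocal mask, and the 0.5 rejection threshold are all assumptions.

```python
import numpy as np

def reject_nonvocal_segments(contour_segments, vocal_frame_mask,
                             max_nonvocal_ratio=0.5):
    """Keep only pitch contour segments whose frames are mostly vocal.

    contour_segments: list of (start_frame, end_frame) index pairs, as
        a predominant-melody extractor might produce (hypothetical format).
    vocal_frame_mask: boolean array, True where a frame was classified
        as vocal by the timbre/contour classifier.
    max_nonvocal_ratio: illustrative threshold, not the paper's value.
    """
    kept = []
    for start, end in contour_segments:
        frames = vocal_frame_mask[start:end]
        # Fraction of frames the classifier labelled non-vocal.
        nonvocal_ratio = 1.0 - frames.mean() if len(frames) else 1.0
        if nonvocal_ratio <= max_nonvocal_ratio:
            kept.append((start, end))
    return kept
```

For example, with a mask that is vocal for the first half of the frames, a segment lying entirely in the second half would be rejected while segments with at least half of their frames vocal are retained.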