Note: This bibliographic page is archived and will no longer be updated.
For an up-to-date list of publications from the Music Technology Group, see the Publications list.
Properly Using Speech Synthesis and Voice Transformation for Audiovisual Content Generation
Title | Properly Using Speech Synthesis and Voice Transformation for Audiovisual Content Generation |
Publication Type | Conference Paper |
Year of Publication | 2009 |
Conference Name | International Broadcasting Conference (IBC2009) |
Authors | Monzo, C., Formiga, L., Adell, J., Mayor, O., Bonada, J., Janer, J., & Iriondo, I. |
Conference Start Date | 10/09/2009 |
Publisher | IBC |
Conference Location | Amsterdam, The Netherlands |
Abstract | During the creation process, scriptwriters may want to quickly preview the result of what they are creating. Text-to-Speech (TTS) systems make it possible to deliver speech in a short amount of time. In addition, information may be generated dynamically by intelligent systems, in which case TTS is crucial for delivering it as speech. The main drawback of using TTS in audiovisual productions is that commercial systems offer only a few different voices, whereas productions need a different voice for each character involved. Voice Transformation (VT) techniques can be used to overcome this limitation, allowing the user to personalize the voice for each character. In this paper, we briefly explain the technologies involved in TTS and VT systems and how they can be combined. Finally, we present a study of the most efficient way to combine them: either convert the synthesized speech, or generate a new synthetic voice by converting the original speech database used in the TTS system. |