Demo: Robust TTS Duration Modelling with DNNs

This companion page to paper [1] presents some randomly selected audio examples from our listening test. The stimuli illustrate the effects of conventional versus robust DNN-based duration prediction from found audiobook data (“Emma” by Jane Austen). Please read the paper for more information, including descriptions of the different systems and their properties.

Note: If you have trouble hearing the audio, please allow some time for the audio data (3.8 MiB) to load. If playback still does not work, please try another web browser.

Audio examples

System: VOC, FRC, BOT, MSE, MLE1, MLE3, B75, B50
Prompt ID: 184, 192, 198, 207

References

[1] G. E. Henter, S. Ronanki, O. Watts, M. Wester, Z. Wu, and S. King, “Robust TTS duration modelling using DNNs,” Proc. ICASSP, 2016, pp. 5130–5134.