Improving Emotional TTS with an Emotion Intensity Input from Unsupervised Extraction

This is the demo page for the paper "Improving Emotional TTS with an Emotion Intensity Input from Unsupervised Extraction", submitted to SSW'21. It is currently for review purposes only.

Examples from the listening test

System	angry	sad	happy	fearful	surprised	happy	neutral
baseline
attention
transformer
rank
copy synth

UI of the listening test. 25 samples were randomly selected. Each sample was rated on a 5-point MOS scale and, at the same time, labeled with the emotion the listener perceived.
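As an illustration of how ratings collected this way could be aggregated, the following is a minimal sketch, not the evaluation code used for the paper. It assumes each rating is stored as a tuple of system name, intended emotion, MOS score (1 to 5) and perceived emotion. The function name `summarize` and all rating values are hypothetical and serve only to show the computation.

```python
from collections import defaultdict

# Hypothetical ratings: (system, intended_emotion, mos_score, perceived_emotion).
# These values are made up for illustration and are NOT results from the paper.
ratings = [
    ("baseline", "angry", 3, "angry"),
    ("baseline", "sad", 4, "neutral"),
    ("rank", "angry", 4, "angry"),
    ("rank", "sad", 5, "sad"),
]


def summarize(ratings):
    """Return {system: (mean MOS, emotion recognition accuracy)}."""
    scores = defaultdict(list)  # MOS scores per system
    hits = defaultdict(list)    # 1 if perceived emotion matches intended, else 0
    for system, intended, mos, perceived in ratings:
        if not 1 <= mos <= 5:
            raise ValueError(f"MOS score out of range: {mos}")
        scores[system].append(mos)
        hits[system].append(1 if perceived == intended else 0)
    return {
        s: (sum(scores[s]) / len(scores[s]), sum(hits[s]) / len(hits[s]))
        for s in scores
    }


summary = summarize(ratings)
for system, (mos, acc) in summary.items():
    print(f"{system}: MOS={mos:.2f}, emotion accuracy={acc:.2f}")
```

Collecting both judgments per sample means naturalness (MOS) and emotion recognition can be reported per system from the same set of responses.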