SpeechSynthesizer .NET control pitch

Posted 2020-07-03 09:09

Question:

I'm trying to change the pitch of spoken text via SSML and the .NET SpeechSynthesizer (System.Speech.Synthesis):

SpeechSynthesizer synthesizer = new SpeechSynthesizer();
PromptBuilder builder = new PromptBuilder();
builder.AppendSsml(@"C:\Users\me\Documents\ssml1.xml");
synthesizer.Speak(builder);

The content of the ssml1.xml file is:

<?xml version="1.0" encoding="ISO-8859-1"?>
<ssml:speak version="1.0"
xmlns:ssml="http://www.w3.org/2001/10/synthesis"
xml:lang="en-US">
<ssml:sentence>
Your order for <ssml:prosody pitch="+30%" rate="-90%" >8 books</ssml:prosody>
will be shipped tomorrow.
</ssml:sentence>
</ssml:speak>

The rate is recognized: "8 books" is spoken much more slowly than the rest. But no matter what value I set for "pitch", it makes no difference! The allowed values can be found here:

http://www.w3.org/TR/speech-synthesis/#S3.2.4

Am I missing something, or is changing the pitch just not supported by the Microsoft Speech engine?

fritz

Answer 1:

The SsmlParser used by System.Speech accepts a pitch attribute in its ProcessProsody method, but it does not process it.

It only processes the range, rate, volume and duration attributes. It also parses contour, but treats it as range (not sure why)...

Edit: if you don't really need to read the text from an SSML XML file, you can build the prompt programmatically.
Instead of

builder.AppendSsml(@"C:\Users\me\Documents\ssml1.xml");

use

// Requires: using System.Globalization; (for CultureInfo)
builder.Culture = CultureInfo.CreateSpecificCulture("en-US");
builder.StartVoice(builder.Culture);
builder.StartSentence();

builder.AppendText("Your order for ");

builder.StartStyle(new PromptStyle() { Emphasis = PromptEmphasis.Strong, Rate = PromptRate.ExtraSlow });
builder.AppendText("8 books");
builder.EndStyle();

builder.AppendText(" will be shipped tomorrow.");

builder.EndSentence();
builder.EndVoice();
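
For reference, the snippet above can be put into a complete program along the following lines. This is only a sketch: it assumes a Windows machine with a reference to System.Speech, and the class name PromptDemo and the method BuildPrompt are made up for illustration. It also prints the SSML that the PromptBuilder generates (via ToXml), which makes it easy to check which prosody attributes actually end up in the markup. Note that, like the snippet above, it changes emphasis and rate only; it does not change the pitch.

```csharp
using System;
using System.Globalization;
using System.Speech.Synthesis;

static class PromptDemo
{
    // Hypothetical helper: builds the same prompt as the snippet above,
    // without reading an SSML file from disk.
    public static PromptBuilder BuildPrompt()
    {
        var builder = new PromptBuilder();
        builder.Culture = CultureInfo.CreateSpecificCulture("en-US");
        builder.StartVoice(builder.Culture);
        builder.StartSentence();

        builder.AppendText("Your order for ");

        // Emphasis and rate are applied; PromptStyle has no pitch setting.
        builder.StartStyle(new PromptStyle { Emphasis = PromptEmphasis.Strong, Rate = PromptRate.ExtraSlow });
        builder.AppendText("8 books");
        builder.EndStyle();

        builder.AppendText(" will be shipped tomorrow.");

        builder.EndSentence();
        builder.EndVoice();
        return builder;
    }

    static void Main()
    {
        PromptBuilder builder = BuildPrompt();

        // Print the SSML that will be handed to the engine.
        Console.WriteLine(builder.ToXml());

        // SpeechSynthesizer is IDisposable, so release it when done.
        using (var synthesizer = new SpeechSynthesizer())
        {
            synthesizer.SetOutputToDefaultAudioDevice();
            synthesizer.Speak(builder);
        }
    }
}
```

Keeping the prompt construction in its own method means the generated SSML can be inspected without an audio device.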