What is the difference between these two methods in C# using the speech API or SAPI?
using SpeechLib;
SpVoice speech = new SpVoice();
speech.Speak(text, SpeechVoiceSpeakFlags.SVSFlagsAsync);
returns the Acapela voices, and
using System.Speech.Synthesis;
SpeechSynthesizer ss = new SpeechSynthesizer();
ss.SpeakAsync("Hello, world");
does not work with the Acapela voices.
The first one returns all voices, but the second one returns only a few. Is this related to SAPI 5.1 versus SAPI 5.3?
The behavior is the same on Vista and XP: on both, SpVoice detects the Acapela voice, but SpeechSynthesizer does not.
I assume XP uses SAPI 5.1 and Vista uses SAPI 5.3, so why is the behavior the same on both operating systems but different between the two APIs?
Also, which API is more powerful, and what are the differences between the two?
SpeechLib is an interop DLL that uses the classic COM-based SAPI under the covers. System.Speech was developed by Microsoft to give managed code direct access to text-to-speech (and speech recognition).
In general, it's cleaner to stick with the managed library (System.Speech) when you're writing a managed application.
It's definitely not related to the SAPI version. The most likely problem is that a voice vendor (in this case Acapela) has to explicitly implement support for certain System.Speech features. It's possible that the Acapela voices you have support everything that is required, but it's also possible that they don't. Your best bet would be to ask the Acapela Group directly.
Voices are registered in HKLM\SOFTWARE\Microsoft\Speech\Voices\Tokens, and you should see the Windows built-in voices, as well as the Acapela voices that you have added, listed there. If you spot any obvious differences in how they're registered, you might be able to make the Acapela voices work by making their registration match that of, for example, MS-Anna.
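To compare registrations without clicking through regedit, something like the following might help. It is only a sketch: it assumes the tokens live under SOFTWARE\Microsoft\Speech\Voices\Tokens in HKLM (the usual SAPI 5 location) and that each token has an "Attributes" subkey; the class name ListVoiceTokens is made up for the example.

```csharp
using System;
using Microsoft.Win32;

class ListVoiceTokens
{
    static void Main()
    {
        // Assumed location of the SAPI 5 voice tokens.
        const string path = @"SOFTWARE\Microsoft\Speech\Voices\Tokens";

        using (RegistryKey tokens = Registry.LocalMachine.OpenSubKey(path))
        {
            if (tokens == null)
            {
                Console.WriteLine("No voice tokens key found at HKLM\\" + path);
                return;
            }

            foreach (string name in tokens.GetSubKeyNames())
            {
                using (RegistryKey token = tokens.OpenSubKey(name))
                {
                    if (token == null) continue;

                    // Print the token name, its default value and its CLSID.
                    Console.WriteLine(name);
                    Console.WriteLine("  (Default) = " + token.GetValue(""));
                    Console.WriteLine("  CLSID     = " + token.GetValue("CLSID"));

                    // Dump whatever attributes the vendor registered.
                    using (RegistryKey attrs = token.OpenSubKey("Attributes"))
                    {
                        if (attrs == null) continue;
                        foreach (string attr in attrs.GetValueNames())
                            Console.WriteLine("  " + attr + " = " + attrs.GetValue(attr));
                    }
                }
            }
        }
    }
}
```

Running it and comparing the output for an Acapela token against a built-in voice should show whether any values or attributes are missing from one of them.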
But I'd say the most likely possibility is that the Acapela voices have not been updated to support all of the interfaces required by System.Speech.
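If you want to see exactly which voices each API reports on a given machine, a side-by-side listing along these lines could be used. Treat it as a sketch: it assumes the project references both System.Speech.dll and the Microsoft Speech Object Library (SpeechLib) interop assembly, and the class name CompareVoices is just for illustration.

```csharp
using System;
using System.Speech.Synthesis;   // managed API (System.Speech.dll)
using SpeechLib;                 // COM interop for classic SAPI

class CompareVoices
{
    static void Main()
    {
        // Voices as seen through the COM-based SAPI interop.
        Console.WriteLine("SpeechLib (SpVoice):");
        SpVoice comVoice = new SpVoice();
        ISpeechObjectTokens tokens = comVoice.GetVoices("", "");
        for (int i = 0; i < tokens.Count; i++)
            Console.WriteLine("  " + tokens.Item(i).GetDescription(0));

        // Voices as seen through the managed System.Speech API.
        Console.WriteLine("System.Speech (SpeechSynthesizer):");
        using (SpeechSynthesizer synth = new SpeechSynthesizer())
        {
            foreach (InstalledVoice v in synth.GetInstalledVoices())
                Console.WriteLine("  " + v.VoiceInfo.Name + " (enabled: " + v.Enabled + ")");
        }
    }
}
```

Any voice that appears in the first list but not the second is one that System.Speech is filtering out or cannot load, which is the case worth raising with the vendor.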
SpeechLib is an interop DLL, so it maps to whatever version of SAPI it was generated for (you can check its properties).
System.Speech.* is the "official" support for speech in the .NET framework. SpeechSynthesizer chooses which speech library to use at runtime (much like the System.Web.Mail classes did).
I'm not sure why they return a different number of voices, but it is likely related to the SAPI version being used.