We are trying to do acoustic training, but we are unable to create the transcribed audio files. How do we create them? We are using GetTranscript and AppendTranscript, but we cannot get the ISpTranscript interface for the ISpStream if we open the stream in READWRITE mode. So how do you create the transcript WAV files?
hr = SPBindToFile(L"e:\\file1.wav", SPFM_OPEN_READONLY,
&cpStream);
hr = cpStream.QueryInterface(&cpTranscript);
// We get an E_NOINTERFACE error here if the stream is opened with SPFM_OPEN_READWRITE
hr = cpTranscript->AppendTranscript(sCorrectText);
hr = cpTranscript->GetTranscript(&pwszTranscript);
// GIVES CORRECT TRANSCRIPT
//READING THIS AGAIN ON NEXT EXECUTION TIME DOES NOT GIVE THE TRANSCRIPT
hr = SPBindToFile(L"e:\\file1.wav", SPFM_OPEN_READONLY,
&cpStream);
hr = cpStream.QueryInterface(&cpTranscript);
//THIS GIVES THE ERROR E_NOINTERFACE
After doing this, we need to add the file path to the registry. We are doing this with the following code.
CComPtr<ISpObjectToken> cpObjToken;
ULONG CSIDL_LOCAL_APPDATA = 28;
ULONG CSIDL_FLAG_CREATE = 32768;
GUID guid0;
LPWSTR FileName2;
hr = cpRecognizerBase->GetRecoProfile(&cpObjToken);
hr = CoCreateGuid(&guid0);
hr = cpObjToken->GetStorageFileName(guid0, L"Test", L"F:\\sample6.wav",CSIDL_FLAG_CREATE, &FileName2);
//this code runs fine but the file is never added to the registry
Any pointers will be appreciated. This question relates to the one asked here: Speech training files and registry locations
Thanks
The E_NOINTERFACE happens if the ISpStream has no contents. For example, the file was empty, or the call didn't succeed but still returned S_OK (it does this for some reason). So normally I would first check whether the stream actually has any contents. You can do this by checking its size. If it has a size of 0 or some absurdly large size, then obviously it hasn't returned a correct value. Bear in mind the returned value is a ULARGE_INTEGER.

SPBindToFile only works with SPFM_OPEN_READONLY and SPFM_CREATE_ALWAYS, so you will have to use one of those.

As for how to make the appended transcript save, it seems that you cannot save it directly if the WAV file already exists (or at least I don't know how). If the file doesn't exist yet, you can create a new ISpStream, and once you pass audio to it, for example from a voice or a microphone (there are plenty of examples on the web), you can append a transcript and it will stick. I include an example below.
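As a rough illustration of the size check described above, here is a minimal sketch in portable C++. The helper names (isPlausibleStreamSize, wavFileLooksUsable) and the 4 GiB upper bound are my own assumptions, not part of SAPI. It checks the file on disk before you bind to it; the same predicate could be applied to the QuadPart of the ULARGE_INTEGER size you get back from the stream.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

// Hypothetical helper: true when `size` looks like a real audio payload,
// i.e. non-zero and no larger than a caller-chosen upper bound.
bool isPlausibleStreamSize(std::uint64_t size, std::uint64_t maxBytes) {
    return size > 0 && size <= maxBytes;
}

// Hypothetical pre-check to run before SPBindToFile: reads the on-disk size
// of the WAV file and rejects missing, empty, or implausibly large files.
// The default bound of 4 GiB is an assumption (RIFF sizes are 32-bit).
bool wavFileLooksUsable(const std::filesystem::path& wav,
                        std::uint64_t maxBytes = 4ull * 1024 * 1024 * 1024) {
    std::error_code ec;
    const std::uintmax_t size = std::filesystem::file_size(wav, ec);
    if (ec) {
        return false;  // file is missing or its size cannot be read
    }
    return isPlausibleStreamSize(size, maxBytes);
}
```

If wavFileLooksUsable returns false for your file, querying the stream for ISpTranscript is unlikely to work, so it is probably worth fixing the file first.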
Appending a transcript onto a new file:
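The original example did not survive in this copy of the post, so what follows is only a sketch of the approach described above, not the original code. It assumes the audio comes from a SAPI text-to-speech voice (ISpVoice), that COM has already been initialized by the caller, and that you link against sapi.lib and ole32.lib. The function name writeWavWithTranscript is hypothetical. On non-Windows builds it simply returns false, since SAPI is Windows-only.

```cpp
#ifdef _WIN32
#include <atlbase.h>   // CComPtr
#include <sapi.h>      // ISpStream, ISpVoice, ISpTranscript
#include <sphelper.h>  // SPBindToFile, CSpStreamFormat
#endif

// Hypothetical helper: creates a brand-new WAV file at `path`, speaks `text`
// into it with a TTS voice, then appends `text` as the transcript.
// Assumes COM is already initialized. Returns true only if every step succeeds.
bool writeWavWithTranscript(const wchar_t* path, const wchar_t* text) {
#ifdef _WIN32
    CSpStreamFormat format;
    HRESULT hr = format.AssignFormat(SPSF_22kHz16BitMono);
    if (FAILED(hr)) return false;

    // SPFM_CREATE_ALWAYS: the file must be new for the transcript to stick.
    CComPtr<ISpStream> stream;
    hr = SPBindToFile(path, SPFM_CREATE_ALWAYS, &stream,
                      &format.FormatId(), format.WaveFormatExPtr());
    if (FAILED(hr)) return false;

    // Route the voice output into the file stream and synthesize the audio.
    CComPtr<ISpVoice> voice;
    hr = voice.CoCreateInstance(CLSID_SpVoice);
    if (FAILED(hr)) return false;
    hr = voice->SetOutput(stream, TRUE);
    if (FAILED(hr)) return false;
    hr = voice->Speak(text, SPF_DEFAULT, NULL);
    if (FAILED(hr)) return false;

    // Attach the transcript now that the stream has contents.
    CComPtr<ISpTranscript> transcript;
    hr = stream.QueryInterface(&transcript);
    if (FAILED(hr)) return false;
    hr = transcript->AppendTranscript(text);
    if (FAILED(hr)) return false;

    // Closing the stream flushes the file to disk.
    return SUCCEEDED(stream->Close());
#else
    (void)path;
    (void)text;
    return false;  // SAPI is not available on this platform
#endif
}
```

To confirm that the transcript was really saved, reopen the file with SPFM_OPEN_READONLY in a separate run and call GetTranscript on it. I have not verified this sketch against every SAPI version, so treat it as a starting point.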
Bill Hutchinson (one of the linked sources below) has some code that can be used to perform recognizer training without all the registry edits and so on. I have included it at the end of this post. He has a function (TrainOne) which trains the recognizer file by file, via a memory stream. You can pass preexisting WAVs to this: either WAVs with transcripts, or WAVs without transcripts (in which case you provide the transcript to the function at call time). Please take a look at it, as it is very informative.
Here is a collection of everything related to SAPI that I have found that will be useful for others trying to figure this mess out. I will also post my own complete SAPI training solution soon:
How to use the function GetStorageFileName for adding training files to registry?
Acoustic training using SAPI 5.3 Speech API
Training sapi : Creating transcripted wav files and adding file paths to registry
https://groups.google.com/forum/#!topic/microsoft.public.speech_tech.sdk/fTq-PJrVd_Q
https://documentation.help/SAPI-5/documentation.pdf
Sample training code:
Since Bill Hutchinson's SAPI code is one of the few reliable examples of how to use SAPI for training on the web, I have included his post from google below, in case it is one day deleted/lost: