In my speech engine I activate/deactivate multiple grammars.
At one particular step I'd like to run a grammar whose ONLY purpose is to capture the audio of the next sentence the user speaks, based on the engine's speech-detection properties.
But to start/stop matching, I assume the engine needs "words", so how can I do this?
(Underlying explanation: my application converts all unrecognized audio to text using the Google Speech API, because dictation is too poor and not available on Kinect.)
Well, actually, no, the SR engine only needs to know that the incoming audio is "speech-like" (usually determined by the spectral characteristics of the audio). In particular, you could use the AudioPosition property and the SpeechDetected and SpeechRecognitionRejected events to send all rejected audio to the Google Speech API.
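For example, here's a minimal sketch of wiring those up around a small command grammar (shown with System.Speech.Recognition; the Kinect SDK's Microsoft.Speech namespace exposes a near-identical API, but you'd feed it the Kinect audio stream instead of the default device). The phrases and console output are just placeholders:

```csharp
using System;
using System.Speech.Recognition;

class SpeechSetup
{
    static void Main()
    {
        var engine = new SpeechRecognitionEngine();

        // For Kinect you'd use SetInputToAudioStream with the sensor's audio
        // source instead of the default device; this keeps the sketch simple.
        engine.SetInputToDefaultAudioDevice();

        // A small command grammar; the engine still raises SpeechDetected and
        // SpeechRecognitionRejected for speech-like audio that doesn't match it.
        var commands = new Grammar(new GrammarBuilder(new Choices("yes", "no", "stop")));
        engine.LoadGrammar(commands);

        engine.SpeechDetected += (s, e) =>
            Console.WriteLine("Speech detected at {0}", e.AudioPosition);

        engine.SpeechRecognitionRejected += (s, e) =>
            Console.WriteLine("Rejected utterance (candidate for the Google Speech API)");

        engine.SpeechRecognized += (s, e) =>
            Console.WriteLine("Recognized: {0}", e.Result.Text);

        engine.RecognizeAsync(RecognizeMode.Multiple);
        Console.ReadLine();
    }
}
```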
So your workflow would look like this:
- Ask question of user.
- Enable appropriate grammars.
- Wait for recognition or recognition rejected.
- If recognition, process accordingly.
- If recognition rejected, collect the retained audio and send it to the Google Speech API (see the sketch below).
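A sketch of that last step, assuming the engine retains audio for rejected results (exposed as e.Result.Audio), with a hypothetical SendToGoogleSpeechApi helper standing in for your existing upload code:

```csharp
using System.IO;
using System.Speech.Recognition;

class RejectedAudioForwarder
{
    // Hook up with: engine.SpeechRecognitionRejected += OnSpeechRecognitionRejected;
    public void OnSpeechRecognitionRejected(object sender, SpeechRecognitionRejectedEventArgs e)
    {
        // Retained audio for the rejected utterance; may be null if the
        // engine isn't retaining audio, so guard against that.
        RecognizedAudio audio = e.Result.Audio;
        if (audio == null)
            return;

        using (var wav = new MemoryStream())
        {
            audio.WriteToWaveStream(wav);          // writes a complete WAV container
            SendToGoogleSpeechApi(wav.ToArray());  // hypothetical helper, see below
        }
    }

    // Placeholder for your existing Google Speech API upload code.
    private void SendToGoogleSpeechApi(byte[] wavBytes)
    {
        // e.g. POST wavBytes to your transcription endpoint
    }
}
```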