How to increase Google's Speech Recognition ac

Posted 2019-07-21 19:57

We give this image to our users:

[image]

This picture shows separate numbers, and all of our users read it into their microphones as "11-0-9-5".

We use Google Speech Engine, and it returns this result:

"1109 5".

This makes it impossible for us to compare the spoken words with the expected result, and we're stuck at this stage.

Is there a way to tell Google's Speech Recognition to interpret spoken numbers literally and separately, and not join them together?

1 answer
我想做一个坏孩纸
#2 · 2019-07-21 20:26

You can try using a speech context to constrain the Google Speech Engine to a predefined set of numbers: https://cloud.google.com/speech-to-text/docs/reference/rest/v1/RecognitionConfig#SpeechContext

So if you specify 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 as the possible phrases, Google should not send back 1109, since it is not in the context.
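As a rough illustration of the idea above, here is a minimal Python sketch that builds the JSON body for the REST `speech:recognize` call with a `speechContexts` entry listing those numbers. The helper name `build_recognize_request`, the `en-US` language code, and the placeholder audio bytes are assumptions made for the example; `encoding` and `sampleRateHertz` are left out because they depend on your audio. Keep in mind that the phrases are hints that bias recognition, so the result is not guaranteed.

```python
import base64
import json

# REST endpoint for synchronous recognition (v1).
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, phrases, language_code="en-US"):
    """Hypothetical helper: build the JSON body for speech:recognize.

    `phrases` go into config.speechContexts[].phrases, which hints the
    recognizer toward those words. Add `encoding` and `sampleRateHertz`
    to the config if your audio format requires them.
    """
    return {
        "config": {
            "languageCode": language_code,  # assumed language
            "speechContexts": [{"phrases": list(phrases)}],
        },
        "audio": {
            # Inline audio must be base64-encoded in the JSON body.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


# The numbers 0..11 from the answer, as strings.
number_phrases = [str(n) for n in range(12)]

# Placeholder bytes stand in for real recorded audio.
body = build_recognize_request(b"\x00\x01", number_phrases)
print(json.dumps(body["config"], indent=2))
```

The resulting body would be sent as an authenticated POST to `RECOGNIZE_URL`; authentication is omitted here.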

However, with this method you have to list all possible values, which can be tedious, and some cases won't be solved, for example if someone pronounces 11 as 1-1.
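A different workaround, not part of the speech context approach, is to stop caring how the digits are grouped: strip everything except digits from both the transcript and the expected value, then compare. The sketch below assumes the transcript comes back as digits (as in "1109 5") rather than spelled-out words; the function names are made up for the example. It also covers the case above where 11 is spoken as 1-1.

```python
import re


def digits_only(text):
    """Keep only the characters 0-9, dropping spaces, hyphens, and so on."""
    return re.sub(r"[^0-9]", "", text)


def same_digits(transcript, expected):
    """Compare two strings by their digit sequence, ignoring grouping."""
    return digits_only(transcript) == digits_only(expected)


# The transcript from the question matches the expected reading.
print(same_digits("1109 5", "11-0-9-5"))
# A reading of 11 as "1 1" still matches.
print(same_digits("1 1 0 9 5", "11-0-9-5"))
```

The trade-off is that this cannot tell "11-0" apart from "1-10", so it only fits when the digit sequence alone is what needs checking.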
