Restricting speech recognition results on Android

Posted 2019-01-20 03:46

Question:

I'm making an app that allows people to speak and select between a few options (Strings). I'm having a little problem making the Android Speech Recognizer fit my idea.

Is there a way to pass the SpeechRecognizer only the options that are "valid" and have it select the "best" match among them?

I don't need the code, I just need some guidance as my google-fu seems to be failing me today.

Answer 1:

Our solution to this problem is described at http://kaljurand.github.io/Grammars/, e.g. check out the paper linked from this page:

Kaarel Kaljurand, Tanel Alumäe. Controlled Natural Language in Speech Recognition Based User Interfaces (CNL 2012)

The basic idea is:

  1. don't use Google's speech recognizer because you cannot (currently) pass the language model (e.g. a grammar) to it (in our case it also didn't support the input language that we wanted to use);
  2. so you need to implement your own speech recognizer (e.g. based on Sphinx) and make it accept grammars as part of the input;
  3. implement the grammar. If it's a simple list of acceptable phrases, then JSGF will do as the grammar description language. For more complex grammars I recommend Grammatical Framework (which you can automatically compile to JSGF or finite-state automata);
  4. implement an Android app that extends the RecognizerIntent API by adding a way to pass the grammar to the recognizer. You can base it e.g. on Kõnele.
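
For the "simple list of acceptable phrases" case in step 3, the grammar can be generated from the app's option list. The following is only a minimal sketch in plain Java: the class `JsgfBuilder`, its method and the grammar/rule names are hypothetical and come from neither Kõnele nor Sphinx, and it assumes the phrases are plain words with no JSGF special characters.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Hypothetical helper: builds a JSGF grammar accepting exactly one phrase from a list. */
final class JsgfBuilder {
    private JsgfBuilder() {}

    // grammarName and ruleName are identifiers chosen by the caller.
    // Assumption: phrases contain only plain words (nothing that needs JSGF quoting).
    static String toJsgf(String grammarName, String ruleName, List<String> phrases) {
        if (phrases.isEmpty()) {
            throw new IllegalArgumentException("at least one phrase is required");
        }
        // Alternatives inside a JSGF rule are separated by '|'.
        String alternatives = phrases.stream()
                .map(p -> p.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.joining(" | "));
        return "#JSGF V1.0;\n"
                + "grammar " + grammarName + ";\n"
                + "public <" + ruleName + "> = " + alternatives + ";\n";
    }
}
```

Calling `JsgfBuilder.toJsgf("options", "option", Arrays.asList("Red", "Green", "Blue"))` yields a three-line grammar whose public rule is `<option> = red | green | blue;`. How that string is then handed to the recognizer depends on the recognizer you build in step 2.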

All this might be overkill in your case. Post-processing of Google's results (as @gregm suggests) is certainly easier to implement. But if you want to scale to more complex and/or multilingual language models then our approach certainly provides the required modularity and expressive power.



Answer 2:

No, there are no such parameters; Google's speech recognition is not flexible enough. You can use an external speech recognition toolkit like CMUSphinx.



Answer 3:

No, you cannot pass parameters that restrict the recognition or help it make the best match. You have to implement that yourself.

What you want to do is use some algorithms to help you match what Android's Speech recognizer returns with your desired options. This is especially important when your app has to recognize words that Android's recognizer cannot recognize, like Cumin.
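
One simple way to do this kind of matching is to compare every hypothesis the recognizer returns against every valid option by edit distance and keep the closest one. The sketch below is plain Java and only illustrates the idea; `OptionMatcher`, `bestMatch` and the `maxDistance` threshold are hypothetical names and choices, not part of any Android API. On Android, the list of hypotheses would typically come from `RecognizerIntent.EXTRA_RESULTS` or `SpeechRecognizer.RESULTS_RECOGNITION`.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: picks the valid option closest to any recognizer hypothesis. */
final class OptionMatcher {
    private OptionMatcher() {}

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the option with the smallest distance to any hypothesis,
     * or null when even the closest option is further away than maxDistance.
     */
    static String bestMatch(List<String> hypotheses, List<String> options, int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String hypothesis : hypotheses) {
            String h = hypothesis.trim().toLowerCase(Locale.ROOT);
            for (String option : options) {
                int d = editDistance(h, option.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = option;
                }
            }
        }
        return bestDistance <= maxDistance ? best : null;
    }
}
```

For example, with the hypotheses `"cuban"`, `"coming"` and `"human"`, the options `"cumin"`, `"pepper"` and `"salt"`, and a `maxDistance` of 2, `bestMatch` returns `"cumin"`. A suitable threshold depends on how long and how similar your options are, so treat the value 2 as a placeholder.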

For this you can use phonetic matching algorithms, such as Soundex or Metaphone.
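
As an illustration of phonetic matching, here is a minimal sketch of American Soundex in plain Java. Soundex is used only as a well-known example and is not necessarily the algorithm the answer had in mind. The class `PhoneticMatcher` and its methods are hypothetical, and the code assumes English words written with the letters A to Z (other characters are dropped).

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: matches recognizer hypotheses to options by Soundex code. */
final class PhoneticMatcher {
    private PhoneticMatcher() {}

    // Soundex digit for each letter A..Z; '0' marks letters that are not coded.
    private static final String CODES = "01230120022455012623010202";

    /** Returns the 4-character Soundex code, or "" if the word has no letters A-Z. */
    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT).replaceAll("[^A-Z]", "");
        if (s.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder().append(s.charAt(0));
        char last = CODES.charAt(s.charAt(0) - 'A');
        for (int i = 1; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c == 'H' || c == 'W') {
                continue; // H and W do not separate letters with the same code
            }
            char code = CODES.charAt(c - 'A');
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code; // vowels reset the run, so a repeated code after a vowel is kept
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    /** Returns the first option that shares a Soundex code with any hypothesis, else null. */
    static String matchByCode(List<String> hypotheses, List<String> options) {
        for (String hypothesis : hypotheses) {
            String code = soundex(hypothesis);
            if (code.isEmpty()) {
                continue;
            }
            for (String option : options) {
                if (code.equals(soundex(option))) {
                    return option;
                }
            }
        }
        return null;
    }
}
```

Here `soundex("Robert")` and `soundex("Rupert")` both give `R163`, and a misrecognized `"cuman"` maps to the same code as `"cumin"` (`C550`). Soundex always keeps the first letter, so a hypothesis such as `"human"` would not match `"cumin"`; combining a phonetic code with a distance measure is one way to cover such cases.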

For some implementations and sample code on Android check out this open source project: GAST.