
Stop pocketsphinx recognizer for voice feedback

Posted 2019-06-01 20:35

Question:

This is a continuation of "Run pocketSphinx and Google TTS together", for the same project. I have already revised the code following the guidance from Nikolay Shymyrev, which helped a lot, but the final feature I want to implement is still missing. Google TTS runs fine now, but the recognizer has a problem.

The recognizer won't start if Google TTS speaks a fairly long sentence, such as:

speaker.speak("Drive mode now will be enabled. I will read your new messages for you now.");

and then the if condition in my onPartialResult is never fulfilled:

if (text.equals("exit")) {
        speaker.speak("yeah get in");
        recognizer.cancel();
        Intent i = new Intent(getApplicationContext(),PocketSphinxActivity.class);
        startActivity(i);

I think the recognizer is always listening, since it runs in the background, so it picks up the Google TTS sentence and then fails to recognize my speech afterwards. When I use a hands-free headset with a mic and the sentence passed to speaker.speak is just "Drive mode enabled", it recognizes my next word well and executes the if condition above when I say "exit". But when the sentence is quite long, like "Drive mode now will be enabled, I will read bla bla bla", it does not pick up my "exit".

What I want to do now is add a timeout to the recognizer, so that it pauses for a few moments and does not recognize any unnecessary sound. I want to write

startRecognition("search", timeout)

but Eclipse won't let me do that; it gives me an error. I'm using PocketSphinx for Android 5 pre-alpha.

Here again is my code, built just to test and make sure it recognizes the word "exit":

SMSReaderMain.java

public class SMSReaderMain extends Activity implements RecognitionListener {    

    private final int CHECK_CODE = 0x1; 
    private final int LONG_DURATION = 5000;
    private final int SHORT_DURATION = 1200;

    private Speaker speaker;


    private TextView smsText;
    private TextView smsSender; 
    private BroadcastReceiver smsReceiver;


    public static final String TURNON_SR = "drive mode";
    public static final String TURNOFF_SR = "ok";
    public static final String DESTROY_SR = "exit";


    public SpeechRecognizer recognizer;
    public HashMap<String, Integer> captions;




    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        setContentView(R.layout.main_sms);  
        captions = new HashMap<String, Integer>();


        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                try {
                    Assets assets = new Assets(SMSReaderMain.this);
                    File assetDir = assets.syncAssets();
                    setupRecognizer(assetDir);
                } catch (IOException e) {
                    return e;
                }
                return null;
            }   
            @Override
            protected void onPostExecute(Exception result) {
                if (result != null) {
                    ((TextView) findViewById(R.id.caption_text))
                            .setText("Failed to init recognizer " + result);
                } else {
                    switchSearch(TURNOFF_SR);
                }
            }
        }.execute();

        //toggle = (ToggleButton)findViewById(R.id.speechToggle);
        smsText = (TextView)findViewById(R.id.sms_text);
        smsSender = (TextView)findViewById(R.id.sms_sender);

        checkTTS();
    }

    private void startDriveMode(){
        speaker.allow(true);
        //speaker.speak(getString(R.string.start_speaking));
        speaker.speak("Drive mode enabled");


    }

    private void checkTTS(){
        Intent check = new Intent();
        check.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        startActivityForResult(check, CHECK_CODE);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if(requestCode == CHECK_CODE){
            if(resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS)
            {
                speaker = new Speaker(this);
            }else {
                Intent install = new Intent();
                install.setAction(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
                startActivity(install);
            }
        }

        startDriveMode();
        initializeSMSReceiver();
        registerSMSReceiver();

    }


    private void initializeSMSReceiver(){
        smsReceiver = new BroadcastReceiver(){
            @Override
            public void onReceive(Context context, Intent intent) {

                Bundle bundle = intent.getExtras();
                if(bundle!=null){
                    Object[] pdus = (Object[])bundle.get("pdus");
                    for(int i=0;i<pdus.length;i++){
                        byte[] pdu = (byte[])pdus[i];
                        SmsMessage message = SmsMessage.createFromPdu(pdu);
                        String text = message.getDisplayMessageBody();
                        String sender = getContactName(message.getOriginatingAddress());
                        speaker.pause(LONG_DURATION);
                        speaker.speak("You have a new message from " + sender + "!");
                        speaker.pause(SHORT_DURATION);
                        speaker.speak(text);
                        smsSender.setText("Message from " + sender);
                        smsText.setText(text);
                    }
                }

            }           
        };      
    }

    private void registerSMSReceiver() {    
        IntentFilter intentFilter = new IntentFilter("android.provider.Telephony.SMS_RECEIVED");
        registerReceiver(smsReceiver, intentFilter);
    }

    private String getContactName(String phone){
        Uri uri = Uri.withAppendedPath(PhoneLookup.CONTENT_FILTER_URI, Uri.encode(phone));
        String projection[] = new String[]{ContactsContract.Data.DISPLAY_NAME};
        Cursor cursor = getContentResolver().query(uri, projection, null, null, null);              
        if(cursor.moveToFirst()){
            return cursor.getString(0);
        }else {
            return "unknown number";
        }
    }

    @Override
    protected void onDestroy() {    
        super.onDestroy();
        unregisterReceiver(smsReceiver);
        speaker.destroy();
    }

    public void onPartialResult(Hypothesis hypothesis) {
        // The recognizer can report a partial result with no hypothesis yet.
        if (hypothesis == null) {
            return;
        }
        //System.out.println("entered");
        String text = hypothesis.getHypstr();
        try {
            if (text.equals("exit")) {
                speaker.speak("yeah get in");
                recognizer.cancel();
                Intent i = new Intent(getApplicationContext(),PocketSphinxActivity.class);
                startActivity(i);
            }

            //Intent i= null;
            /**if (text.equals(TURNON_SR)) {
                recognizer.cancel();
                popPicture();
                startDriveMode();

            }
            if (text.equals(TURNOFF_SR)) {
                //speaker = new Speaker(this);
                speaker.speak(getString(R.string.stop_speaking));
                speaker.allow(false);

                //popPicture2();
            }

            if (text.equals(DESTROY_SR)) {
                recognizer.cancel();
                i = new Intent(getApplicationContext(),PocketSphinxActivity.class);
                startActivity(i);
                onDestroy();
                //popPicture2();
            } **/

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void popPicture() {
        LayoutInflater inflater = getLayoutInflater();
        View layout = inflater.inflate(R.layout.toast_image,(ViewGroup) 
                findViewById(R.id.toast_layout_id));

        Toast toast = new Toast(getApplicationContext());
        toast.setGravity(Gravity.CENTER_HORIZONTAL, 0, 0);
        toast.setDuration(Toast.LENGTH_SHORT);
        toast.setView(layout);
        toast.show();
    }

    public void onResult(Hypothesis hypothesis) {
        ((TextView) findViewById(R.id.result_text)).setText("");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
        }
    }

    public void switchSearch(String searchName) {
        recognizer.stop();
        recognizer.startListening(searchName);
        // put a timeout here so the mic doesn't pick up the phone's own audio

        ((TextView) findViewById(R.id.caption_text)).setText(searchName);
    }

    public void setupRecognizer(File assetsDir) {

        File modelsDir = new File(assetsDir, "models");
        recognizer = defaultSetup()
                .setAcousticModel(new File(modelsDir, "hmm/en-us-semi"))
                .setDictionary(new File(modelsDir, "dict/cmu07a.dic"))
                .setRawLogDir(assetsDir).setKeywordThreshold(1e-10f)
                .getRecognizer();
        recognizer.addListener(this);



        // Create grammar-based searches.
        // recognizer.addKeyphraseSearch(TURNOFF_SR, TURNON_SR);
        //recognizer.addGrammarSearch(TURNON_SR, new File(modelsDir, "grammar/sms.gram"));
        //recognizer.addGrammarSearch(TURNOFF_SR, new File(modelsDir, "grammar/sms.gram"));
        //recognizer.addGrammarSearch(DESTROY_SR, new File(modelsDir, "grammar/sms.gram"));

        File menuGrammar = new File(modelsDir, "grammar/sms.gram");
        recognizer.addGrammarSearch(TURNOFF_SR, menuGrammar);
        //recognizer.addGrammarSearch(TURNON_SR, menuGrammar);
        //recognizer.addGrammarSearch(DESTROY_SR, menuGrammar);

    }

    @Override
    public void onBeginningOfSpeech() {
        // TODO Auto-generated method stub

    }

    @Override
    public void onEndOfSpeech() {
        // TODO Auto-generated method stub

    }
}

Speaker.java

public class Speaker implements OnInitListener {

    private TextToSpeech tts;

    private boolean ready = false;
    private boolean prematureSpeak = false;
    private String ps;

    private boolean allowed = false;

    public Speaker(Context context){
        tts = new TextToSpeech(context, this);      
    }   

    public boolean isAllowed(){
        return allowed;
    }

    //public void allow(boolean allowed){
    public void allow(boolean allowed){
        this.allowed = allowed;
    }

    @Override
    public void onInit(int status) {
        if(status == TextToSpeech.SUCCESS){
            // Change this to match your
            // locale
            tts.setLanguage(Locale.US);
            ready = true;
            if (prematureSpeak)
            {
                speak(ps);
                prematureSpeak = false;
            }
        }else{
            ready = false;
        }
    }


    public void speak(String text){

        // Speak only if the TTS is ready
        // and the user has allowed speech

        if(ready && allowed) {
            HashMap<String, String> hash = new HashMap<String,String>();
            hash.put(TextToSpeech.Engine.KEY_PARAM_STREAM, 
                    String.valueOf(AudioManager.STREAM_NOTIFICATION));
            tts.speak(text, TextToSpeech.QUEUE_ADD, hash);
        }
        else if(!ready) {
            prematureSpeak = true;
            ps = text;
        }
    }

    public void pause(int duration){
        tts.playSilence(duration, TextToSpeech.QUEUE_ADD, null);
    }

    // Free up resources
    public void destroy(){
        tts.shutdown();
    }

    public boolean isSpeaking()
    {
        return tts.isSpeaking();
    }

}

Answer 1:

Your code has several issues:

1) I told you to use keyword spotting mode, but you are still using grammar mode.

2) You need to cancel the recognizer before you start the voice feedback, instead of speaking first and then cancelling:

 if (text.equals("exit")) {
    speaker.speak("yeah get in");
    recognizer.cancel();
    ....

You need to cancel first and then speak:

 if (text.equals("exit")) {
    recognizer.cancel();
    speaker.speak("yeah get in");
    ....

3) Once the speaker has finished, you need to restart the recognizer, but there is no need to launch the activity again. For details, see "How to know when TTS is finished?"

With those changes, you start the recognizer again in onUtteranceCompleted:

public void onUtteranceCompleted(String utteranceId) {
   recognizer.startListening("search name");
}

Do not restart the recognizer in onPartialResult; wait until TTS has finished.
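To make the ordering concrete, here is a minimal sketch of the cancel, speak, restart flow in plain Java. It is not Android code: Recognizer and Tts are stand-in interfaces invented for illustration, and VoiceFeedback is a hypothetical helper, so only the sequencing carries over to the real classes. On Android the completion callback is only delivered for utterances that were queued with an utterance ID (TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID), which is why the stand-in speak call takes one. The sketch also assumes you want to restart only after the last queued utterance has finished.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the recognizer; only the two calls the flow needs (assumption).
interface Recognizer {
    void cancel();
    void startListening(String searchName);
}

// Stand-in for the TTS wrapper: speaks and later runs onDone for that utterance (assumption).
interface Tts {
    void speak(String text, String utteranceId, Runnable onDone);
}

// Hypothetical helper: cancel first, then speak, restart only when all speech has ended.
final class VoiceFeedback {
    private final Recognizer recognizer;
    private final Tts tts;
    private final String searchName;
    private int pending = 0; // utterances queued but not yet finished
    private int nextId = 0;  // used to build a unique utterance ID

    VoiceFeedback(Recognizer recognizer, Tts tts, String searchName) {
        this.recognizer = recognizer;
        this.tts = tts;
        this.searchName = searchName;
    }

    synchronized void say(String text) {
        if (pending == 0) {
            recognizer.cancel(); // stop listening before any audio is played
        }
        pending++;
        tts.speak(text, "feedback-" + nextId++, this::onUtteranceDone);
    }

    // Called once per finished utterance, possibly from another thread.
    private synchronized void onUtteranceDone() {
        pending--;
        if (pending == 0) {
            recognizer.startListening(searchName);
        }
    }
}

final class VoiceFeedbackDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Runnable> completions = new ArrayList<>();
        Recognizer recognizer = new Recognizer() {
            public void cancel() { log.add("cancel"); }
            public void startListening(String name) { log.add("start:" + name); }
        };
        // Fake TTS: records the text and keeps the callback to fire later.
        Tts tts = (text, id, onDone) -> {
            log.add("speak:" + text);
            completions.add(onDone);
        };
        VoiceFeedback feedback = new VoiceFeedback(recognizer, tts, "search name");
        feedback.say("yeah get in");
        completions.forEach(Runnable::run); // simulate the TTS finishing
        System.out.println(log); // [cancel, speak:yeah get in, start:search name]
    }
}
```

In the real activity, the body of say would go where onPartialResult currently calls speaker.speak, and onUtteranceDone corresponds to onUtteranceCompleted above.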