Run Pocketsphinx and Google TTS together

Posted 2019-06-02 10:12

Question:

I want to start a new activity that begins recognizing speech immediately and can read incoming messages aloud as soon as the activity has started.

This code was merged from the default demo projects below. It runs well in the default configuration. However, I want to remove the button that acts as the trigger and use voice instead to trigger further actions in SMSReaderMain.java, so I am using PocketSphinx for Android to make that possible.

pocketSphinxAndroidDemo-preAlpha project

Android text to Speech Tutorial project

It builds without errors, but when I run it on an actual device it force-closes. Logcat shows these errors:

 03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.330: I/cmusphinx(8505): INFO: kws_search.c(417): KWS(beam: -1080, plp: -23, default threshold -450)
03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.330: I/cmusphinx(8505): INFO: kws_search.c(417): KWS(beam: -1080, plp: -23, default threshold -450)
03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.330: E/cmusphinx(8505): ERROR: "kws_search.c", line 158: The word '/1e-60/' is missing in the dictionary
03-16 23:09:15.340: I/TextToSpeech(8505): Set up connection to ComponentInfo{com.svox.pico/com.svox.pico.PicoService}
03-16 23:09:15.340: I/TextToSpeech(8505): Set up connection to ComponentInfo{com.svox.pico/com.svox.pico.PicoService}

Could you help me find what is wrong, ideally with an example? Here is my code:

SMSReaderMain.java

package edu.cmu.pocketsphinx.demo;


import static android.widget.Toast.makeText;
import static edu.cmu.pocketsphinx.SpeechRecognizerSetup.defaultSetup;

import java.io.File;
import java.io.IOException;
import java.util.HashMap;

import android.app.Activity;
import edu.cmu.pocketsphinx.Assets;
import edu.cmu.pocketsphinx.Hypothesis;
import edu.cmu.pocketsphinx.RecognitionListener;
import edu.cmu.pocketsphinx.SpeechRecognizer;
import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.content.IntentFilter;
import android.database.Cursor;
import android.net.Uri;
import android.os.AsyncTask;
import android.os.Bundle;
import android.provider.ContactsContract;
import android.provider.ContactsContract.PhoneLookup;
import android.speech.tts.TextToSpeech;
import android.telephony.SmsMessage;
import android.view.Gravity;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
//import android.widget.CompoundButton;
//import android.widget.CompoundButton.OnCheckedChangeListener;
import android.widget.TextView;
import android.widget.Toast;
//import android.widget.ToggleButton;



public class SMSReaderMain extends Activity implements RecognitionListener {    

    private final int CHECK_CODE = 0x1; 
    private final int LONG_DURATION = 5000;
    private final int SHORT_DURATION = 1200;

    private Speaker speaker;


    private TextView smsText;
    private TextView smsSender; 
    private BroadcastReceiver smsReceiver;

    public static final String TURNON_SR = "drive mode";
    public static final String TURNOFF_SR = "disable drive mode";
    public static final String DESTROY_SR = "exit drive mode";


    public SpeechRecognizer recognizer;
    public HashMap<String, Integer> captions;


    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        setContentView(R.layout.main_sms);  

        new AsyncTask<Void, Void, Exception>() {
            @Override
            protected Exception doInBackground(Void... params) {
                try {
                    Assets assets = new Assets(SMSReaderMain.this);
                    File assetDir = assets.syncAssets();
                    setupRecognizer(assetDir);
                } catch (IOException e) {
                    return e;
                }
                return null;
            }


        }.execute();


        //toggle = (ToggleButton)findViewById(R.id.speechToggle);
        smsText = (TextView)findViewById(R.id.sms_text);
        smsSender = (TextView)findViewById(R.id.sms_sender);

        startDriveMode();
        checkTTS();
        initializeSMSReceiver();
        registerSMSReceiver();

    }

    private void startDriveMode(){
        speaker = new Speaker(this);
        speaker.allow(true);
        speaker.speak(getString(R.string.start_speaking));
        //speaker.speak("Drive mode now will be enabled. I will read your new messages for you now");

    }

    private void checkTTS(){
        Intent check = new Intent();
        check.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
        startActivityForResult(check, CHECK_CODE);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if(requestCode == CHECK_CODE){
            if(resultCode == TextToSpeech.Engine.CHECK_VOICE_DATA_PASS){
                speaker = new Speaker(this);
            }else {
                Intent install = new Intent();
                install.setAction(TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
                startActivity(install);
            }
        }
    }

    private void initializeSMSReceiver(){
        smsReceiver = new BroadcastReceiver(){
            @Override
            public void onReceive(Context context, Intent intent) {

                Bundle bundle = intent.getExtras();
                if(bundle!=null){
                    Object[] pdus = (Object[])bundle.get("pdus");
                    for(int i=0;i<pdus.length;i++){
                        byte[] pdu = (byte[])pdus[i];
                        SmsMessage message = SmsMessage.createFromPdu(pdu);
                        String text = message.getDisplayMessageBody();
                        String sender = getContactName(message.getOriginatingAddress());
                        speaker.pause(LONG_DURATION);
                        speaker.speak("You have a new message from" + sender + "!");
                        speaker.pause(SHORT_DURATION);
                        speaker.speak(text);
                        smsSender.setText("Message from " + sender);
                        smsText.setText(text);
                    }
                }

            }           
        };      
    }

    private void registerSMSReceiver() {    
        IntentFilter intentFilter = new IntentFilter("android.provider.Telephony.SMS_RECEIVED");
        registerReceiver(smsReceiver, intentFilter);
    }

    private String getContactName(String phone){
        Uri uri = Uri.withAppendedPath(PhoneLookup.CONTENT_FILTER_URI, Uri.encode(phone));
        String projection[] = new String[]{ContactsContract.Data.DISPLAY_NAME};
        Cursor cursor = getContentResolver().query(uri, projection, null, null, null);              
        if(cursor.moveToFirst()){
            return cursor.getString(0);
        }else {
            return "unknown number";
        }
    }

    @Override
    protected void onDestroy() {    
        super.onDestroy();
        unregisterReceiver(smsReceiver);
        speaker.destroy();
    }

    public void onPartialResult(Hypothesis hypothesis) {
        String text = hypothesis.getHypstr();
        try {
            Intent i = null;
            if (text.equals("drive mode")) {
                recognizer.cancel();
                popPicture();
                startDriveMode();
            }

            if (text.equals("disable drive mode")) {
                speaker = new Speaker(this);
                speaker.speak(getString(R.string.stop_speaking));
                speaker.allow(false);
                //popPicture2();
            }

            if (text.equals("exit drive mode")) {
                recognizer.cancel();
                i = new Intent(getApplicationContext(), PocketSphinxActivity.class);
                startActivity(i);
                onDestroy();
                //popPicture2();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void popPicture() {
        LayoutInflater inflater = getLayoutInflater();
        View layout = inflater.inflate(R.layout.toast_image,(ViewGroup) 
                findViewById(R.id.toast_layout_id));

        Toast toast = new Toast(getApplicationContext());
        toast.setGravity(Gravity.CENTER_HORIZONTAL, 0, 0);
        toast.setDuration(Toast.LENGTH_SHORT);
        toast.setView(layout);
        toast.show();
    }

    public void onResult(Hypothesis hypothesis) {
        ((TextView) findViewById(R.id.result_text)).setText("");
        if (hypothesis != null) {
            String text = hypothesis.getHypstr();
            makeText(getApplicationContext(), text, Toast.LENGTH_SHORT).show();
        }
    }

    public void switchSearch(String searchName) {
        recognizer.stop();
        recognizer.startListening(searchName);
        String caption = getResources().getString(captions.get(searchName));
        ((TextView) findViewById(R.id.caption_text)).setText(caption);
    }

    public void setupRecognizer(File assetsDir) {
        File modelsDir = new File(assetsDir, "models");
        recognizer = defaultSetup()
                .setAcousticModel(new File(modelsDir, "hmm/en-us-semi"))
                .setDictionary(new File(modelsDir, "dict/cmu07a.dic"))
                .setRawLogDir(assetsDir).setKeywordThreshold(1e-20f)
                .getRecognizer();
        recognizer.addListener(this);


        // Create grammar-based searches.
        recognizer.addKeywordSearch(TURNON_SR, new File(modelsDir, "keywords.list"));
        recognizer.addKeywordSearch(TURNOFF_SR, new File(modelsDir, "keywords.list"));
        recognizer.addKeywordSearch(DESTROY_SR, new File(modelsDir, "keywords.list"));

        //File menuGrammar = new File(modelsDir, "grammar/sms.gram");
       // recognizer.addGrammarSearch(TURNOFF_SR, menuGrammar);
        //recognizer.addGrammarSearch(TURNON_SR, menuGrammar);
        //recognizer.addGrammarSearch(DESTROY_SR, menuGrammar);
    }

    @Override
    public void onBeginningOfSpeech() {
        // TODO Auto-generated method stub

    }

    @Override
    public void onEndOfSpeech() {
        // TODO Auto-generated method stub

    }
}

Answer 1:

The error is that you didn't initialize the speaker field on line 98, so its value is null and you get a NullPointerException. You need to initialize a variable before using it.

You are also not using pocketsphinx correctly. You need to use pocketsphinx in keyword spotting mode with all three phrases you want to recognize.

You need to create a file named keywords.list in the assets/sync/model folder with key phrases like this:

drive mode /1e-40/
disable drive mode /1e-40/
exit drive mode /1e-40/

The numbers here are detection thresholds. You can try values such as 1e-20 or 1e-60 and tune each threshold for the best balance between detections and false alarms.

Then you need to set this list of phrases as a keyword search. There is no need to use three grammar searches; you can run one search with all three key phrases:
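Since each line of keywords.list has to pair a phrase with a threshold between slashes, a quick format check may help catch slips before the file reaches the recognizer. The following is a minimal sketch in plain Java; KeywordListChecker and findBadLines are hypothetical names, not part of PocketSphinx, and the checker assumes every line carries a threshold as in the list above. The logcat output in the question reports '/1e-60/' as a missing dictionary word; a malformed line is one possible cause, though that is only a guess.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PocketSphinx: flags lines that do not
// look like "<phrase> /<threshold>/".
final class KeywordListChecker {
    // A phrase, some whitespace, then a token enclosed in slashes.
    private static final Pattern ENTRY = Pattern.compile("(\\S.*?)\\s+/([^/\\s]+)/");

    // Returns the 1-based numbers of the lines that fail the check.
    static List<Integer> findBadLines(List<String> lines) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue; // assumption: this checker simply skips blank lines
            }
            Matcher m = ENTRY.matcher(line);
            if (!m.matches() || !isNumber(m.group(2))) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    // True when the token between the slashes parses as a number such as 1e-40.
    private static boolean isNumber(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

On a desktop machine the lines could be loaded with java.nio.file.Files.readAllLines and passed to findBadLines; an empty result means every non-blank line matched the expected shape.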

recognizer.addKeywordSearch("commands", new File(modelsDir, "keywords.list"));

This configures the recognizer to look for the three commands and ignore everything else.

Now you can process the results in the onPartialResult method, where detections are reported:

public void onPartialResult(Hypothesis hypothesis) {
    if (hypothesis == null) {
        return; // nothing has been detected yet
    }
    String text = hypothesis.getHypstr();
    if (text.equals("drive mode")) {
        recognizer.cancel(); // stop pocketsphinx
        startDriveMode(); // start any other actions, including speech recognition
    }
    ....
}
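Because "drive mode" is also contained in the other two key phrases, matching the whole hypothesis string exactly (rather than by substring) keeps the three commands apart. The sketch below shows one way to organize that in plain Java; CommandDispatcher is a hypothetical class, and in the activity the registered actions would be calls such as startDriveMode().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher: maps a recognized key phrase to an action.
final class CommandDispatcher {
    private final Map<String, Runnable> actions = new HashMap<>();

    // Associates an exact phrase with the action to run when it is heard.
    void register(String phrase, Runnable action) {
        actions.put(phrase, action);
    }

    // Runs the action for the given text; returns false for null or unknown text.
    boolean dispatch(String hypothesisText) {
        if (hypothesisText == null) {
            return false;
        }
        Runnable action = actions.get(hypothesisText.trim());
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }
}
```

Inside onPartialResult this could be called as dispatcher.dispatch(hypothesis.getHypstr()) after the null check on the hypothesis itself.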

If you want to restart the search you can call recognizer.startListening("commands");

public void startDriveMode() {
    // do whatever you want here, use Google TTS
    recognizer.startListening("commands");
}
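The stop, act, then restart sequence described above can be sketched independently of Android. In this minimal example, CommandListener and Voice are hypothetical stand-ins for the recognizer and the TTS wrapper, and the completion callback is an assumption about how the TTS side reports that it has finished (on Android, an UtteranceProgressListener could play that role).

```java
// Hypothetical stand-in for the recognizer.
interface CommandListener {
    void startListening(String searchName);
    void cancel();
}

// Hypothetical stand-in for the TTS wrapper; onDone runs when speech ends.
interface Voice {
    void speak(String text, Runnable onDone);
}

// Makes sure listening is stopped while speaking and resumed afterwards.
final class ListenSpeakCoordinator {
    private final CommandListener listener;
    private final Voice voice;
    private final String searchName;

    ListenSpeakCoordinator(CommandListener listener, Voice voice, String searchName) {
        this.listener = listener;
        this.voice = voice;
        this.searchName = searchName;
    }

    // Stops the recognizer, speaks, then restarts the keyword search.
    void say(String text) {
        listener.cancel();
        voice.speak(text, () -> listener.startListening(searchName));
    }
}
```

Keeping the restart inside the completion callback means the recognizer is not listening while the device is still talking, which avoids the recognizer picking up the synthesized speech.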


Answer 2:

Actually, the right threshold value varies from word to word in PocketSphinx. I used it for an always-listening feature to find a misplaced Android phone. I hope this helps:

https://github.com/manmaybarot/where-is-my-phone-Android-App

  • You cannot use the phone's microphone from different libraries at the same time, e.g. TTS and PocketSphinx. If you do, you might get a "mic is already in use" error.

  • PocketSphinx works well for male voices but is not as good at recognizing female voices.