Speech to Text on Android [closed]

2019-01-10 21:50发布

问题:

I am looking to create an app which has Speech to text.

I am aware of this kind of ability using the RecognizerIntent: http://android-developers.blogspot.com/search/label/Speech%20Input

However - I do not want a new Intent to be popped up, I want to do the analysis a certain points in my current app, and I dont want it to pop something up stating that it is currently attempting to record your voice.

Has anybody any ideas on how best to do this. I was perhaps thinking of trying Sphinx 4 - but I dont know if this would be able to run on Android - has anyone got any advice or experience?!

I was wondering if I could alter the code here to perhaps not bothering to show the UI or button and just do the processing: http://developer.android.com/resources/samples/ApiDemos/src/com/example/android/apis/app/VoiceRecognition.html

Cheers,

回答1:

If you don't want to use the RecognizerIntent to do speech recognition, you could still use the SpeechRecognizer class to do it. However, using that class is a little bit more tricky than using the intent. As a final note, I would highly suggest to let the user know when he is recorded, otherwise he might be very set up, when he finally finds out.

Edit: A small example inspired (but changed) from this stack overflow entry

    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE,
            "com.domain.app");

    SpeechRecognizer recognizer = SpeechRecognizer
            .createSpeechRecognizer(this.getApplicationContext());
    RecognitionListener listener = new RecognitionListener() {
        @Override
        public void onResults(Bundle results) {
            ArrayList<String> voiceResults = results
                    .getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
            if (voiceResults == null) {
                Log.e(TAG, "No voice results");
            } else {
                Log.d(TAG, "Printing matches: ");
                for (String match : voiceResults) {
                    Log.d(TAG, match);
                }
            }
        }

        @Override
        public void onReadyForSpeech(Bundle params) {
            Log.d(TAG, "Ready for speech");
        }

        @Override
        public void onError(int error) {
            Log.d(TAG,
                    "Error listening for speech: " + error);
        }

        @Override
        public void onBeginningOfSpeech() {
            Log.d(TAG, "Speech starting");
        }

        @Override
        public void onBufferReceived(byte[] buffer) {
            // TODO Auto-generated method stub

        }

        @Override
        public void onEndOfSpeech() {
            // TODO Auto-generated method stub

        }

        @Override
        public void onEvent(int eventType, Bundle params) {
            // TODO Auto-generated method stub

        }

        @Override
        public void onPartialResults(Bundle partialResults) {
            // TODO Auto-generated method stub

        }

        @Override
        public void onRmsChanged(float rmsdB) {
            // TODO Auto-generated method stub

        }
    };
    recognizer.setRecognitionListener(listener);
    recognizer.startListening(intent);

Important: Run this code from the UI Thread.



回答2:

What is built into Android (that you launch via the intent) is a client activity that captures your voice and sends the audio to a Google server for recognition. You could build something similar. You could host sphinx yourself (or use cloud recognition services like Yapme.com), capture the voice yourself, send the audio to a recognizer, and get back text results to your app. I don't know of a way to leverage the Google recognition services without use of the Intent on Android (or through Chrome).

The general consensus I've seen so far is that today's smartphones don't really have the horsepower to do Sphinx-like speech recognition. You may want to explore running a client recognizer for yourself, but Google uses server recognition.

For some related info see:

  • Google's voice search speech recognition service
  • Is it possible to use Android API outside an Android project?
  • Speech Recognition API


回答3:

In your activity do the following:

Image button buttonSpeak = findView....;// initialize it.
buttonSpeak.setOnClickListener(new View.OnClickListener() {

        @Override
        public void onClick(View v) {
            promptSpeechInput();
        }
    });



private void promptSpeechInput() {
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
            RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
            getString(R.string.speech_prompt));
    try {
        startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
    } catch (ActivityNotFoundException a) {
        Toast.makeText(getApplicationContext(),
                getString(R.string.speech_not_supported),
                Toast.LENGTH_SHORT).show();
    }
}

    @Override
   protected void onActivityResult(int requestCode, int resultCode, Intent 
     data) {
    super.onActivityResult(requestCode, resultCode, data);

    switch (requestCode) {
        case REQ_CODE_SPEECH_INPUT: {
            if (resultCode == RESULT_OK && null != data) {

                result = data
                        .getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);

      EditText input ((EditText)findViewById(R.id.editTextTaskDescription));
      input.setText(result.get(0)); // set the input data to the editText alongside if want to.

            }
            break;
        }

    }
}