I want to write a speech recognition script in Python for the Raspberry Pi, and I need an asynchronous/continuous speech recognition library. By asynchronous I mean that recognition should run endlessly, without any keyboard input, until the spoken text matches one of an array of words; the script should then print the spoken text to the terminal and restart recognition. I already had a look at PocketSphinx, but after a few hours of Googling I didn't find anything about asynchronous recognition with it.
Do you know of any library that is capable of this?
You can use PocketSphinx on the Raspberry Pi. You need to download the latest version, 5prealpha.
It can listen for multiple keyphrases. The code should be something like this:
import os
from pocketsphinx import *
import pyaudio

modeldir = "../../../model"

# Create a decoder with a certain model and a keyphrase list
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'en-us/en-us'))
config.set_string('-dict', os.path.join(modeldir, 'en-us/cmudict-en-us.dict'))
config.set_string('-kws', 'keyphrase.list')

# Open the microphone: 16 kHz, 16-bit, mono
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()

# Process audio chunk by chunk. When a keyphrase is detected, perform the action and restart the search
decoder = Decoder(config)
decoder.start_utt()
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    if decoder.hyp() is not None:
        print("Detected keyword %s, restarting search" % decoder.hyp().hypstr)
        decoder.end_utt()
        decoder.start_utt()
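The detect-and-restart logic of this loop can also be wrapped in a small generator that only reports phrases from a given list, which is closer to what the question asks for. This is a minimal sketch, not part of the PocketSphinx API: `detect_keyphrases`, `chunks`, and `wanted` are hypothetical names, and it assumes the decoder provides `start_utt`, `process_raw`, `hyp`, and `end_utt` as used in the answer, with the recognized text available as `hyp().hypstr`.

```python
def detect_keyphrases(decoder, chunks, wanted):
    """Yield each recognized phrase that appears in `wanted`.

    decoder: object with start_utt/process_raw/hyp/end_utt, for example a
             pocketsphinx Decoder configured with a -kws list (assumption).
    chunks:  any iterable of raw 16-bit mono audio buffers.
    wanted:  collection of phrases to report.
    """
    wanted = {w.strip().lower() for w in wanted}
    decoder.start_utt()
    try:
        for buf in chunks:
            decoder.process_raw(buf, False, False)
            hyp = decoder.hyp()
            if hyp is None:
                continue  # nothing detected yet, keep feeding audio
            phrase = hyp.hypstr.strip().lower()
            # Restart the search so the next phrase can be detected.
            decoder.end_utt()
            decoder.start_utt()
            if phrase in wanted:
                yield phrase
    finally:
        # Close the utterance that is still open when iteration stops.
        decoder.end_utt()
```

With the PyAudio stream from the answer, `chunks` could be something like `iter(lambda: stream.read(1024), b'')`, and the caller would simply print each phrase the generator yields.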
The keyphrase.list file should look like this, one phrase per line with a threshold:
open the door /1e-40/
close the door /1e-40/
how are you /1e-30/
Thresholds must be tuned for every keyphrase to balance false alarms against missed detections.
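Since the thresholds will probably be adjusted several times while tuning, it can help to generate the list file from a dictionary instead of editing it by hand. The following is only a sketch: `format_keyphrase_list` is a hypothetical helper name, and the output format is taken from the three example lines above.

```python
def format_keyphrase_list(thresholds):
    """Return keyphrase list text, one 'phrase /threshold/' entry per line.

    thresholds: dict mapping each phrase to its detection threshold.
    """
    lines = []
    for phrase, threshold in thresholds.items():
        # '%g' renders 1e-40 as '1e-40', matching the example file.
        lines.append("%s /%g/" % (phrase.strip().lower(), threshold))
    return "\n".join(lines) + "\n"
```

The returned text can then be written to keyphrase.list with a normal `open('keyphrase.list', 'w')` call before the decoder is created.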
Well, you could perhaps change Jasper's name to something else, maybe even to an empty string.