I am editing my question to reflect the issue I am having in my application.
I am trying to take streamed audio and convert it to text using Google speech recognition (speech to text), then pass that text as input to a conversation bot on Watson. Watson then returns its answer.
The latter half works great.
The issue I am having is that I can't get the script to pass the text from the recorded speech to the Watson service I created.
I don't get an error; I just get nothing. The mic is working (I tested it with another script). The program actually indicates that it could not understand my response (because no text is passed, I assume). Here is my code:
import os
import time

import speech_recognition as sr
import watson_developer_cloud
from gtts import gTTS

# Set up Assistant service.
service = watson_developer_cloud.AssistantV1(
    #username = 'USERNAME', # replace with service username
    #password = 'PASSWORD', # replace with service password
    iam_api_key = 'xxxxxxxxxx', # replace with service IAM API key
    url = 'xxxxxxxxxx', # replace with service URL
    version = 'xxxxxxxxxx' # replace with API version date
)
workspace_id = 'xxxxxxxxxxxxxx' # replace with workspace ID

def getaudiodevices():
    # List the ALSA capture devices as "hw:<card>,<device>".
    devices = os.popen("arecord -l")
    device_string = devices.read()
    device_string = device_string.split("\n")
    for line in device_string:
        if line.find("card") != -1:
            print("hw:" + line[line.find("card") + 5] + "," + line[line.find("device") + 7])

def speak(audiostring):
    # Print the text, then synthesize it with gTTS and play it back.
    print(audiostring)
    tts = gTTS(text=audiostring, lang='en')
    tts.save('audio.mp3')
    os.system('mpg321 audio.mp3')

def recordaudio():
    # Record Audio
    r = sr.Recognizer()
    with sr.Microphone(0) as source:
        print("Say something!")
        audio = r.listen(source, phrase_time_limit=10)
    # Speech recognition ******
    data = " "
    try:
        data = r.recognize_google(audio)
        print("You said: " + data)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
    return data

# Initialize with empty value to start the conversation.
user_input = ''
context = {}
current_action = ''

# Main input/output loop
while current_action != 'end_conversation':
    # Send message to Assistant service.
    response = service.message(
        workspace_id = workspace_id,
        input = {
            'text': user_input
        },
        context = context
    )
    # Print the output from dialog, if any.
    if response['output']['text']:
        print(response['output']['text'][0])
        speak(response['output']['text'][0])
    # Update the stored context with the latest received from the dialog.
    context = response['context']
    # Check for action flags sent by the dialog.
    if 'action' in response['output']:
        current_action = response['output']['action']
    # User asked what time it is, so we output the local system time.
    if current_action == 'display_time':
        print('The current time is ' + time.strftime('%I:%M:%S %p') + '.')
        speak('The current time is ' + time.strftime('%I:%M:%S %p') + '.')
    # If we're not done, prompt for next round of input.
    if current_action != 'end_conversation':
        user_input = input('>> ')

Note: currently I can type the input on the keyboard and it works. I want the user input to come from the text transcribed from the audio by Google speech recognition. I need to get the data from the recorded audio into the main part of my Python script, where it communicates with the Watson service.
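To make the goal concrete, here is a minimal sketch of the wiring I have in mind. It is not tested against the real services. The names run_conversation, send_message and get_user_input are hypothetical helpers I made up for illustration. The sketch assumes that recordaudio() returns the transcribed string (or only whitespace when nothing was recognized) and that each Watson response has the same 'output' and 'context' keys my script already reads.

```python
from typing import Any, Callable, Dict, List


def run_conversation(
    send_message: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    get_user_input: Callable[[], str],
) -> List[str]:
    """Run the dialog loop, taking every user turn from get_user_input().

    send_message and get_user_input are hypothetical callables: in my script
    they would wrap service.message(...) and recordaudio() respectively.
    Returns the list of replies that were printed.
    """
    user_input = ''  # an empty first message starts the conversation
    context: Dict[str, Any] = {}
    current_action = ''
    replies: List[str] = []

    while current_action != 'end_conversation':
        response = send_message(user_input, context)

        # Print the output from the dialog, if any.
        if response['output']['text']:
            reply = response['output']['text'][0]
            print(reply)
            replies.append(reply)

        # Keep the context and any action flag returned by the dialog.
        context = response['context']
        if 'action' in response['output']:
            current_action = response['output']['action']

        if current_action != 'end_conversation':
            # Assumption: ask again while the transcription is empty, so a
            # failed recognition is never sent to the dialog as blank text.
            user_input = ''
            while not user_input:
                user_input = get_user_input().strip()

    return replies


# Intended wiring in my script (commented out: it needs the mic and the service):
# run_conversation(
#     lambda text, ctx: service.message(
#         workspace_id=workspace_id, input={'text': text}, context=ctx),
#     recordaudio,
# )
```

Passing the two steps in as functions is only so the loop can be exercised without a microphone or a Watson account. The part I care about is that the value returned by recordaudio() becomes user_input for the next message.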
Can someone help me figure this out?