I can't get audio to play when making an "AJAX" request to my server-side API.
I have backend Node.js code that's using IBM's Watson Text-to-Speech service to serve audio from text:
    var render = function(request, response) {
        var options = {
            text: request.params.text,
            voice: 'VoiceEnUsMichael',
            accept: 'audio/ogg; codecs=opus'
        };
        synthesizeAndRender(options, request, response);
    };
    var synthesizeAndRender = function(options, request, response) {
        var synthesizedSpeech = textToSpeech.synthesize(options);
        synthesizedSpeech.on('response', function(eventResponse) {
            if(request.params.text.download) {
                var contentDisposition = 'attachment; filename=transcript.ogg';
                eventResponse.headers['content-disposition'] = contentDisposition;
            }
        });
        synthesizedSpeech.pipe(response);
    };
I have client side code to handle that:
    var xhr = new XMLHttpRequest(),
        audioContext = new AudioContext(),
        source = audioContext.createBufferSource();
    module.controllers.TextToSpeechController = {
        fetch: function() {
            xhr.onload = function() {
                var playAudio = function(buffer) {
                    source.buffer = buffer;
                    source.connect(audioContext.destination);
                    source.start(0);
                };
                // TODO: Handle properly (exiquio)
                // NOTE: error is being received
                var handleError = function(error) {
                    console.log('An audio decoding error occurred');
                };
                audioContext
                    .decodeAudioData(xhr.response, playAudio, handleError);
            };
            xhr.onerror = function() { console.log('An error occurred'); };
            var urlBase = 'http://localhost:3001/api/v1/text_to_speech/';
            var url = [
                urlBase,
                'test',
            ].join('');
            xhr.open('GET', encodeURI(url), true);
            xhr.setRequestHeader('x-access-token', Application.token);
            xhr.responseType = 'arraybuffer';
            xhr.send();
        }
    };
The backend returns the audio that I expect, but my success method, playAudio, is never called. Instead, handleError is always called and the error object is always null.
Could anyone explain what I'm doing wrong and how to correct this? It would be greatly appreciated.
Thanks.
NOTE: The string "test" in the URL becomes a text param on the backend and ends up in the options variable in synthesizeAndRender.
Unfortunately, unlike Chrome's HTML5 Audio implementation, Chrome's Web Audio doesn't support audio/ogg;codecs=opus, which is what your request uses here. You need to set the format to audio/wav for this to work. To be sure it's passed through to the server request, I suggest putting it in the query string (accept=audio/wav, URL-encoded).
Are you just looking to play the audio, or do you need access to the Web Audio API for audio transformation? If you just need to play the audio, I can show you how to easily play this with the HTML5 Audio API (not the Web Audio one). And with HTML5 Audio, you can stream it using the technique below, and you can use the optimal audio/ogg;codecs=opus format.
It's as simple as dynamically setting the source of your audio element, queried from the DOM via something like this:
(in HTML)
(in your JS)
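Here's a rough sketch of what those two steps could look like. It assumes the page has an audio element with the hypothetical id speech-player, that your server reads an accept query parameter as suggested above, and that the endpoint can be reached without the x-access-token header, since an audio element can't send custom request headers. The helper names buildSpeechUrl and playFromUrl are made up for illustration.

```javascript
// Markup assumed on the page (hypothetical id):
//   <audio id="speech-player" controls></audio>

// Build the request URL. The text is a path segment, as in the question,
// and the format goes in the query string (an assumption about the server).
function buildSpeechUrl(urlBase, text, accept) {
  return urlBase + encodeURIComponent(text) +
    '?accept=' + encodeURIComponent(accept);
}

// Point the audio element at the URL and start playback. The browser
// fetches the response itself and can begin playing before it has finished
// downloading.
function playFromUrl(audioElement, url) {
  audioElement.src = url;
  return audioElement.play();
}

// Browser-only usage; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  var player = document.getElementById('speech-player');
  if (player) {
    playFromUrl(player, buildSpeechUrl(
      'http://localhost:3001/api/v1/text_to_speech/',
      'test',
      'audio/ogg;codecs=opus'
    ));
  }
}
```

Note that browsers may refuse to start playback unless it's triggered by a user action such as a click, so in practice you'd probably call playFromUrl from an event handler.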
You can also set the audio element's source via an XMLHttpRequest, but you won't get the streaming. But since you can use a POST method, you're not limited to the text length of a GET request (for this API, ~6KB). To set it in xhr, you create a data URI from a blob response:
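A minimal sketch of that approach, under a few assumptions: the server accepts a POST with a JSON body containing text and accept (the question only shows a GET route), and the function name fetchAndPlay and the deps parameter are hypothetical, there only so the browser objects can be swapped out. It uses an object URL from URL.createObjectURL rather than a literal data: URI, which serves the same purpose here.

```javascript
// Fetch synthesized speech with a POST and play it once fully downloaded.
// The endpoint, the JSON body shape and all names are illustrative.
function fetchAndPlay(endpoint, text, token, audioElement, deps) {
  deps = deps || { XMLHttpRequest: XMLHttpRequest, URL: URL };
  var xhr = new deps.XMLHttpRequest();
  xhr.open('POST', endpoint, true);
  xhr.setRequestHeader('Content-Type', 'application/json');
  xhr.setRequestHeader('x-access-token', token);
  xhr.responseType = 'blob';
  xhr.onload = function() {
    if (xhr.status < 200 || xhr.status >= 300) {
      console.log('Request failed with status ' + xhr.status);
      return;
    }
    // xhr.response is a Blob; wrap it in a temporary URL the element can load.
    var objectUrl = deps.URL.createObjectURL(xhr.response);
    audioElement.src = objectUrl;
    // Release the blob once playback has finished.
    audioElement.onended = function() {
      deps.URL.revokeObjectURL(objectUrl);
    };
    audioElement.play();
  };
  xhr.onerror = function() { console.log('An error occurred'); };
  xhr.send(JSON.stringify({ text: text, accept: 'audio/ogg;codecs=opus' }));
  return xhr;
}
```

In the browser you would call it as fetchAndPlay(url, someText, Application.token, document.getElementById('speech-player')), where the element id is again hypothetical.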
As you can see, with XMLHttpRequest, you have to wait until the data are fully loaded to play. There may be a way to stream from XMLHttpRequest using the very new Media Source Extensions API, which is currently available only in Chrome and IE (no Firefox or Safari). This is an approach I'm currently experimenting with. I'll update here if I'm successful.
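If you do need the Web Audio API, the audio/wav fix from the start of this answer could be sketched roughly as follows. This assumes your server reads an accept query parameter and passes it on to the synthesize call, which the code in the question doesn't do yet. The function name fetchAndDecode and the XhrCtor parameter are hypothetical.

```javascript
// Request WAV instead of Ogg/Opus and decode it with the Web Audio API.
// The accept query parameter is an assumption about the server.
function fetchAndDecode(url, token, audioContext, XhrCtor) {
  var Xhr = XhrCtor || XMLHttpRequest;
  var xhr = new Xhr();
  xhr.open('GET', url + '?accept=' + encodeURIComponent('audio/wav'), true);
  xhr.setRequestHeader('x-access-token', token);
  xhr.responseType = 'arraybuffer';
  xhr.onload = function() {
    if (xhr.status < 200 || xhr.status >= 300) {
      console.log('Request failed with status ' + xhr.status);
      return;
    }
    audioContext.decodeAudioData(xhr.response, function(buffer) {
      // A buffer source can only be started once, so create one per playback.
      var source = audioContext.createBufferSource();
      source.buffer = buffer;
      source.connect(audioContext.destination);
      source.start(0);
    }, function(error) {
      console.log('An audio decoding error occurred', error);
    });
  };
  xhr.onerror = function() { console.log('An error occurred'); };
  xhr.send();
  return xhr;
}
```

In the browser this would be called as fetchAndDecode(urlBase + 'test', Application.token, new AudioContext()), using the names from the question.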