I have managed the "overview tutorial" : https://cloud.google.com/speech/docs/getting-started Then I tried to use my own audio file . I uploaded a .flac file with a sample rate of 16000Hz.
I only changed the sync-request.json
file below with my own audio file hosted on google cloud storage (gs://my-bucket/test4.flac
)
{
"config": {
"encoding":"flac",
"sample_rate": 16000
},
"audio": {
"uri":"gs://my-bucket/test4.flac"
}
}
The file is well recognized but the request return an "INVALID_ARGUMENT" error
{
"error": {
"code": 400,
"message": "Unable to recognize speech, code=-73541, possible error in recognition config. Please correct the config and retry the request.",
"status": "INVALID_ARGUMENT"
}
}
My bad, as the doc "https://cloud.google.com/speech/docs/basics", the .flac file have to be a 16-bit PCM
Sumup:
Encoding: FLAC
Channels: 1 @ 16-bit
Samplerate: 16000Hz
/!\ pay attention to not export a stereo file (2 channels) file which throw an other error (only one channel accepted) Google speech API internal server error -83104
As per this answer, all encodings support only 1 channel (mono) audio
I was creating the FLAC file with this command:
But adding the
-ac 1
(setting number of audio channels to 1) fixed this issue.Here is my full
Node.js
codeSample rate could be 16000 or 44100 or other valid ones, and encoding can be FLAC or LINEAR16. Cloud Speech Docs