I want to use the new MS Speech Translation API, but I am working with Go, so there is no SDK. I have a WebSockets implementation for the previous Translator Speech API, so raw WebSockets are no issue.
The documentation states that it uses WebSockets, but I was unable to find the endpoints in the documentation. Does anyone know what the WS endpoints are and which path/header parameters they take?
EDIT:
The documentation also says: "If you already have code that uses Bing Speech or Translator Speech via WebSockets, you can update it to use the Speech service. The WebSocket protocols are compatible, only the endpoints are different." But the new endpoints are missing.
After digging into the binaries of the client SDKs, I found the Speech Translation endpoint to be wss://<REGION>.s2s.speech.microsoft.com/speech/translation/cognitiveservices/v1
Another problem is that the WebSocket protocol is NOT compatible, despite what the documentation says. The good news is that, after some experiments, I found that the new Speech Translation WS API uses the same protocol as the old Bing Speech WS API, except for the URL query parameters. The Bing Speech API has a language parameter, whereas the Speech Translation preview API has from, to, voice and features. The from and to parameters work as expected; you can even send several languages in to (comma-separated, though the TTS is then missing). I have not tried voice. The features parameter appears to do nothing: there are always partial results, timing info and TTS.
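Since I am on Go, here is a minimal sketch of how the URL with the from and to query parameters could be assembled using only the standard library. The function name buildTranslationURL, the region "westus" and the language codes are my own illustrative assumptions, and I am assuming the service accepts a percent-encoded comma in to; authentication and the actual WebSocket dial are left out.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildTranslationURL assembles the endpoint found in the SDK binaries
// together with the from/to query parameters.
// The name and signature are hypothetical, for illustration only.
func buildTranslationURL(region, from string, to []string) string {
	q := url.Values{}
	q.Set("from", from)
	// Several target languages are sent comma-separated in a single "to".
	q.Set("to", strings.Join(to, ","))

	u := url.URL{
		Scheme:   "wss",
		Host:     region + ".s2s.speech.microsoft.com",
		Path:     "/speech/translation/cognitiveservices/v1",
		RawQuery: q.Encode(), // keys are sorted; the comma becomes %2C
	}
	return u.String()
}

func main() {
	// Region and language codes are assumed example values.
	fmt.Println(buildTranslationURL("westus", "en-US", []string{"de"}))
}
```

The resulting string can be passed to whichever WebSocket client you already use for the old API.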
The responses are also different from, but similar to, those of Bing Speech. They have headers, and there are several different JSON payloads. Just observe the raw strings.
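To observe the raw strings more comfortably, something like the following sketch may help. It assumes the text messages are framed like Bing Speech ones, that is, "Name: value" header lines separated by CRLF, then a blank line, then the JSON body; I have not confirmed that every message follows this layout. The helper name splitMessage and the sample message are made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMessage separates the header lines from the body of a text message.
// Assumption: headers end at the first blank line (CRLF CRLF) and each
// header is a "Name: value" line. The helper name is hypothetical.
func splitMessage(raw string) (map[string]string, string) {
	headers := map[string]string{}
	// If no blank line is found, everything is treated as headers.
	head, body, _ := strings.Cut(raw, "\r\n\r\n")
	for _, line := range strings.Split(head, "\r\n") {
		name, value, ok := strings.Cut(line, ":")
		if ok {
			headers[strings.TrimSpace(name)] = strings.TrimSpace(value)
		}
	}
	return headers, body
}

func main() {
	// Made-up sample message, not a captured response.
	raw := "Path: example.path\r\nContent-Type: application/json\r\n\r\n{\"k\":\"v\"}"
	headers, body := splitMessage(raw)
	fmt.Println(headers["Path"], body)
}
```

Dispatching on a header first and only then decoding the JSON keeps the handling of the different payloads separate.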
As this is a preview API it can change at any time.
There haven't been substantial changes to the WebSocket protocol, so the old documentation should be reasonably accurate.
The Microsoft Cognitive Services Speech SDK doesn't support Go yet; it is on the roadmap, but it will not happen this calendar year.
thx
Wolfgang