The Speech Synthesis API allows text-to-speech functionality in Chrome Beta. However, results from TTS requests are automatically played by the browser. How do I access the audio results for post-processing and disable the default behavior of the API?
The TTS system exposes no standard audio output, and that appears to be intentional, so it is unlikely to change anytime soon.
To understand why, you can look at the other side of this interface where a browser extension can act as a TTS Engine and provide the voices the client can use:
Being a valid TTS Engine accessible through this API in Chrome comes down to supporting the starting, pausing, canceling, and resuming of TTS requests, and sending progress updates as events of the following types:
https://developer.chrome.com/extensions/tts#type-TtsEvent
As such, there is no standard way for a TTS engine to indicate the resulting audio aside from actually playing it. Depending on the specific TTS engine, it may not use a standard audio format, or may not even go through the browser's normal audio device access. (For example, it may be forwarding the text to the platform's accessibility system.)
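To make the point concrete, here is a minimal sketch of the engine-side contract described above. It is a simplified, hypothetical model: the names `EngineEvent`, `SendEvent`, and `speak` are invented for illustration and are not the real `chrome.ttsEngine` typings. What it is meant to show is that the events carry progress information only, with no field for audio data.

```typescript
// Hypothetical, simplified model of what a TTS engine reports back.
// These types are illustrative assumptions, not the actual Chrome typings.
type EngineEventType = "start" | "word" | "sentence" | "marker" | "end" | "error";

interface EngineEvent {
  type: EngineEventType;
  charIndex?: number;    // position in the utterance the event refers to
  errorMessage?: string; // only meaningful for "error" events
}

type SendEvent = (event: EngineEvent) => void;

// A toy engine: it "speaks" by reporting progress only.
// How (or whether) audio is produced is entirely the engine's business,
// so nothing audio-related ever reaches the caller.
function speak(utterance: string, sendEvent: SendEvent): void {
  sendEvent({ type: "start", charIndex: 0 });

  // Report one "word" event per whitespace-delimited token.
  const wordPattern = /\S+/g;
  let match: RegExpExecArray | null;
  while ((match = wordPattern.exec(utterance)) !== null) {
    sendEvent({ type: "word", charIndex: match.index });
  }

  sendEvent({ type: "end", charIndex: utterance.length });
}

// Usage: the client side can only observe progress events.
const received: EngineEvent[] = [];
speak("hello brave world", (event) => {
  received.push(event);
});
```

A client listening to these events can tell where playback is, but has nothing to post-process, which is the limitation the question runs into.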
If you know something about a specific TTS Engine (or create your own), then you can build your own interface (see note 1) to retrieve the audio file. But that TTS Engine must then be installed in every client's browser where you want to use it. This is why any solution must point you to a specific TTS Engine or an outside TTS solution if you want to control the playback beyond adjusting the valid inputs to a TTS Engine request (relative pitch, relative volume, relative rate, gender).
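As a rough illustration of what "adjusting valid inputs" amounts to, the sketch below clamps the request parameters before handing them to the browser. The helper names `SpeakOptions`, `clamp`, and `normalizeOptions` are hypothetical; the ranges used (rate 0.1 to 10, pitch 0 to 2, volume 0 to 1) are the ones documented for the Web Speech API, and an individual engine may support a narrower range.

```typescript
// Hypothetical helper types and functions, for illustration only.
interface SpeakOptions {
  rate?: number;   // relative speaking rate, 1 is the default
  pitch?: number;  // relative pitch, 1 is the default
  volume?: number; // relative volume, 1 is the default
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

// Fill in defaults and keep every value inside the documented range.
function normalizeOptions(options: SpeakOptions): Required<SpeakOptions> {
  return {
    rate: clamp(options.rate ?? 1, 0.1, 10),
    pitch: clamp(options.pitch ?? 1, 0, 2),
    volume: clamp(options.volume ?? 1, 0, 1),
  };
}

const normalized = normalizeOptions({ rate: 25, pitch: -1 });

// Browser-only usage: skipped when no speech synthesis is available.
// Note that speak() plays the audio itself and returns no audio data.
const env = globalThis as any;
if (env.speechSynthesis !== undefined && env.SpeechSynthesisUtterance !== undefined) {
  const utterance = new env.SpeechSynthesisUtterance("Hello");
  utterance.rate = normalized.rate;
  utterance.pitch = normalized.pitch;
  utterance.volume = normalized.volume;
  env.speechSynthesis.speak(utterance);
}
```

These inputs shape how the engine speaks, but none of them redirects or exposes the resulting audio.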
Notes:
1 If you give a TTS Engine such an interface, it cannot trivially extend the existing TTS event API, since the browser validates those events: