Managing text-to-speech and speech recognition at

Posted 2020-07-26 06:04

I'd like my iOS app to use text-to-speech to read aloud information it receives from a server, and I'd also like the user to be able to stop the speech with a voice command. I have tried iOS speech recognition frameworks such as OpenEars, but the recognizer listens to and detects what the app itself is "saying", which interferes with recognition of the user's voice commands.

Has anybody dealt with this scenario on iOS and found a solution? Thanks in advance.

Tags: ios speech-recognition text-to-speech voice-recognition

1 answer

男人必须洒脱

#2 · 2020-07-26 06:41

This is not trivial to implement. Unfortunately, on iOS (and other platforms) the microphone also records the sound that is playing through the speaker. The only simple option is to use a headset; in that case speech recognition can keep listening for input during playback. In OpenEars, recognition is disabled during TTS unless a headset is plugged in.

If you still want to implement this feature, which is called "barge-in", you have to do the following:

1. Store the audio you play (the TTS output that the microphone will pick up), so you have a reference copy of it.
2. Implement a noise cancellation algorithm that removes that audio from the recording. You can use cross-correlation to find the offset of the played audio within the recording, and spectral subtraction to remove it.
3. Recognize the speech in the remaining signal.
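
To make the second step more concrete, here is a minimal sketch in plain Python of the offset search and the subtraction. It is only an illustration under assumptions, not OpenEars code and not a production echo canceller: the function names (`find_offset`, `spectral_subtract`, `remove_playback`), the frame size and the `max_lag` search window are all hypothetical, the signals are plain lists of float samples, and a naive DFT is used so the example has no dependencies (a real app would use an FFT routine).

```python
import cmath
import math


def find_offset(recording, reference, max_lag):
    """Return the lag (in samples) where `reference` best lines up with `recording`.

    Uses plain, unnormalized cross-correlation over lags 0..max_lag.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        n = min(len(reference), len(recording) - lag)
        score = sum(recording[lag + i] * reference[i] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def dft(frame):
    """Naive O(n^2) discrete Fourier transform (assumption: small frames only)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def spectral_subtract(recorded_frame, reference_frame):
    """Subtract the reference magnitude spectrum, keeping the recorded phase."""
    cleaned = []
    for rec_bin, ref_bin in zip(dft(recorded_frame), dft(reference_frame)):
        magnitude = max(abs(rec_bin) - abs(ref_bin), 0.0)  # floor at zero
        cleaned.append(cmath.rect(magnitude, cmath.phase(rec_bin)))
    return idft(cleaned)


def remove_playback(recording, reference, max_lag, frame_size=64):
    """Align `reference` inside `recording`, then subtract it frame by frame.

    Returns (offset, cleaned_recording). Samples outside the aligned frames
    are left untouched.
    """
    offset = find_offset(recording, reference, max_lag)
    output = list(recording)
    for start in range(0, len(reference) - frame_size + 1, frame_size):
        rec_start = start + offset
        if rec_start + frame_size > len(recording):
            break
        output[rec_start:rec_start + frame_size] = spectral_subtract(
            recording[rec_start:rec_start + frame_size],
            reference[start:start + frame_size],
        )
    return offset, output
```

Calling `remove_playback(recording, reference, max_lag)` returns the estimated offset and a cleaned copy of the recording that you would then pass to the recognizer. The sketch assumes the played audio reaches the microphone unchanged apart from a delay; real speaker and room acoustics distort it, which is a large part of why this is hard in practice.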

This is not possible without significant modification of the OpenEars sources.
