mp3 recognition using Sphinx 4

Posted 2019-06-10 03:45

Can we use MP3 files for voice recognition without using WAV files? Or can we generate a WAV file from an MP3 and then run the recognition without a serious impact on accuracy? The problem is that I need to minimize the load transferred over the network in my application. Will the information lost in the conversion be a major factor for accuracy?

Tags: mp3 speech-recognition cmusphinx sphinx4

1 answer

太酷不给撩

Reply #2 · 2019-06-10 04:22

Can we use mp3 files for the voice recognition process without using wav files?

Not directly. To recognize MP3 streams, you need a Java library that reads the MP3 and converts it to a PCM stream (tritonus-mp3, lameonj). You can also invoke ffmpeg as a separate process to do the decoding.
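The ffmpeg route can be sketched roughly as follows. This is a minimal sketch, not code from Sphinx 4: `Mp3Decoder`, `ffmpegCommand` and `decode` are hypothetical names, the `ffmpeg` binary is assumed to be on the PATH, and the target format (16 kHz, mono, 16-bit little-endian PCM) is an assumption that should be matched to the acoustic model you actually use.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Sphinx 4): decodes an MP3 by running ffmpeg.
final class Mp3Decoder {

    // Builds the ffmpeg command line that decodes `input` into a WAV file
    // at `output` with 16 kHz, mono, 16-bit little-endian PCM samples.
    static List<String> ffmpegCommand(String input, String output) {
        return Arrays.asList(
            "ffmpeg",
            "-y",                   // overwrite the output file if it exists
            "-i", input,            // source MP3
            "-ar", "16000",         // resample to 16 kHz (assumed model rate)
            "-ac", "1",             // downmix to mono
            "-acodec", "pcm_s16le", // 16-bit little-endian PCM
            output);
    }

    // Runs ffmpeg and waits for it to finish; assumes ffmpeg is on the PATH.
    static File decode(File mp3, File wav) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(ffmpegCommand(mp3.getPath(), wav.getPath()))
            .inheritIO() // let the ffmpeg log go to this process's console
            .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
        return wav;
    }
}
```

A call such as `Mp3Decoder.decode(new File("speech.mp3"), new File("speech.wav"))` would then produce a WAV file that can be opened as an input stream and handed to the recognizer.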

or can we generate a WAV file from an MP3 and then do the voice recognition without a serious impact on the accuracy?

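Accuracy is affected in both cases, no matter where you decode the MP3 file: the information is already lost when the audio is encoded as MP3, so decoding on the client or on the server makes no difference.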

The problem is I need to minimize the load transferred through the network in my application. Will the information which is lost in the conversion be a huge factor for accuracy?

It's better to use a lossless codec such as FLAC for the transfer, because MP3 compression degrades ASR accuracy. Another approach is to compute the acoustic features on the client and transfer those to the server instead of the audio.
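To judge how much any codec saves on the network, it helps to know the uncompressed PCM rate first. The sketch below only does that arithmetic; `PcmBandwidth` is a hypothetical helper, and the 16 kHz, 16-bit, mono figures in the usage note are an assumption, since the question does not state the recording format.

```java
// Hypothetical helper: computes the data rate of uncompressed PCM audio.
final class PcmBandwidth {

    // Bytes per second = samples per second * bytes per sample * channels.
    static int bytesPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    // The same rate in kilobits per second, the unit codec bitrates are usually quoted in.
    static double kilobitsPerSecond(int sampleRateHz, int bitsPerSample, int channels) {
        return bytesPerSecond(sampleRateHz, bitsPerSample, channels) * 8 / 1000.0;
    }
}
```

For example, `PcmBandwidth.bytesPerSecond(16000, 16, 1)` gives 32000 bytes per second, that is 256 kbit/s, which is the baseline any lossless or lossy encoding has to be compared against.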
