Can we use MP3 files for voice recognition instead of WAV files? Or can we generate a WAV file from an MP3 and then run voice recognition on it without a serious impact on accuracy? The problem is that I need to minimize the amount of data transferred over the network in my application. Will the information lost in the conversion be a major factor for accuracy?
Not directly. To recognize MP3 streams, you need a Java library that reads the MP3 and converts it to a PCM stream (for example tritonus-mp3 or lameonj). You can also invoke ffmpeg as a separate process to do the decoding.
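Accuracy is affected in both cases, no matter where you decode the MP3 file: the information is lost when the audio is encoded to MP3, and decoding it back to PCM or WAV does not restore it.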
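For the ffmpeg route, a minimal sketch could look like the following. It assumes `ffmpeg` is installed and on the PATH, and that the recognizer wants 16 kHz mono 16-bit PCM in a WAV container (a common choice, but check what your recognizer actually expects). The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.util.List;

public class Mp3ToWav {

    /**
     * Builds the ffmpeg argument list for decoding an MP3 to a mono,
     * 16-bit PCM WAV file at the given sample rate.
     * Assumption: the "ffmpeg" binary is on the PATH.
     */
    public static List<String> ffmpegCommand(String mp3Path, String wavPath, int sampleRateHz) {
        return List.of(
                "ffmpeg", "-y",                          // overwrite the output if it exists
                "-i", mp3Path,                           // input file
                "-ac", "1",                              // downmix to one channel
                "-ar", Integer.toString(sampleRateHz),   // resample
                "-c:a", "pcm_s16le",                     // 16-bit little-endian PCM
                wavPath);
    }

    /** Runs ffmpeg as a separate process and fails if it exits with a non-zero code. */
    public static void decode(String mp3Path, String wavPath, int sampleRateHz)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(ffmpegCommand(mp3Path, wavPath, sampleRateHz))
                .inheritIO()   // forward ffmpeg's console output to this process
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("ffmpeg exited with code " + exitCode);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java Mp3ToWav <input.mp3> <output.wav>");
            return;
        }
        decode(args[0], args[1], 16000);   // 16 kHz is an assumed target rate
    }
}
```

This only changes the container and sample format so the recognizer can read the audio. It does not undo whatever the MP3 encoder already threw away.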
It's better to use a lossless codec such as FLAC for the transfer, because MP3 conversion degrades ASR accuracy. Another approach is to compute the features on the client and transfer those to the server instead of the audio.
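To get a rough feel for why sending features can save bandwidth, here is a back-of-the-envelope calculation. The numbers are assumptions picked for illustration, not something your recognizer necessarily uses: 16 kHz mono 16-bit PCM on the audio side, and 13 coefficients per frame, 100 frames per second, stored as 32-bit floats on the feature side.

```java
public class FeatureBandwidth {

    /** Bytes per second of uncompressed PCM audio. */
    public static long pcmBytesPerSecond(int sampleRateHz, int bytesPerSample, int channels) {
        return (long) sampleRateHz * bytesPerSample * channels;
    }

    /** Bytes per second of a feature stream with a fixed number of coefficients per frame. */
    public static long featureBytesPerSecond(int coeffsPerFrame, int framesPerSecond, int bytesPerCoeff) {
        return (long) coeffsPerFrame * framesPerSecond * bytesPerCoeff;
    }

    public static void main(String[] args) {
        // Assumed audio format: 16 kHz, 16-bit (2 bytes), mono.
        long pcm = pcmBytesPerSecond(16000, 2, 1);
        // Assumed feature layout: 13 coefficients, 100 frames/s, 4-byte floats.
        long features = featureBytesPerSecond(13, 100, 4);

        System.out.println("PCM:      " + pcm + " bytes/s");       // 32000 bytes/s
        System.out.println("Features: " + features + " bytes/s");  // 5200 bytes/s
    }
}
```

Under these assumptions the feature stream is about a sixth of the raw PCM size. The catch is that the client then has to compute exactly the features the server-side recognizer was trained on.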