finding speed and tone of speech in an audio using

Posted 2019-08-23 09:31

Given an audio file, I want to calculate the pace of the speech, i.e. how fast or slow it is.

Currently I am doing the following:
- convert the speech to text to obtain a transcript (using a free tool).
- count the number of words in the transcript.
- calculate the duration of the file.
- finally, pace = (number of words in transcript / duration of file).
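The steps above can be sketched in a few lines of Python. This is only an illustration: the helper names `wav_duration_seconds` and `words_per_minute` are made up here, the input is assumed to be an uncompressed WAV file, and the transcript is assumed to come from whatever speech-to-text tool is in use.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds (frames / frame rate)."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def words_per_minute(transcript, duration_seconds):
    """Pace = number of words / duration, scaled to words per minute."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # Whitespace splitting is a simplification; punctuation stays attached to words.
    return len(transcript.split()) / duration_seconds * 60.0
```

For example, a four-word transcript over a two-second clip gives `words_per_minute("one two three four", 2.0)`, which is 120 words per minute.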

However, the accuracy of the pace obtained depends entirely on the transcription, which I think is an unnecessary step.

Is there any Python library, SoX, or ffmpeg approach that will enable me to calculate, in a straightforward way:

- the speed/pace of talk in an audio file
- the dominant pitches/tones of that audio?

I referred to http://sox.sourceforge.net/sox.html and https://digitalcardboard.com/blog/2009/08/25/the-sox-of-silence/

Tags: python audio ffmpeg sox

1 answer

唯我独甜

#2 · 2019-08-23 10:09

Your method sounds interesting as a quick first-order approximation, but it is limited by the resolution of the transcript. You can analyze the audio file directly.
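One way to work on the audio directly, without a transcript, is to count bursts of energy per second as a rough stand-in for syllables. The sketch below is an assumption-laden illustration, not an established speech-rate algorithm: the function names, the 20 ms frame, the 10 ms hop, and the threshold of half the peak energy are all arbitrary choices, and real speech would normally need smoothing and noise handling on top.

```python
import math


def short_time_energy(samples, frame_len, hop):
    """RMS energy of each frame of `frame_len` samples, stepping by `hop`."""
    return [
        math.sqrt(sum(s * s for s in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]


def count_energy_bursts(energy, threshold_ratio=0.5):
    """Count upward crossings of a threshold set relative to the peak energy."""
    if not energy:
        return 0
    threshold = threshold_ratio * max(energy)  # assumed heuristic threshold
    bursts, above = 0, False
    for e in energy:
        if e >= threshold and not above:
            bursts += 1
        above = e >= threshold
    return bursts


def bursts_per_second(samples, sample_rate, frame_ms=20, hop_ms=10):
    """Rough pace proxy: energy bursts divided by the clip duration."""
    if not samples:
        raise ValueError("no samples")
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hop = max(1, int(sample_rate * hop_ms / 1000))
    energy = short_time_energy(samples, frame_len, hop)
    return count_energy_bursts(energy) / (len(samples) / sample_rate)
```

Here `samples` is assumed to be a list of mono floating-point samples; how they are decoded from the file is left open.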

I'm not familiar with SoX, but from its manual it seems that the stat effect gives "... time and frequency domain statistical information about the audio".

SoX claims to be a "Swiss Army knife of audio manipulation", and from skimming its docs it seems it might suit you for finding the general tempo.
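If SoX is installed, the stat effect can be driven from Python along these lines. This is a sketch under assumptions: `sox <file> -n stat` writes its report to stderr as "name: value" lines, and the helper names `parse_stat_output` and `sox_stat` are invented for this example. Which of the reported statistics are actually useful for pace is not something the manual promises.

```python
import subprocess


def parse_stat_output(text):
    """Turn 'name: value' lines into a dict, keeping only numeric values."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            # Collapse runs of spaces inside the name before using it as a key.
            stats[" ".join(name.split())] = float(value)
        except ValueError:
            continue  # value was not a number
    return stats


def sox_stat(path):
    """Run `sox <path> -n stat` and parse the report SoX prints to stderr."""
    result = subprocess.run(
        ["sox", path, "-n", "stat"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_stat_output(result.stderr)
```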

If you want to run pitch analysis too, you can develop your own algorithm in Python. I recently used librosa and found it very useful and well documented.
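As an idea of what a home-grown pitch algorithm could look like, here is a minimal autocorrelation sketch in plain Python. It is not librosa's method (librosa ships its own pitch trackers such as `librosa.yin` and `librosa.pyin`); the function name `estimate_pitch_hz` and the 75 to 400 Hz search range are assumptions chosen to cover typical speaking voices.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame via autocorrelation.

    Returns None when no lag in the search range has positive correlation
    (for example, on silence).
    """
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame with itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None
```

Running this over successive short frames and taking, say, the median of the non-`None` results would give one candidate for the "dominant" pitch of a recording; that aggregation step is a design choice left open here.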
