Given an audio , I want to calculate the pace of the speech. i.e how fast or slow is it.
Currently I am doing the following:
- convert speech to text and obtaining a transcript (using a free tool).
- count number of words in transcript.
- calculate length or duration of file.
- finally, pace = (number of words in transcript / duration of file)
.
However the accuracy of the pace obtained is dependent purely on transcription , which I think is an unnecessary step.
Is there any python-library/sox/ffmpeg way that will enable me to
- to calculate, in a straightforward way,the speed/pace of talk in an audio
- dominant Pitches/tones of that audio?
I referred : I referred : http://sox.sourceforge.net/sox.html and https://digitalcardboard.com/blog/2009/08/25/the-sox-of-silence/