Change huge amount of data from NIST to RIFF wav f

Posted 2019-05-15 23:13

Question:

I am writing a speech recognition program, and for that I downloaded 400MB of data from TIMIT. I then tried to read the wav files with two different libraries, as follows:

import scipy.io.wavfile as wavfile
import wave

(fs, x) = wavfile.read('../data/TIMIT/TRAIN/DR1/FCJF0/SA1.WAV')
w = wave.open('../data/TIMIT/TRAIN/DR1/FCJF0/SA1.WAV')

Both calls fail for the same reason: the wav file header says 'NIST', but it must be in 'RIFF' format. (I also read something about sph, but the NIST files I downloaded are .wav, not .sph.)
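The mismatch can be confirmed by looking at the first bytes of a file: a NIST SPHERE file begins with the ASCII tag NIST_1A, while a real WAV file begins with RIFF. A minimal sketch of that check follows; wav_container is an illustrative helper name, not part of scipy or wave.

```python
def wav_container(path):
    """Return 'NIST', 'RIFF', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(b"NIST_1A"):
        return "NIST"   # SPHERE header, not readable by scipy/wave
    if head.startswith(b"RIFF"):
        return "RIFF"   # regular WAV container
    return "unknown"
```

Calling wav_container on one of the TIMIT files, for example '../data/TIMIT/TRAIN/DR1/FCJF0/SA1.WAV', should tell which kind of header it carries before any conversion is attempted.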

I then downloaded SoX from http://sox.sourceforge.net/ and added its path to my environment variables so that cmd recognizes sox. But I can't figure out how to use it correctly.

What I need now is a script or something similar that makes sox convert EVERY wav file from NIST to RIFF under a given folder and its subfolders.

EDIT: In "reading a WAV file from TIMIT database in python" I found an answer that worked for me: running sph2pipe -f wav input.wav output.wav. What I need is a script or something similar that searches a folder and all of its subfolders for .wav files and applies that command to each one.

Answer 1:

Since forfiles is a Windows command, here is a solution for Unix. Just cd to the top-level folder and type:

find . -name '*.WAV' | parallel -P20 sox {} '{.}.wav'

You need to have parallel and sox installed; on a Mac you can get both via brew install. Hope this helps.



Answer 2:

OK, I finally got it. Go to the top-level folder and run this command:

forfiles /s /m *.wav /c "cmd /c sph2pipe -f wav @file @fnameRIFF.wav"

This command searches for every wav file under the folder and writes a copy named <name>RIFF.wav next to it that the Python libraries can read. Hope it helps!
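The same walk can also be done from Python, which avoids depending on forfiles or find. This is only a rough sketch under a few assumptions: sph2pipe is on the PATH, the output naming copies the <name>RIFF.wav pattern used above, and plan_conversions and convert_all are made-up names for illustration. Files that do not start with the NIST_1A tag are skipped, so running it twice should not convert the converted copies again.

```python
import subprocess
from pathlib import Path

NIST_MAGIC = b"NIST_1A"  # first bytes of a NIST SPHERE header

def plan_conversions(root):
    """Return (source, target) pairs for every NIST-headed .wav file under root."""
    pairs = []
    for src in sorted(Path(root).rglob("*")):
        # Match .wav and .WAV alike; TIMIT uses upper-case extensions.
        if not src.is_file() or src.suffix.lower() != ".wav":
            continue
        with open(src, "rb") as f:
            if f.read(len(NIST_MAGIC)) != NIST_MAGIC:
                continue  # already RIFF (or something else); leave it alone
        pairs.append((src, src.with_name(src.stem + "RIFF.wav")))
    return pairs

def convert_all(root):
    """Run sph2pipe -f wav on each planned pair (assumes sph2pipe is installed)."""
    for src, dst in plan_conversions(root):
        subprocess.run(["sph2pipe", "-f", "wav", str(src), str(dst)], check=True)
```

Calling convert_all('../data/TIMIT') would then process the whole tree; plan_conversions can be called on its own first to see which files would be touched.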



Tags: audio wav sox