How to chop wav file into 10ms data

2019-09-14 23:38发布

问题:

I am trying to divide the data I retrieve from a wav into 10ms segments for dynamic time warping.

    import wave
    import contextlib

    data = np.zeros((1, 7000))
    rate, wav_data = wavfile.read(file_path)
    with contextlib.closing(wave.open(file_path, 'r')) as f:
        frames = f.getnframes()
        rate = f.getframerate()
        duration = frames / float(rate)

Is there any existing library that do that

Thanks

回答1:

If you're interested in post-processing the data, you'll probably be working with it as numpy data.

>>> import wave
>>> import numpy as np
>>> f = wave.open('911.wav', 'r')
>>> data = f.readframes(f.getnframes())
>>> data[:10]  # just to show it is a string of bytes
'"5AMj\x88\x97\xa6\xc0\xc9'
>>> numeric_data = np.fromstring(data, dtype=np.uint8)
>>> numeric_data
array([ 34,  53,  65, ..., 128, 128, 128], dtype=uint8)
>>> 10e-3*f.getframerate()  # how many frames per 10ms?
110.25

That's not an integer number, so unless you're going to interpolate your data, you'll need to pad your data with zeros to get nice 110 frames long samples (which are about 10ms at this framerate).

>>> numeric_data.shape, f.getnframes()  # there are just as many samples in the numpy array as there were frames
((186816,), 186816)
>>> padding_length = 110 - numeric_data.shape[0]%110 
>>> padded = np.hstack((numeric_data, np.zeros(padding_length)))
>>> segments = padded.reshape(-1, 110)
>>> segments
array([[  34.,   53.,   65., ...,  216.,  222.,  228.],
       [ 230.,  227.,  224., ...,   72.,   61.,   45.],
       [  34.,   33.,   32., ...,  147.,  158.,  176.],
       ..., 
       [ 128.,  128.,  128., ...,  128.,  128.,  128.],
       [ 127.,  128.,  128., ...,  128.,  129.,  129.],
       [ 129.,  129.,  128., ...,    0.,    0.,    0.]])
>>> segments.shape
(1699, 110)

So now, every row of the segments array is about 10ms long.