How to chop a wav file into 10 ms segments

Posted 2019-09-14 23:48

I am trying to divide the data I retrieve from a wav into 10ms segments for dynamic time warping.

    import contextlib
    import wave

    import numpy as np
    from scipy.io import wavfile

    data = np.zeros((1, 7000))
    rate, wav_data = wavfile.read(file_path)
    with contextlib.closing(wave.open(file_path, 'r')) as f:
        frames = f.getnframes()
        rate = f.getframerate()
        duration = frames / float(rate)

Is there an existing library that does this?

Thanks

1 answer
淡お忘
#2 · 2019-09-14 23:59

If you're interested in post-processing the data, you'll probably be working with it as numpy data.

>>> import wave
>>> import numpy as np
>>> f = wave.open('911.wav', 'r')
>>> data = f.readframes(f.getnframes())
>>> data[:10]  # just to show it is a string of bytes
'"5AMj\x88\x97\xa6\xc0\xc9'
>>> numeric_data = np.frombuffer(data, dtype=np.uint8)  # np.fromstring is deprecated for binary data
>>> numeric_data
array([ 34,  53,  65, ..., 128, 128, 128], dtype=uint8)
>>> 10e-3*f.getframerate()  # how many frames per 10ms?
110.25

That's not an integer, so unless you're going to interpolate your data, you'll need to round down to segments of 110 frames (about 10 ms at this frame rate) and pad the end of the data with zeros so that its length is a multiple of 110.

>>> numeric_data.shape, f.getnframes()  # there are just as many samples in the numpy array as there were frames
((186816,), 186816)
>>> padding_length = -numeric_data.shape[0] % 110  # 0 if the length is already a multiple of 110
>>> padded = np.hstack((numeric_data, np.zeros(padding_length)))
>>> segments = padded.reshape(-1, 110)
>>> segments
array([[  34.,   53.,   65., ...,  216.,  222.,  228.],
       [ 230.,  227.,  224., ...,   72.,   61.,   45.],
       [  34.,   33.,   32., ...,  147.,  158.,  176.],
       ..., 
       [ 128.,  128.,  128., ...,  128.,  128.,  128.],
       [ 127.,  128.,  128., ...,  128.,  129.,  129.],
       [ 129.,  129.,  128., ...,    0.,    0.,    0.]])
>>> segments.shape
(1699, 110)

So now, every row of the segments array is about 10ms long.
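The steps above can be wrapped in a small reusable function. This is only a sketch: the name `split_into_segments` and its parameters are hypothetical and do not come from any library. It assumes a 1-D array of mono samples. Unlike the session above, it pads with a value of the same dtype as the input, so the result stays `uint8` rather than being promoted to float. For unsigned 8-bit PCM, silence sits at 128 rather than 0, so passing `pad_value=128` may be preferable in that case.

```python
import numpy as np


def split_into_segments(samples, frame_rate, segment_ms=10.0, pad_value=0):
    """Split a 1-D array of samples into rows of roughly segment_ms each.

    The segment length is rounded down to a whole number of samples, and
    the tail is padded with pad_value so the length divides evenly.
    """
    samples = np.asarray(samples)
    seg_len = int(frame_rate * segment_ms / 1000.0)  # e.g. 11025 Hz -> 110
    if seg_len < 1:
        raise ValueError("segment is shorter than one sample")
    # (-n) % seg_len is 0 when n is already a multiple of seg_len
    pad = (-samples.shape[0]) % seg_len
    padding = np.full(pad, pad_value, dtype=samples.dtype)
    return np.concatenate((samples, padding)).reshape(-1, seg_len)


# Synthetic stand-in for 8-bit mono audio at 11025 Hz (same length as above)
fake_audio = np.full(186816, 128, dtype=np.uint8)
segments = split_into_segments(fake_audio, 11025, pad_value=128)
print(segments.shape, segments.dtype)  # (1699, 110) uint8
```

With a real file, `samples` would be the array obtained from `f.readframes(...)` as shown earlier, and `frame_rate` would be `f.getframerate()`. Stereo or 16-bit files would need a different dtype and a reshape by channel before segmenting.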
