What does the data returned by scipy.io.wavfile.re

Posted 2019-07-18 00:31

The documentation of scipy.io.wavfile.read says that it returns the sample rate and data. But what does data actually mean here in the case of .wav files?

Can anyone explain in layman's terms how that data is prepared?

PS. I read somewhere that it means amplitude. Is that correct? If yes, how is that amplitude calculated and returned by scipy.io.wavfile.read?

Tags: python scipy wav
1 answer
贪生不怕死
#2 · 2019-07-18 01:02

scipy.io.wavfile.read is a convenience wrapper that splits a .wav file into its header information and the sample data contained in the file.

From the docstring in the source code:

Returns
-------
rate : int
    Sample rate of wav file.
data : numpy array
    Data read from wav file.  Data-type is determined from the file;
    see Notes.

Simplified code from the source:

def read(filename, mmap=False):
    # Note: the _read_* helpers are private functions inside scipy.io.wavfile
    if hasattr(filename, 'read'):  # an already-open file-like object was passed
        fid = filename
        mmap = False
    else:
        fid = open(filename, 'rb')

    try:
        file_size, is_big_endian = _read_riff_chunk(fid)  # RIFF header: file size and byte order
        channels = 1  # assume 1 channel and 8 bit depth if there is no format chunk
        bit_depth = 8
        while fid.tell() < file_size:  # walk through the file one chunk at a time
            # read the 4-byte ID of the next chunk
            chunk_id = fid.read(4)

            if chunk_id == b'fmt ':  # format chunk: sample rate, channels, bit depth
                fmt_chunk = _read_fmt_chunk(fid, is_big_endian)
                format_tag, channels, fs = fmt_chunk[1:4]
                bit_depth = fmt_chunk[6]
                if bit_depth not in (8, 16, 32, 64, 96, 128):
                    raise ValueError("Unsupported bit depth: the wav file "
                                     "has {}-bit data.".format(bit_depth))
            elif chunk_id == b'data':  # data chunk: the stored samples, returned as a numpy array
                data = _read_data_chunk(fid, format_tag, channels, bit_depth,
                                        is_big_endian, mmap)

    finally:
        if not hasattr(filename, 'read'):
            fid.close()  # we opened the file, so we close it
        else:
            fid.seek(0)  # leave a caller-supplied file object rewound

    return fs, data
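
To make the chunk walk above concrete, here is a minimal sketch that does the same thing by hand with only the standard library. It is not scipy's implementation: the function name read_wav_bytes is made up for illustration, and it assumes a little-endian RIFF file with plain 16-bit PCM data. The test file is built in memory with the standard wave module.

```python
import io
import struct
import wave


def read_wav_bytes(raw):
    """Hypothetical helper: return ((format_tag, channels, fs, bit_depth), data_bytes)."""
    riff, _total_size, wave_id = struct.unpack('<4sI4s', raw[:12])
    if riff != b'RIFF' or wave_id != b'WAVE':
        raise ValueError('not a little-endian RIFF/WAVE file')

    fmt = None
    data = None
    pos = 12
    while pos + 8 <= len(raw):
        # Every chunk starts with a 4-byte ID and a 4-byte body size.
        chunk_id, size = struct.unpack('<4sI', raw[pos:pos + 8])
        body = raw[pos + 8:pos + 8 + size]
        if chunk_id == b'fmt ':
            # The first 16 bytes of the format chunk describe the audio layout.
            (format_tag, channels, fs,
             _byte_rate, _block_align, bit_depth) = struct.unpack('<HHIIHH', body[:16])
            fmt = (format_tag, channels, fs, bit_depth)
        elif chunk_id == b'data':
            data = body  # the stored samples, untouched
        pos += 8 + size + (size & 1)  # chunk bodies are padded to an even length

    if fmt is None or data is None:
        raise ValueError('missing fmt or data chunk')
    return fmt, data


# Build a tiny mono, 16-bit, 8000 Hz file in memory.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(8000)
    w.writeframes(struct.pack('<4h', 0, 1000, -1000, 0))

fmt, data = read_wav_bytes(buf.getvalue())
samples = struct.unpack('<4h', data)  # decode the bytes as signed 16-bit integers
print(fmt, samples)
```

The decoded samples are exactly the integers that were written. Nothing is computed along the way; the bytes of the data chunk are only reinterpreted as numbers.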

The data itself is usually PCM (pulse-code modulation): sound pressure levels sampled at regular intervals and stored as successive frames, where each frame holds one sample per channel. So the values are indeed amplitudes. The sampling rate returned by scipy.io.wavfile.read tells you how many frames represent one second.
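
As a rough illustration of frames, channels and the sampling rate, the sketch below writes a two-second stereo file in memory and reads it back. It uses the standard-library wave module as a stand-in for scipy. The constants RATE and N_FRAMES and the sample values are arbitrary assumptions chosen for the example.

```python
import io
import struct
import wave

RATE = 8000        # frames per second (assumed for the example)
N_FRAMES = 16000   # two seconds of audio at that rate

# Write a stereo, 16-bit file where every frame is (left=100, right=-100).
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(2)
    w.setsampwidth(2)
    w.setframerate(RATE)
    w.writeframes(struct.pack('<2h', 100, -100) * N_FRAMES)

# Read it back and pull out the header values and the raw frames.
buf.seek(0)
with wave.open(buf, 'rb') as r:
    rate = r.getframerate()
    n_channels = r.getnchannels()
    n_frames = r.getnframes()
    raw = r.readframes(n_frames)

# Frames are interleaved: left, right, left, right, ...
samples = struct.unpack('<%dh' % (n_frames * n_channels), raw)
left = samples[0::n_channels]
right = samples[1::n_channels]

duration = n_frames / rate  # seconds
print(rate, n_channels, n_frames, duration)
```

With scipy the same information comes from rate, data = scipy.io.wavfile.read(path): for a stereo file data has shape (n_frames, 2), so the duration in seconds is data.shape[0] / rate.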

A good explanation of the .wav format is offered by this question.

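scipy doesn't calculate much on its own: the amplitudes were measured and quantized when the audio was recorded, and scipy.io.wavfile.read just loads them into a numpy array.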
