-->

Python - downsampling wav audio file

2019-02-03 19:38发布

问题:

I have to downsample a wav file from 44100Hz to 16000Hz without using any external python libraries, so preferably wave and/or audioop. I tried just changing the wav files framerate to 16000 by using setframerate function but that just slows down the entire recording. How can I just downsample the audio file to 16kHz and maintain the same length of the audio?

Thank you very much in advance

回答1:

You can use Librosa's load() function,

import librosa    
y, s = librosa.load('test.wav', sr=8000) # Downsample 44.1kHz to 8kHz

The extra effort to install Librosa is probably worth the peace of mind.

Pro-tip: when installing Librosa on Anaconda, you need to install ffmpeg as well, so

pip install librosa
conda install -c conda-forge ffmpeg

This saves you the NoBackendError() error.



回答2:

Thank you all for your answers. I found a solution already and it works very nice. Here is the whole function.

def downsampleWav(src, dst, inrate=44100, outrate=16000, inchannels=2, outchannels=1):
    if not os.path.exists(src):
        print 'Source not found!'
        return False

    if not os.path.exists(os.path.dirname(dst)):
        os.makedirs(os.path.dirname(dst))

    try:
        s_read = wave.open(src, 'r')
        s_write = wave.open(dst, 'w')
    except:
        print 'Failed to open files!'
        return False

    n_frames = s_read.getnframes()
    data = s_read.readframes(n_frames)

    try:
        converted = audioop.ratecv(data, 2, inchannels, inrate, outrate, None)
        if outchannels == 1:
            converted = audioop.tomono(converted[0], 2, 1, 0)
    except:
        print 'Failed to downsample wav'
        return False

    try:
        s_write.setparams((outchannels, 2, outrate, 0, 'NONE', 'Uncompressed'))
        s_write.writeframes(converted)
    except:
        print 'Failed to write wav'
        return False

    try:
        s_read.close()
        s_write.close()
    except:
        print 'Failed to close wav files'
        return False

    return True


回答3:

You can use resample in scipy. It's a bit of a headache to do, because there's some type conversion to be done between the bytestring native to python and the arrays needed in scipy. There's another headache, because in the wave module in Python, there is no way to tell if the data is signed or not (only if it's 8 or 16 bits). It might (should) work for both, but I haven't tested it.

Here's a small program which converts (unsigned) 8 and 16 bits mono from 44.1 to 16. If you have stereo, or use other formats, it shouldn't be that difficult to adapt. Edit the input/output names at the start of the code. Never got around to use the command line arguments.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
#  downsample.py
#  
#  Copyright 2015 John Coppens <john@jcoppens.com>
#  
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#  
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#  
#  You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
#  Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
#  MA 02110-1301, USA.
#  
#

inwave = "sine_44k.wav"
outwave = "sine_16k.wav"

import wave
import numpy as np
import scipy.signal as sps

class DownSample():
    def __init__(self):
        self.in_rate = 44100.0
        self.out_rate = 16000.0

    def open_file(self, fname):
        try:
            self.in_wav = wave.open(fname)
        except:
            print("Cannot open wav file (%s)" % fname)
            return False

        if self.in_wav.getframerate() != self.in_rate:
            print("Frame rate is not %d (it's %d)" % \
                  (self.in_rate, self.in_wav.getframerate()))
            return False

        self.in_nframes = self.in_wav.getnframes()
        print("Frames: %d" % self.in_wav.getnframes())

        if self.in_wav.getsampwidth() == 1:
            self.nptype = np.uint8
        elif self.in_wav.getsampwidth() == 2:
            self.nptype = np.uint16

        return True

    def resample(self, fname):
        self.out_wav = wave.open(fname, "w")
        self.out_wav.setframerate(self.out_rate)
        self.out_wav.setnchannels(self.in_wav.getnchannels())
        self.out_wav.setsampwidth (self.in_wav.getsampwidth())
        self.out_wav.setnframes(1)

        print("Nr output channels: %d" % self.out_wav.getnchannels())

        audio = self.in_wav.readframes(self.in_nframes)
        nroutsamples = round(len(audio) * self.out_rate/self.in_rate)
        print("Nr output samples: %d" %  nroutsamples)

        audio_out = sps.resample(np.fromstring(audio, self.nptype), nroutsamples)
        audio_out = audio_out.astype(self.nptype)

        self.out_wav.writeframes(audio_out.copy(order='C'))

        self.out_wav.close()

def main():
    ds = DownSample()
    if not ds.open_file(inwave): return 1
    ds.resample(outwave)
    return 0

if __name__ == '__main__':
    main()