I'm having a bit of programming and conversion trouble. I'm designing an AI to recognize notes played by instruments and need to extract the raw sound data from a WAV file. My objective is to perform an FFT over chunks of time in the file for use by the AI. For this I need a list of amplitudes from the audio file, but I can't seem to find a conversion technique that works. The files start as MP3s and I then convert them to WAV files, but I always end up with a compressed file that spits out gibberish when I try to read it. Does anyone know how I might convert the WAV file to something compatible with Python's wave module, or even something that would directly convert the data into an amplitude list?
Question:
Answer 1:
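Python's standard wave module is fairly limited: it only handles uncompressed PCM data and hands the samples back as raw bytes. You might try the WAV reader included in scipy (scipy.io.wavfile) as an alternative.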
Check out: Reading *.wav files in Python
If you're going to do any numerical heavy lifting with the audio, scipy might be your best option anyway.
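As a rough sketch of what that could look like, the snippet below reads a WAV file with scipy.io.wavfile.read and computes FFT magnitudes over fixed-size chunks. The helper names (chunked_spectra, load_spectra) and the chunk size of 4096 samples are illustrative choices, not anything from the question, and the code assumes the WAV file holds uncompressed PCM or float samples; a compressed WAV would still need to be re-exported first.

```python
import numpy as np
from scipy.io import wavfile


def chunked_spectra(samples, chunk_size=4096):
    """Split samples into equal chunks and return FFT magnitudes per chunk.

    Hypothetical helper for illustration; chunk_size is an arbitrary default.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim == 2:
        # Stereo (or multichannel) input: average the channels down to mono.
        samples = samples.mean(axis=1)
    n_chunks = len(samples) // chunk_size
    # Drop the trailing partial chunk so every row has the same length.
    chunks = samples[: n_chunks * chunk_size].reshape(n_chunks, chunk_size)
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(chunks, axis=1))


def load_spectra(path, chunk_size=4096):
    """Read a WAV file and return (frequencies in Hz, magnitudes per chunk)."""
    rate, samples = wavfile.read(path)
    freqs = np.fft.rfftfreq(chunk_size, d=1.0 / rate)
    return freqs, chunked_spectra(samples, chunk_size)
```

Each row of the returned array corresponds to one chunk of time, and the index of the largest value in a row, looked up in freqs, gives the dominant frequency of that chunk.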
Answer 2:
I believe Python can read .dat files. You can use SoX to turn MP3s, WAVs, or other audio files into .dat files, which are simply a text list of "time - left amplitude - right amplitude".
The command is simply: sox soundfile.mp3 soundfile.dat
http://sox.sourceforge.net/
SoX is a command-line tool. I run it in Terminal on my Mac, but anything that understands Bash or Linux commands should work, depending on your platform.
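A .dat file like that could be loaded into amplitude lists with a few lines of plain Python. This is only a sketch: read_sox_dat is a made-up helper name, and it assumes the layout described above (a time column followed by one amplitude column per channel), with any header lines starting with ";", which is worth checking against an actual file.

```python
def read_sox_dat(path):
    """Parse a SoX-style .dat text file into (times, channels).

    Hypothetical helper: assumes whitespace-separated columns of
    time followed by one amplitude per channel, and ';' header lines.
    """
    times = []
    channels = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the ';' header/comment lines.
            if not line or line.startswith(";"):
                continue
            fields = [float(value) for value in line.split()]
            times.append(fields[0])
            # Create one list per channel when the first data row is seen.
            if not channels:
                channels = [[] for _ in fields[1:]]
            for channel, amplitude in zip(channels, fields[1:]):
                channel.append(amplitude)
    return times, channels
```

For a stereo file, channels[0] would be the left amplitudes and channels[1] the right ones, each ready to be cut into chunks for an FFT.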
Hope that helps!
You might want to look at Pure Data too; it has some nice FFT tools built into an intuitive graphical programming language.