I'm working on a music classification methodology with Scikit-learn, and the first step in that process is converting a music file to a numpy array.
After unsuccessfully trying to call ffmpeg from a python script, I decided to simply pipe the file in directly:
FFMPEG_BIN = "ffmpeg"
cwd = (os.getcwd())
dcwd = (cwd + "/temp")
if not os.path.exists(dcwd): os.makedirs(dcwd)
folder_path = sys.argv[1]
f = open("test.txt","a")
for f in glob.glob(os.path.join(folder_path, "*.mp3")):
ff = f.replace("./", "/")
print("Name: " + ff)
aa = (cwd + ff)
command = [ FFMPEG_BIN,
'-i', aa,
'-f', 's16le',
'-acodec', 'pcm_s16le',
'-ar', '22000', # ouput will have 44100 Hz
'-ac', '1', # stereo (set to '1' for mono)
'-']
pipe = sp.Popen(command, stdout=sp.PIPE, bufsize=10**8)
raw_audio = pipe.proc.stdout.read(88200*4)
audio_array = numpy.fromstring(raw_audio, dtype="int16")
print (str(audio_array))
f.write(audio_array + "\n")
The problem is, when I run the file, it starts ffmpeg and then does nothing:
[mp3 @ 0x1446540] Estimating duration from bitrate, this may be inaccurate
Input #0, mp3, from '/home/don/Code/Projects/MC/Music/Spaz.mp3':
Metadata:
title : Spaz
album : Seeing souns
artist : N*E*R*D
genre : Hip-Hop
encoder : Audiograbber 1.83.01, LAME dll 3.96, 320 Kbit/s, Joint Stereo, Normal quality
track : 5/12
date : 2008
Duration: 00:03:50.58, start: 0.000000, bitrate: 320 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, stereo, s16p, 320 kb/s
Output #0, s16le, to 'pipe:':
Metadata:
title : Spaz
album : Seeing souns
artist : N*E*R*D
genre : Hip-Hop
date : 2008
track : 5/12
encoder : Lavf56.4.101
Stream #0:0: Audio: pcm_s16le, 22000 Hz, mono, s16, 352 kb/s
Metadata:
encoder : Lavc56.1.100 pcm_s16le
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
It just sits there, hanging, for far longer than the song is. What am I doing wrong here?,