I'm trying to make a simple function to wrap around FFProbe, and most of the data can be retrieved correctly.
The problem is that when I actually print the strings to the command line, in both Windows Command Prompt and Git Bash for Windows, the output appears mangled and out of order.
Some songs (specifically the file Imagine Dragons - Hit Parade_ Best of the Dance Music Charts\80 - Beazz - Lime (Extended Mix).flac) are missing metadata. I don't know why, but the dictionary the function below returns is empty.
FFProbe outputs its results to stderr, which can be piped to subprocess.PIPE, decoded, and parsed. I chose regex for the parsing bit.
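To illustrate just the capture-and-decode step in isolation, here is a minimal sketch. It does not call ffprobe at all: `CHILD_CMD` is a made-up stand-in (a child Python process that writes one metadata-like line to stderr), and `capture_stderr` is a hypothetical helper name, so this only shows the general mechanism rather than what ffprobe itself emits.

```python
import subprocess
import sys

# Stand-in for ffprobe (an assumption for illustration): a child Python
# process that writes one metadata-like line to stderr and nothing to stdout.
CHILD_CMD = [sys.executable, "-c", "import sys; sys.stderr.write('TITLE : Lime\\n')"]


def capture_stderr(cmd):
    """Run cmd without a shell and return its stderr decoded as UTF-8 text."""
    result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.stderr.decode("utf-8", errors="replace")


captured = capture_stderr(CHILD_CMD)
# repr() shows the exact characters captured, including the line ending.
print(repr(captured))
```

Passing the command as a list without `shell=True` avoids any quoting of file names that contain spaces or parentheses.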
Below is a slimmed-down version of my code; for the output, take a look at the GitHub gist.
#! /usr/bin/env python3
# -*- coding: utf-8 -*-
import os
from glob import glob
from re import findall, MULTILINE
from subprocess import Popen, PIPE


def glob_from(path, ext):
    """Return glob from a directory."""
    working_dir = os.getcwd()
    os.chdir(path)
    file_paths = glob("**/*." + ext)
    os.chdir(working_dir)
    return file_paths


def media_metadata(file_path):
    """Use FFPROBE to get information about a media file."""
    stderr = Popen(("ffprobe", file_path), shell=True, stderr=PIPE).communicate()[1].decode()
    metadata = {}
    for match in findall(r"(\w+)\s+:\s(.+)$", stderr, MULTILINE):
        metadata[match[0].lower()] = match[1]
    return metadata


if __name__ == "__main__":
    base = "C:/Users/spike/Music/Deezloader"
    for file in glob_from(base, "flac"):
        meta = media_metadata(os.path.join(base, file))
        title_length = meta.get("title", file) + " - " + meta.get("length", "000")
        print(title_length)
I don't understand why the output appears disordered only when printing to the console with Python's print function. The regex pattern retrieves the strings correctly, but the printed result is strangely formatted. It doesn't matter how I build the string to print: concatenation, comma-delimited arguments, whatever.
I end up with the length of the song first and the song name second, with no space between the two, and the dash hanging off the end for some reason. Based on the print statement in the code above, the format should be Title - 000 ({title} - {length}), but the output looks more like 000Title -. Why?
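One way to narrow this down is to look at the captured values with repr() instead of printing them directly, since print lets the console act on control characters rather than showing them. The sketch below is only a diagnostic under assumptions: `SAMPLE_STDERR` is made-up text imitating ffprobe output with Windows-style `\r\n` line endings (I have not confirmed that the real output looks like this), and `parse_metadata` is a hypothetical helper that applies the same pattern as media_metadata to text that has already been decoded.

```python
from re import findall, MULTILINE

# Hypothetical sample imitating ffprobe's stderr with Windows line endings.
SAMPLE_STDERR = (
    "    TITLE           : Lime (Extended Mix)\r\n"
    "    LENGTH          : 372000\r\n"
)


def parse_metadata(text):
    """Apply the same pattern as media_metadata to already-decoded text."""
    metadata = {}
    for key, value in findall(r"(\w+)\s+:\s(.+)$", text, MULTILINE):
        metadata[key.lower()] = value
    return metadata


meta = parse_metadata(SAMPLE_STDERR)
for key, value in meta.items():
    # repr() makes characters such as '\r' visible instead of sending
    # them to the console as cursor movements.
    print(key, repr(value))
```

If the real values show a trailing control character here, stripping each value before storing it (for example with `value.strip()`) would be the next thing to try.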