Mangled output when printing strings from FFProbe

Posted 2019-08-20 09:51

I'm trying to make a simple function to wrap around FFProbe, and most of the data can be retrieved correctly.

The problem appears when printing the strings to the command line: in both Windows Command Prompt and Git Bash for Windows, the output is mangled and out of order.

Some songs (specifically the file Imagine Dragons - Hit Parade_ Best of the Dance Music Charts\80 - Beazz - Lime (Extended Mix).flac) are missing metadata. I don't know why, but the dictionary the function below returns is empty.

FFProbe outputs its results to stderr which can be piped to subprocess.PIPE, decoded, and parsed. I chose regex for the parsing bit.

Below is a slimmed-down version of my code; for the output, take a look at the GitHub gist.

#! /usr/bin/env python3
# -*- coding: utf-8 -*-

import os

from glob import glob
from re import findall, MULTILINE
from subprocess import Popen, PIPE


def glob_from(path, ext):
    """Return glob from a directory."""
    working_dir = os.getcwd()
    os.chdir(path)

    file_paths = glob("**/*." + ext)

    os.chdir(working_dir)

    return file_paths


def media_metadata(file_path):
    """Use FFPROBE to get information about a media file."""
    stderr = Popen(("ffprobe", file_path), shell=True, stderr=PIPE).communicate()[1].decode()

    metadata = {}

    for match in findall(r"(\w+)\s+:\s(.+)$", stderr, MULTILINE):
        metadata[match[0].lower()] = match[1]

    return metadata


if __name__ == "__main__":
    base = "C:/Users/spike/Music/Deezloader"

    for file in glob_from(base, "flac"):
        meta = media_metadata(os.path.join(base, file))
        title_length = meta.get("title", file) + " - " + meta.get("length", "000")

        print(title_length)


The regex pattern retrieves the strings correctly, so I don't understand why the output appears disordered only when it is printed to the console with Python's print function. It doesn't matter how I build the string to print: concatenation, comma-delimited arguments, whatever.

I end up with the length of the song first, and the song name second but without space between the two. The dash is hanging off the end for some reason. Based on the print statement in the code before, the format should be Title - 000 ({title} - {length}) but the output looks more like 000Title -. Why?

1 answer
Melony?
#2 · 2019-08-20 10:43

I solved this using the accepted answer to my related question.

I had forgotten about the carriage return at the end of each line. The solutions given are as follows:

  1. Use universal_newlines=True in the subprocess call.
    • stderr = Popen(("ffprobe", file_path), shell=True, stderr=PIPE, universal_newlines=True).communicate()[1]
  2. Strip the whitespace around the line from stderr.
    • *.communicate()[1].decode().rstrip() to strip all whitespace at the end.
    • *.communicate()[1].decode().strip() to strip all whitespace on both sides.
    • *.communicate()[1].decode()[:-2] to remove the last two characters.
  3. Swallow \r in the regex pattern.
    • findall(r"(\w+)\s+:\s(.+)\r$", stderr, MULTILINE)
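To make the carriage-return problem visible, here is a minimal sketch. The sample text is made up to imitate ffprobe's stderr with Windows line endings, and the `parse` helper and the `\r?` variant of the pattern are my own illustration rather than anything from the related question. When a captured value ends in `\r`, the terminal moves the cursor back to the start of the line, so whatever is printed next overwrites what is already there.

```python
import re

# Hypothetical sample of decoded ffprobe stderr on Windows, where each
# line ends in "\r\n". The values are made up for illustration.
SAMPLE = (
    "    TITLE           : Lime (Extended Mix)\r\n"
    "    LENGTH          : 215000\r\n"
)

ORIGINAL = r"(\w+)\s+:\s(.+)$"      # "." also matches "\r", so it is captured
FIXED = r"(\w+)\s+:\s(.+?)\r?$"     # optional "\r" is left out of the capture


def parse(pattern, text):
    """Return {lowercased key: value} for every 'key : value' line."""
    return {key.lower(): value
            for key, value in re.findall(pattern, text, re.MULTILINE)}


broken = parse(ORIGINAL, SAMPLE)
fixed = parse(FIXED, SAMPLE)

# repr() shows the hidden "\r" instead of letting the terminal act on it.
print(repr(broken["title"] + " - " + broken["length"]))
# 'Lime (Extended Mix)\r - 215000\r'
print(repr(fixed["title"] + " - " + fixed["length"]))
# 'Lime (Extended Mix) - 215000'
```

Making the `\r` optional keeps the same pattern working on output that uses plain `\n` line endings.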

This is all very helpful; however, I used none of these suggestions.

I didn't know that FFPROBE offers JSON output to STDOUT, but it does. The code to do that is below.

#! /usr/bin/env python3
# -*- coding: utf-8 -*-
from json import loads
from subprocess import check_output, DEVNULL


def arg_builder(args, kwargs, defaults=None):
    """Build arguments from `args` and `kwargs` in a shell-lexical manner."""
    # Fill in defaults only for options the caller did not set.
    for key, val in (defaults or {}).items():
        kwargs.setdefault(key, val)

    args = list(args)

    for arg, val in kwargs.items():
        if isinstance(val, bool):
            # Boolean options are bare flags, emitted only when True.
            if val:
                args.append("-" + arg)
        else:
            args.extend(("-" + arg, str(val)))

    return args


def run_ffprobe(file_path, *args, **kwargs):
    """Use FFPROBE to get information about a media file."""
    options = arg_builder(args, kwargs, defaults={"show_format": True})
    # Unpack the options so the command is one flat sequence of strings, and
    # skip the shell so the file path is passed to ffprobe unchanged.
    command = ["ffprobe", *options, "-of", "json", file_path]

    return loads(check_output(command, stderr=DEVNULL))

You might also get some use out of arg_builder(). It isn't perfect, but it's good enough for simple commands. It does no validation and assumes the caller passes sensible arguments.
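As a rough sketch of how the parsed JSON could be turned back into the "Title - Length" line from the question: the sample document below is made up, and `title_and_duration` is a hypothetical helper, not part of ffprobe or of the code above. It assumes the usual shape of `-show_format -of json` output, where a "format" object holds "duration" as a string of seconds and a "tags" mapping whose key case depends on the file.

```python
import json

# Hypothetical ffprobe JSON output; the values are made up for illustration.
SAMPLE_JSON = """
{
  "format": {
    "filename": "80 - Beazz - Lime (Extended Mix).flac",
    "duration": "215.040000",
    "tags": {"TITLE": "Lime (Extended Mix)", "ARTIST": "Beazz"}
  }
}
"""


def title_and_duration(probe, fallback_title):
    """Format 'title - m:ss' from a parsed ffprobe JSON document."""
    fmt = probe.get("format", {})
    # Tag keys keep whatever case the file stores, so normalize them.
    tags = {key.lower(): value for key, value in fmt.get("tags", {}).items()}
    title = tags.get("title", fallback_title)
    # Duration arrives as a string of seconds; missing values count as 0.
    seconds = int(float(fmt.get("duration", 0)))
    return "{} - {:d}:{:02d}".format(title, *divmod(seconds, 60))


print(title_and_duration(json.loads(SAMPLE_JSON), "unknown"))
# Lime (Extended Mix) - 3:35
```

Because the JSON parser handles line endings itself, no carriage returns end up in the values, and a file with no tags simply falls back to the supplied title.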
