Catch universal newlines but preserve original

Posted 2019-08-24 00:41

Question:

So this is my problem,

I'm trying to do a simple program that runs another process using Python's subprocess module, and I want to catch real-time output of the process.

I know this can be done as such:

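import subprocess

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

# In Python 3 the pipe yields bytes, so the end-of-stream sentinel is b"", not ""
for line in iter(proc.stdout.readline, b""):
    line = line.rstrip()
    if line:
        print(line.decode())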

The issue is, the process might generate output with a carriage return \r, and I want to simulate that behavior in my program.

If I use the universal_newlines flag in Popen, then I could catch the output that is generated with a carriage return, but I wouldn't know it was as such, and I could only print it "regularly" with a newline. I want to avoid that, as this could be a lot of output.
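To illustrate the loss, here is a minimal sketch that uses an in-memory stream in place of a real process. The sample bytes are made up, and the UTF-8 encoding is an assumption. In Python 3, universal_newlines=True wraps the pipe in a text layer with the default newline=None, which is what the sketch reproduces:

```python
import io

# Made-up process output: two progress updates ending in \r, one line ending in \n
raw = b"10%\r50%\rdone\n"

# newline=None is universal newlines mode: every \r, \n or \r\n becomes \n
translated = list(io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8", newline=None))

print(translated)  # ['10%\n', '50%\n', 'done\n']
```

After the translation, nothing in the returned strings says which lines originally ended with \r.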

My question is basically whether I can catch \r-terminated output as if it ended with \n, while still telling it apart from output that really ends with \n.

EDIT

Here is some simplified code of what I tried:

File download.py:

import subprocess

try:
    subprocess.check_call(
        [
            "aws",
            "s3",
            "cp",
            "S3_LINK",
            "TARGET",
        ]
    )

except subprocess.CalledProcessError as err:
    print(err)
    raise SystemExit(1)

File process_runner.py:

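import subprocess
import sys

# cmd is the command line that launches download.py (not shown here)
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)

# The pipe is binary in Python 3: compare against b"" and pass raw bytes through
for char in iter(lambda: proc.stdout.read(1), b""):
    sys.stdout.buffer.write(char)
    sys.stdout.flush()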

download.py uses aws s3 cp, which reports download progress by ending each update with a carriage return. I want to reproduce that output behavior in process_runner.py, which receives download.py's output.

At first I tried iterating with readline instead of read(1). That did not work because a bare CR is not treated as a line ending.
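A small sketch of why readline falls short here, again using an in-memory binary stream with made-up bytes rather than a real pipe. On a binary stream, readline only stops at \n:

```python
import io

# Made-up process output: two progress updates ending in \r, one line ending in \n
raw = b"10%\r50%\rdone\n"

stream = io.BytesIO(raw)

# readline on a binary stream splits on b"\n" only, so both \r updates
# arrive glued to the final line instead of one at a time
first = stream.readline()

print(first == raw)  # True
```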

Answer 1:

One possible way is to use the binary interface of Popen by specifying neither encoding nor errors, and of course not universal_newlines. We can then wrap the binary stream in a TextIOWrapper with newline=''. The documentation for TextIOWrapper says:

... if newline is None... If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated

(which is conformant with PEP 3116)

Your original code could be changed to:

import io
import subprocess

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out = io.TextIOWrapper(proc.stdout, newline='')

for line in out:
    # line is delimited with the universal newline convention and actually contains
    #  the original end of line, be it a raw \r, \n or the pair \r\n
    ...
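As a fuller sketch of the same idea, the loop body could separate each line from its original terminator and write both back out unchanged. The helper names split_ending, tag_lines and replay are hypothetical, the UTF-8 encoding is an assumption, and the child command in the demo is a stand-in for aws s3 cp:

```python
import io
import subprocess
import sys


def split_ending(line):
    # Return (text, ending); ending is "\r\n", "\n", "\r", or "" if none is present.
    for ending in ("\r\n", "\n", "\r"):
        if line.endswith(ending):
            return line[: -len(ending)], ending
    return line, ""


def tag_lines(binary_stream, encoding="utf-8"):
    # newline="" splits on \r, \n and \r\n but leaves the terminator untranslated
    text = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    for line in text:
        yield split_ending(line)


def replay(cmd):
    # Run cmd and echo its output, keeping each line's original terminator
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    for text, ending in tag_lines(proc.stdout):
        sys.stdout.write(text + ending)
        sys.stdout.flush()
    return proc.wait()


if __name__ == "__main__":
    # Stand-in child process that emits two \r updates and a final \n line
    child = [
        sys.executable,
        "-c",
        r"import sys; sys.stdout.buffer.write(b'10%\r50%\rdone\n')",
    ]
    replay(child)
```

Because the ending is returned separately, the caller can decide per line whether to overwrite the current terminal line (for "\r") or start a new one (for "\n" or "\r\n").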