Catching stdout in realtime from subprocess

Posted 2019-08-31 10:37


乱世女痞


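I want to subprocess.Popen() rsync.exe on Windows and print its stdout in Python.

My code works, but it does not show any progress until a file transfer is finished. I want to print the progress of each file in real time.

I am using Python 3.1 now, since I heard it should be better at handling IO.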

import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'


p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()

Answer 1:

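Some rules of thumb for subprocess.

Never use shell=True. It needlessly invokes an extra shell process to call your program.
When calling processes, arguments are passed around as lists. sys.argv in Python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
Don't redirect stderr to a PIPE when you are not reading it.
Don't redirect stdin when you are not writing to it.

Example: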

import subprocess, time, os, sys
cmd = ["rsync.exe", "-vaz", "-P", "source/" ,"dest/"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

for line in iter(p.stdout.readline, b''):
    # p.stdout yields bytes in Python 3, so decode before concatenating
    print(">>> " + line.rstrip().decode(errors="replace"))
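If a command only exists as a single string, one way to follow the "pass a list" rule is to split it with the standard-library shlex module. This is a minimal sketch; the command string is the one from the question and is used purely for illustration. Note that shlex applies POSIX quoting rules, so Windows paths containing backslashes would need extra care.

```python
import shlex

# Illustrative command string, copied from the question.
cmd_string = "rsync.exe -vaz -P source/ dest/"

# shlex.split follows POSIX shell quoting rules, so a quoted argument
# that contains spaces stays together as one list element.
cmd = shlex.split(cmd_string)
print(cmd)
```

The resulting list can be handed to subprocess.Popen without shell=True.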

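That said, it is probable that rsync buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior: when connected to a pipe, a program must explicitly flush stdout to deliver realtime results, otherwise the standard C library will buffer.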
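The pipe-versus-terminal detection described above can be observed directly. The sketch below assumes a Python child (not rsync) that reports whether its own stdout looks like a terminal; the helper name child_sees_tty is made up for this illustration.

```python
import subprocess
import sys

# Child program: print whether its own stdout is attached to a terminal.
child_code = "import sys; print(sys.stdout.isatty())"

def child_sees_tty():
    """Run the child with stdout redirected to a pipe and return its answer."""
    result = subprocess.run([sys.executable, "-c", child_code],
                            stdout=subprocess.PIPE)
    return result.stdout.decode().strip()

print(child_sees_tty())
```

Because the child's stdout is a pipe here, it reports that it is not a terminal, which is the condition under which many programs switch to block buffering.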

To test for that, try running this instead:

cmd = [sys.executable, 'test_out.py']

and create a test_out.py file with the contents:

import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")

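Executing that subprocess should give you "Hello" and then wait 10 seconds before giving "World". If that happens with the Python code above but not with rsync, it means rsync itself is buffering its output, so you are out of luck.

A solution would be to connect directly to a pty, using something like pexpect.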
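As a rough illustration of the pty idea, here is a sketch that uses the standard-library pty module rather than pexpect. It is POSIX-only, so it would not help with rsync.exe on Windows, and the helper name run_with_pty is an assumption made for this example.

```python
import os
import pty
import subprocess
import sys

def run_with_pty(cmd):
    """Run cmd with its stdout attached to a pseudo-terminal (POSIX only)."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(cmd, stdout=slave_fd, stderr=subprocess.STDOUT,
                            close_fds=True)
    os.close(slave_fd)  # the child keeps its own copy of the slave end
    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux raises EIO on the master once the child side is closed
            break
        if not data:
            break
        chunks.append(data)  # a real program would display data right here
    os.close(master_fd)
    proc.wait()
    return b"".join(chunks)

if __name__ == "__main__":
    out = run_with_pty([sys.executable, "-c",
                        "import sys; print(sys.stdout.isatty())"])
    print(out)
```

Since the child now believes it is writing to a terminal, the C library typically falls back to line buffering, and output arrives as each line is produced.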

Answer 2:

I know this is an old topic, but there is a solution now. Call rsync with the option --outbuf=L. Example:

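import subprocess

cmd = ['rsync', '-arzv', '--backup', '--outbuf=L', 'source/', 'dest']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b''):
    # line is bytes in Python 3; decode it before formatting
    print('>>> {}'.format(line.rstrip().decode(errors='replace')))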

Answer 3:

On Linux, I had the same problem getting rid of the buffering. I finally used "stdbuf -o0" (or unbuffer from expect) to get rid of the pipe buffering.

from subprocess import Popen, PIPE

proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)  # cmd is the argument list
stdout = proc.stdout

I could then use select.select on stdout.
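A sketch of what a select.select loop over the child's stdout might look like follows. It assumes a POSIX system (select does not work on pipes on Windows), and the helper name read_with_select and the timeout value are illustrative only.

```python
import os
import select
import subprocess
import sys

def read_with_select(cmd, timeout=0.5):
    """Collect a child's stdout, waking up at least every `timeout` seconds."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = proc.stdout.fileno()
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            continue  # nothing to read yet; other work could be done here
        data = os.read(fd, 4096)  # returns whatever is available right now
        if not data:  # empty bytes means the child closed its stdout
            break
        chunks.append(data)
    proc.wait()
    return b"".join(chunks)

if __name__ == "__main__":
    demo = ("import sys, time; print('a'); sys.stdout.flush(); "
            "time.sleep(0.2); print('b')")
    print(read_with_select([sys.executable, "-c", demo]))
```

Reading with os.read on the raw file descriptor avoids the extra buffering layer of the Python file object, so data is seen as soon as the child writes it.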

See also https://unix.stackexchange.com/questions/25372/

Answer 4:

for line in p.stdout:
  ...

always blocks until the next line feed.

For "realtime" behaviour you have to do something like this:

while True:
  inchar = p.stdout.read(1)
  if inchar: # neither empty bytes nor None
    # p.stdout yields bytes; decode for display and flush immediately
    print(inchar.decode(errors='replace'), end='', flush=True)
  else:
    print('') # finish the last line
    break

The while loop is left when the child process closes its stdout or exits. read()/read(-1) would block until the child process closed its stdout or exited.

Answer 5:

Your problem is:

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()

The iterator itself has extra buffering.

Try doing it like this:

while True:
    line = p.stdout.readline()
    if not line:
        break
    print(line)

Answer 6:

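You cannot get stdout to print unbuffered to a pipe (unless you can rewrite the program that prints to stdout), so here is my solution:

Redirect stdout to stderr, which is not buffered. '<cmd> 1>&2' should do it. Open the process as follows: myproc = subprocess.Popen('<cmd> 1>&2', shell=True, stderr=subprocess.PIPE) (the redirection is performed by the shell, hence shell=True).
You cannot distinguish stdout from stderr, but you get all output immediately.

Hope this helps anyone tackling this problem.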

Answer 7:

Change the stdout of the rsync process to be unbuffered.

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0,  # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

Answer 8:

To avoid caching of output you might want to try pexpect:

import pexpect

child = pexpect.spawn(launchcmd,args,timeout=None)
while True:
    try:
        child.expect('\n')
        print(child.before)
    except pexpect.EOF:
        break

PS: I know this question is pretty old; I am still providing the solution that worked for me.

PPS: I got this answer from another question.

Answer 9:

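p = subprocess.Popen(command,
                     bufsize=0,
                     universal_newlines=True)

I am writing a GUI for rsync in Python and had the same problems. This problem troubled me for several days until I found this in pydoc.

If universal_newlines is True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention, or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.

It seems that rsync outputs '\r' while a transfer is in progress.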
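To illustrate the effect of universal newlines on '\r'-terminated progress updates, here is a small sketch. It uses a stand-in Python child that writes rsync-like progress text; the child program and the helper name progress_lines are assumptions for the example, not rsync's actual output format.

```python
import subprocess
import sys

# Stand-in for rsync: progress updates separated by carriage returns.
child_code = r"import sys; sys.stdout.write('10%\r50%\r100%\n')"

def progress_lines(cmd):
    """Yield each update as its own line, treating '\\r' like '\\n'."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            universal_newlines=True)
    for line in proc.stdout:
        yield line.rstrip("\n")
    proc.wait()

if __name__ == "__main__":
    for update in progress_lines([sys.executable, "-c", child_code]):
        print(">>> " + update)
```

Without universal_newlines=True, the three updates would arrive as a single line, because a bytes pipe only splits on '\n'.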

Answer 10:

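Depending on the use case, you might also want to disable the buffering in the subprocess itself.

If the subprocess will be a Python process, you could do this before the call: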

os.environ["PYTHONUNBUFFERED"] = "1"

Or alternatively, pass this in the env argument to Popen.
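A sketch of the env variant, assuming a Python child that never flushes explicitly. The helper name arrival_times is made up for this example; it records when each line reaches the parent so the effect of PYTHONUNBUFFERED can be seen.

```python
import os
import subprocess
import sys
import time

# Child that prints twice with a pause and never flushes explicitly.
child_code = "import time; print('first'); time.sleep(1); print('second')"

def arrival_times(unbuffered):
    """Return the times at which each line of child output was received."""
    env = dict(os.environ)  # copy, so the parent's environment is untouched
    env.pop("PYTHONUNBUFFERED", None)
    if unbuffered:
        env["PYTHONUNBUFFERED"] = "1"
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdout=subprocess.PIPE, env=env)
    times = []
    for _ in iter(proc.stdout.readline, b""):
        times.append(time.monotonic())
    proc.wait()
    return times

if __name__ == "__main__":
    for flag in (True, False):
        t = arrival_times(flag)
        print("unbuffered={}: gap between lines {:.2f}s".format(flag, t[1] - t[0]))
```

With the variable set, the two lines should arrive about a second apart; without it, both tend to arrive together when the child exits.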

Otherwise, if you are on Linux/Unix, you can use the stdbuf tool. For example:

cmd = ["stdbuf", "-oL"] + cmd

See also the discussions elsewhere about stdbuf and other options.

Answer 11:

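I noticed that nobody has mentioned using a temporary file as an intermediate. The following gets around the buffering issues by writing the output to a temporary file, and lets you parse the data coming from rsync without connecting to a pty. I tested the following on Linux; rsync's output tends to differ across platforms, so the regular expression used to parse it may vary: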

import subprocess, time, tempfile, re

# mkstemp() returns an open file descriptor plus the path of the file
pipe_output, file_name = tempfile.mkstemp()
cmd = ["rsync", "-vaz", "-P", "/src/" ,"/dest"]

p = subprocess.Popen(cmd, stdout=pipe_output, 
                     stderr=subprocess.STDOUT)
while p.poll() is None:
    # p.poll() returns None while the program is still running
    # sleep for 1 second
    time.sleep(1)
    with open(file_name) as f:
        last_line = f.readlines()
    # it's possible that it hasn't output yet, so continue
    if len(last_line) == 0: continue
    last_line = last_line[-1]
    # Matching to "[bytes downloaded]  number%  [speed] number:number:number"
    match_it = re.match(".* ([0-9]*)%.* ([0-9]*:[0-9]*:[0-9]*).*", last_line)
    if not match_it: continue
    # in this case, the percentage is stored in match_it.group(1), 
    # time in match_it.group(2).  We could do something with it here...

Answer 12:

In Python 3, here is a solution that takes a command off the command line and delivers nicely decoded strings in real time as they are received.

Receiver (receiver.py):

import subprocess
import sys

cmd = sys.argv[1:]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print("received: {}".format(line.rstrip().decode("utf-8")))

Example of a simple program that generates realtime output (dummy_out.py):

import time
import sys

for i in range(5):
    print("hello {}".format(i))
    sys.stdout.flush()  
    time.sleep(1)

Output:

$ python receiver.py python dummy_out.py
received: hello 0
received: hello 1
received: hello 2
received: hello 3
received: hello 4

Source: catching stdout in realtime from subprocess

Tags: python subprocess stdout
