Python 3.6
I want to capture all output from a subprocess which I run with the subprocess
module. I can easily pipe this output to a log file, and it works great.
But I want to filter out a lot of the lines (lots of noisy output from modules I do not control).
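For reference, the unfiltered version is trivial; a minimal sketch of what I mean (assuming command is a list of arguments and the log file name is just a placeholder):

import subprocess

# Unfiltered: the subprocess writes straight into the log file.
with open('build.log', 'w') as log_file:
    subprocess.run(command, stdout=log_file, stderr=subprocess.STDOUT)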
Attempt 1
def run_command(command, log_file):
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT, bufsize=1,
                               universal_newlines=True)
    while True:
        output = process.stdout.readline()
        if output == '' and process.poll() is not None:
            break
        if output and not_noisy_line(output):
            log_file.write(output)
            log_file.flush()
    return process.poll()
But this introduced a race condition between my subprocess and the output.
Attempt 2
I created a new method and a class to wrap the logging.
def run_command(command, log_file):
    process = subprocess.run(command, stdout=QuietLogger(log_file),
                             stderr=QuietLogger(log_file), timeout=120)
    return process.returncode

class QuietLogger(io.TextIOWrapper):
    def write(self, data, encoding=sys.getdefaultencoding()):
        data = filter(data)  # filter here is my own line-filtering function
        super().write(data)
This does, however, completely skip my filter function: my write method is not called at all by the subprocess. (If I call QuietLogger().write('asdasdsa') directly, it does go through the filter.)
Any clues?
[Edit: My brain got derailed along the way, and I ended up answering another question than was actually asked. The following solution is useful for concurrently writing to a file, not for using the logging module in any way. However, since at least it's useful for that, I'll leave the answer in place for now.]
If you were just using threads rather than separate processes, a standard lock would be enough, so you could try something similar.
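For comparison, here is a minimal sketch of the threads-only case (the helper name is just for illustration):

import threading

log_lock = threading.Lock()

def write_filtered(f, line):
    # Serialize writes from multiple threads with a standard lock.
    with log_lock:
        f.write(line)
        f.flush()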
There's always the option of locking the output file. I don't know if your operating system supports anything like that, but the usual Unix way of doing it is to create a lock file. Basically, if the file exists, then wait; otherwise create the file before writing to your log file, and after you're done, remove the lock file again. You could use a context manager like this:
import os
import os.path
from time import sleep

class LockedFile():
    def __init__(self, filename, mode):
        self.filename = filename
        self.lockfile = filename + '.lock'
        self.mode = mode

    def __enter__(self):
        # Wait until no one else holds the lock file.
        while True:
            if os.path.isfile(self.lockfile):
                sleep(0.1)
            else:
                break
        # Claim the lock by creating (or touching) the lock file.
        with open(self.lockfile, 'a'):
            os.utime(self.lockfile)
        self.f = open(self.filename, self.mode)
        return self.f

    def __exit__(self, *args):
        self.f.close()
        os.remove(self.lockfile)

# And here's how to use it:
with LockedFile('blorg', 'a') as f:
    f.write('foo\n')
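One possible tightening: the check-then-create step above is itself a small race between two processes. Open mode 'x' creates the file and fails with FileExistsError if it already exists, so the claim becomes atomic; a sketch of __enter__ rewritten that way:

    def __enter__(self):
        while True:
            try:
                # 'x' creates the file, failing if it exists:
                # check and claim happen in one atomic step.
                open(self.lockfile, 'x').close()
                break
            except FileExistsError:
                sleep(0.1)
        self.f = open(self.filename, self.mode)
        return self.f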
This is an interesting situation in which the file object abstraction partially breaks down. The reason your solution does not work is that subprocess is not actually using your QuietLogger: it just pulls the raw file descriptor out of it (via fileno()) and repackages that as a new io.TextIOWrapper object, so your overridden write is never called.
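You can verify this with a small experiment (the Spy class is hypothetical, just for illustration, and assumes a Unix-ish echo): any object with a working fileno() is accepted, and its write() is never invoked:

import subprocess
import sys

class Spy:
    def fileno(self):
        # This is all subprocess actually asks of the object.
        return sys.stdout.fileno()

    def write(self, data):
        print('write() was called!')  # never printed

subprocess.run(['echo', 'hello'], stdout=Spy())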
I don't know if this is an intrinsic limitation in how subprocess is handled (it hands raw file descriptors to the OS) or just an unfortunate corner of the Python design, but in order to achieve what you want, you need to use the standard subprocess.PIPE and then roll your own file writer.
If you can wait for the subprocess to finish, this can be done trivially using subprocess.run and then picking the stream out of the CompletedProcess object (p):
p = subprocess.run(command, stdout=subprocess.PIPE, universal_newlines=True)
# Keep only the lines your filter lets through (not_noisy_line from the question).
data = ''.join(line for line in p.stdout.splitlines(keepends=True)
               if not_noisy_line(line))
with open(logfile, 'w') as f:
    f.write(data)
If you must work with the output while it is being generated (so you cannot wait for the subprocess to end), the simplest way is to resort to subprocess.Popen and threads:
import subprocess
import threading

logfile = 'tmp.txt'
filter_passed = lambda line: line[:3] != 'Bad'
command = ['my_cmd', 'arg']

def writer(p, logfile):
    # Consume the pipe line by line while the subprocess runs.
    with open(logfile, 'w') as f:
        for line in p.stdout:
            if filter_passed(line):
                f.write(line)

p = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True)
t = threading.Thread(target=writer, args=(p, logfile))
t.start()
t.join()
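Once the writer thread has joined, the pipe has reached EOF; if you also need the exit status (as in your original run_command), you can reap the process afterwards:

return_code = p.wait()  # collect the subprocess's exit status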