I want to run the Linux word count utility wc to determine the number of lines currently in /var/log/syslog, so I can detect that it's growing. I've tried various tests, and while I get results back from wc, they include both the line count and the file name (e.g., /var/log/syslog).
So it's returning:
1338 /var/log/syslog
But I only want the line count, so I want to strip off the /var/log/syslog portion, and just keep 1338.
I have tried converting the result from a byte string to a string and then stripping it, but no joy. Decoding, stripping, and similar attempts all fail to produce the output I'm looking for.
These are some examples of what I get, with 1338 lines in syslog:
- b'1338 /var/log/syslog\n'
- 1338 /var/log/syslog
Here's some test code I've written to try and crack this nut, but no solution:
import subprocess
# check_output returns a byte string
stdoutdata = subprocess.check_output("wc --lines /var/log/syslog", shell=True)
print("2A stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.decode("utf-8")
print("2B stdoutdata: " + str(stdoutdata))
stdoutdata = stdoutdata.strip()
print("2C stdoutdata: " + str(stdoutdata))
The output from this is:
2A stdoutdata: b'1338 /var/log/syslog\n'
2B stdoutdata: 1338 /var/log/syslog
2C stdoutdata: 1338 /var/log/syslog
2D stdoutdata: 1338 /var/log/syslog
I suggest that you use subprocess.getoutput(), as it does exactly what you want: it runs a command in a shell and returns its output as a string (as opposed to a byte string). Then you can split on whitespace and grab the first element from the returned list of strings.
Try this:
import subprocess
stdoutdata = subprocess.getoutput("wc --lines /var/log/syslog")
print("stdoutdata: " + stdoutdata.split()[0])
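If it helps, the split-and-take-the-first-field step can be wrapped in a small helper. This is only a sketch: the name line_count_from_wc is made up for illustration, and the sample string simply mirrors the output shown in the question rather than coming from a real wc call.

```python
def line_count_from_wc(output: str) -> int:
    """Return the leading integer from `wc --lines`-style output."""
    # str.split() with no argument splits on any run of whitespace and
    # ignores leading/trailing whitespace, including the final newline.
    fields = output.split()
    if not fields:
        raise ValueError("empty wc output")
    return int(fields[0])


# Sample text copied from the question, not produced by running wc here.
sample = "1338 /var/log/syslog\n"
count = line_count_from_wc(sample)
print(count)
```

Because split() ignores leading whitespace, the same helper should also cope with wc variants that pad the count with spaces. Comparing the integer from two successive calls is then enough to tell whether the log has grown.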
To avoid invoking a shell and decoding filenames that might be an arbitrary byte sequence (except '\0') on *nix, you could pass the file as stdin:
import subprocess
with open(b'/var/log/syslog', 'rb') as file:
    # wc reads the data from stdin, so it prints only the count, with no file name
    nlines = int(subprocess.check_output(['wc', '-l'], stdin=file))
print(nlines)
Or you could ignore any decoding errors:
import subprocess
stdoutdata = subprocess.check_output(['wc', '-l', '/var/log/syslog'])
nlines = int(stdoutdata.decode('ascii', 'ignore').partition(' ')[0])
print(nlines)
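Another option, sketched here as an assumption rather than as part of the answer above, is to skip decoding altogether: int() accepts bytes, so the count can be parsed straight from the raw output. The helper name line_count_from_wc_bytes and the sample bytes (with a deliberately invalid \xff in the file name) are invented for illustration.

```python
def line_count_from_wc_bytes(raw: bytes) -> int:
    """Parse the leading count from raw `wc -l` output without decoding it."""
    # bytes.split() with no argument splits on ASCII whitespace, so the
    # file name (whatever bytes it contains) never needs to be decoded.
    # An empty input would raise IndexError here.
    return int(raw.split()[0])


# Hypothetical output whose file name contains a byte that is not valid UTF-8.
sample = b"1338 /var/log/sys\xfflog\n"
count = line_count_from_wc_bytes(sample)
print(count)
```

Only the first field is ever converted, so odd bytes in the file name cannot trigger a UnicodeDecodeError.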
Since Python 3.6 you can make check_output() return a str instead of bytes by giving it an encoding parameter:
check_output('wc --lines /var/log/syslog', encoding='UTF-8', shell=True)
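To see the difference the encoding parameter makes, here is a small sketch. So that it runs on machines without wc or /var/log/syslog, it uses a child Python process as a stand-in that prints the same kind of line; fake_wc is a made-up name, and in real use you would pass the wc command instead.

```python
import subprocess
import sys

# Stand-in for `wc --lines /var/log/syslog`: a child Python process that
# prints a line in the same "count filename" format.
fake_wc = [sys.executable, "-c", "print('1338 /var/log/syslog')"]

raw = subprocess.check_output(fake_wc)                      # bytes
text = subprocess.check_output(fake_wc, encoding="utf-8")   # str

print(type(raw).__name__, type(text).__name__)

# With a str in hand, the count is just the first whitespace-separated field.
nlines = int(text.split()[0])
print(nlines)
```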
Equivalent to Curt J. Sampson's answer, this also returns a string:
subprocess.check_output('wc -l /path/to/your/file | cut -d " " -f1', universal_newlines=True, shell=True)
From the docs:
If encoding or errors are specified, or text is true, file objects for
stdin, stdout and stderr are opened in text mode using the specified
encoding and errors or the io.TextIOWrapper default. The
universal_newlines argument is equivalent to text and is provided for
backwards compatibility. By default, file objects are opened in binary
mode.
Something similar, but a bit more complex using subprocess.run():
subprocess.run(command, shell=True, check=True, universal_newlines=True, stdout=subprocess.PIPE).stdout
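This works because subprocess.check_output() is equivalent to calling subprocess.run() with check=True and stdout=subprocess.PIPE, then reading the stdout attribute of the result.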
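A quick way to convince yourself the two calls behave the same is to run both on the same command and compare. In this sketch, cmd is a hypothetical stand-in (a child Python process printing a wc-like line) so that it does not depend on wc or syslog being present.

```python
import subprocess
import sys

# Stand-in command that prints one wc-like line; swap in the real wc call.
cmd = [sys.executable, "-c", "print('1338 /var/log/syslog')"]

via_check_output = subprocess.check_output(cmd, universal_newlines=True)
via_run = subprocess.run(
    cmd, check=True, universal_newlines=True, stdout=subprocess.PIPE
).stdout

# Both calls should yield the same text for the same command.
same = via_check_output == via_run
print(same)
print(via_run.split()[0])
```

Either form raises subprocess.CalledProcessError if the command exits with a non-zero status, so which one to use is mostly a matter of taste.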