I am trying to write a script which has to make a lot of calls to bash commands, parse and process the outputs, and finally produce some output.
I was using `subprocess.Popen` and `subprocess.call`.
If I understand correctly, these methods spawn a bash process, run the command, get the output, and then kill the process.
Is there a way to have a bash process running continuously in the background so that the Python calls could go directly to that process? This would be something like bash running as a server, with the Python calls going to it.
I feel this would speed the calls up a bit, as there would be no bash process setup and teardown for each call. Or will it give no performance advantage?
`subprocess.Popen` is a bit more involved. It actually creates an I/O thread to avoid deadlocks. See https://www.python.org/dev/peps/pep-0324/.

Sure, you can still use `subprocess.Popen`, send messages to your subprocess, and receive messages back without terminating the subprocess. In the simplest case your messages can be lines.

This allows for request-response style protocols as well as publish-subscribe, where the subprocess can keep sending you messages back whenever an event of interest happens.
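For example, here is a minimal sketch of that line-oriented approach: one `bash` process stays alive and Python exchanges lines with it over pipes. It assumes every command writes exactly one line of output and flushes it promptly; a real protocol needs explicit framing and error handling.

```python
import subprocess

# Start one long-lived bash process instead of a new one per command.
proc = subprocess.Popen(
    ["bash"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,  # exchange str instead of bytes
    bufsize=1,  # line-buffered
)

def ask(command):
    """Send one command line to bash and read back one line of output."""
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")

print(ask("echo hello"))  # -> hello
print(ask("pwd"))         # -> the current working directory

proc.stdin.close()
proc.wait()
```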
`subprocess` never runs the shell unless you ask for it explicitly; e.g., a call such as the one sketched below runs the `ls` program without invoking `/bin/sh`.
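A minimal sketch of such a call (the `-l` flag is just for illustration):

```python
#!/usr/bin/env python
import subprocess

# Runs the ls executable directly; no /bin/sh process is started.
subprocess.check_call(["ls", "-l"])
```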
If your subprocess calls actually use the shell, e.g., to specify a pipeline concisely or to use bash process substitution that would be verbose and error-prone to express with the `subprocess` module directly, then it is unlikely that invoking `bash` is a performance bottleneck -- measure it first.
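For instance, a rough way to measure the shell's overhead with the standard library's `timeit` (the trivial `true` command and the repetition count are arbitrary choices; absolute numbers vary by machine):

```python
import subprocess
import timeit

# Time the same trivial command with and without a shell in between.
no_shell = timeit.timeit(
    lambda: subprocess.run(["true"], check=True), number=200)
with_shell = timeit.timeit(
    lambda: subprocess.run("true", shell=True, check=True), number=200)

print(f"without shell: {no_shell:.3f} s for 200 runs")
print(f"with shell:    {with_shell:.3f} s for 200 runs")
```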
There are Python packages that also allow such commands to be specified concisely; e.g., `plumbum` could be used to emulate a shell pipeline.
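A sketch of what that can look like, assuming `plumbum` is installed (the particular `ls | grep | wc` pipeline is only an example):

```python
from plumbum.cmd import grep, ls, wc

# Compose a pipeline object; nothing runs until it is called.
pipeline = ls["-l"] | grep["py"] | wc["-l"]

# Running the pipeline returns its stdout as a string.
# Note: a non-zero exit status (e.g. grep finding no match) raises
# plumbum.commands.ProcessExecutionError.
print(pipeline())
```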
If you want to use `bash` as a server process, then `pexpect` is useful for dialog-based interactions with an external process -- though it is unlikely to improve time performance. `fabric` allows you to run both local and remote commands (over `ssh`).
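A sketch of the bash-as-a-server idea with `pexpect`'s REPL wrapper, assuming `pexpect` is installed (the commands are only examples):

```python
from pexpect import replwrap

# Start a single bash process; the wrapper tracks the prompt so it knows
# where the output of each command ends.
bash = replwrap.bash()

print(bash.run_command("echo hello"))  # -> hello

# State persists between calls because it is the same bash process.
bash.run_command("cd /tmp")
print(bash.run_command("pwd"))         # -> /tmp
```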
There are other subprocess wrappers, such as `sarge`, which can parse a pipeline specified in a string without invoking the shell (e.g., it enables cross-platform support for bash-like syntax such as `&&`, `||` and `&` in command lines), or `sh` -- a complete `subprocess` replacement on Unix that provides a TTY by default (it seems full-featured, but the shell-like piping is less straightforward; see the short sketch at the end of this answer). You can even use Python-ish, BASHwards-looking syntax to run commands with the `xonsh` shell. Again, it is unlikely to affect performance in a meaningful way in most cases.

The problem of starting and communicating with external processes in a portable manner is complex -- the interaction between processes, pipes, ttys, signals, threading, async I/O, and buffering in various places has rough edges. Introducing a new package may complicate things if you don't know how that specific package solves the numerous issues involved in running shell commands.
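To give a feel for the `sh` package's call style on Unix, a minimal sketch (assuming `sh` is installed; the commands are only illustrative):

```python
import sh

# Each attribute resolves to a program found on PATH; no shell is involved.
print(sh.ls("-l"))

# Piping is spelled as nested calls: wc -l reads the output of ls -l.
print(sh.wc(sh.ls("-l"), "-l"))
```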
shell. Again, it is unlikely that it affects performance in a meaningful way in most cases.The problem of starting and communicating with external processes in a portable manner is complex -- the interaction between processes, pipes, ttys, signals, threading, async. IO, buffering in various places has rough edges. Introducing a new package may complicate things if you don't know how a specific package solve numerous issues related to running shell commands.