I wanted to make a python module with a convenience function for running commands in parallel using Python 3.7 on Windows. (for az cli commands)
I wanted a to make a function that:
- Was easy to use: Just pass a list of commands as strings, and have them execute in parallel.
- Let me see the output generated by the commands.
- Used build in python libraries
- Worked equally well on Windows and Linux (Python Multiprocessing uses fork(), and Windows doesn't have fork(), so sometimes Multiprocessing code will work on Linux but not Windows.)
- Could be made into an importable module for greater convenience.
This was surprisingly difficult, I think maybe it used to not be possible in older versions of python? (I saw several 2-8 year old Q&As that said you had to use if __name__==__main__:
to pull off parallel processing, but I discovered that didn't work in a consistently predictable way when it came to making a importable module.
def removeExtraLinesFromString(inputstring):
stringtoreturn = ""
for line in inputstring.split("\n"):
if len(line.strip()) > 0: #Only add non empty lines to the stringtoreturn
stringtoreturn = stringtoreturn + line
return stringtoreturn
def runCmd(cmd): #string of a command passed in here
from subprocess import run, PIPE
stringtoreturn = str( run(cmd, shell=True, stdout=PIPE).stdout.decode('utf-8') )
stringtoreturn = removeExtraLinesFromString(stringtoreturn)
return stringtoreturn
def exampleOfParrallelCommands():
if __name__ == '__main__': #I don't like this method, because it doesn't work when imported, refractoring attempts lead to infinite loops and unexpected behavior.
from multiprocessing import Pool
cmd = "python -c \"import time;time.sleep(5);print('5 seconds have passed')\""
cmds = []
for i in range(12): #If this were running in series it'd take at least a minute to sleep 5 seconds 12 times
cmds.append(cmd)
with Pool(processes=len(cmds)) as pool:
results = pool.map(runCmd, cmds) #results is a list of cmd output
print(results[0])
print(results[1])
return results
When I tried importing this as a module it didn't work (makes since because of the if statement), so I tried rewriting the code to move the if statement around, I think I removed it once which caused my computer to go into a loop until I shut the program. Another time I was able to import the module into another python program, but to make that work I had to add __name__ == '__main__'
and that's very intuitive.
I almost gave up, but after 2 days of searching though tons of python websites and SO posts I finally figured out how to do it after seeing user jfs's code in this Q&A (Python: execute cat subprocess in parallel) I modified his code so it'd better fit into an answer to my question.
toolbox.py
main.py
output: