How to make a python module with function to run c

2019-07-25 02:34发布

I wanted to make a python module with a convenience function for running commands in parallel using Python 3.7 on Windows. (for az cli commands)

I wanted a to make a function that:

  • Was easy to use: Just pass a list of commands as strings, and have them execute in parallel.
  • Let me see the output generated by the commands.
  • Used build in python libraries
  • Worked equally well on Windows and Linux (Python Multiprocessing uses fork(), and Windows doesn't have fork(), so sometimes Multiprocessing code will work on Linux but not Windows.)
  • Could be made into an importable module for greater convenience.

This was surprisingly difficult, I think maybe it used to not be possible in older versions of python? (I saw several 2-8 year old Q&As that said you had to use if __name__==__main__: to pull off parallel processing, but I discovered that didn't work in a consistently predictable way when it came to making a importable module.

def removeExtraLinesFromString(inputstring):
    stringtoreturn = ""
    for line in inputstring.split("\n"):
        if len(line.strip()) > 0: #Only add non empty lines to the stringtoreturn
            stringtoreturn = stringtoreturn + line
    return stringtoreturn


def runCmd(cmd): #string of a command passed in here
    from subprocess import run, PIPE
    stringtoreturn = str( run(cmd, shell=True, stdout=PIPE).stdout.decode('utf-8') )
    stringtoreturn = removeExtraLinesFromString(stringtoreturn)
    return stringtoreturn


def exampleOfParrallelCommands():
    if __name__ == '__main__': #I don't like this method, because it doesn't work when imported, refractoring attempts lead to infinite loops and unexpected behavior.
        from multiprocessing import Pool
        cmd = "python -c \"import time;time.sleep(5);print('5 seconds have passed')\""
        cmds = []
        for i in range(12):  #If this were running in series it'd take at least a minute to sleep 5 seconds 12 times
            cmds.append(cmd)
        with Pool(processes=len(cmds)) as pool:
            results = pool.map(runCmd, cmds) #results is a list of cmd output
        print(results[0])
        print(results[1])
        return results

When I tried importing this as a module it didn't work (makes since because of the if statement), so I tried rewriting the code to move the if statement around, I think I removed it once which caused my computer to go into a loop until I shut the program. Another time I was able to import the module into another python program, but to make that work I had to add __name__ == '__main__' and that's very intuitive.

I almost gave up, but after 2 days of searching though tons of python websites and SO posts I finally figured out how to do it after seeing user jfs's code in this Q&A (Python: execute cat subprocess in parallel) I modified his code so it'd better fit into an answer to my question.

1条回答
够拽才男人
2楼-- · 2019-07-25 03:22

toolbox.py

def removeExtraLinesFromString(inputstring):
    stringtoreturn = ""
    for line in inputstring.split("\n"):
        if len(line.strip()) > 0: #Only add non empty lines to the stringtoreturn
            stringtoreturn = stringtoreturn + line
    return stringtoreturn


def runCmd(cmd): #string of a command passed in here
    from subprocess import run, PIPE
    stringtoreturn = str( run(cmd, shell=True, stdout=PIPE).stdout.decode('utf-8') )
    stringtoreturn = removeExtraLinesFromString(stringtoreturn)
    return stringtoreturn


def runParallelCmds(listofcommands): 
    from multiprocessing.dummy import Pool #thread pool
    from subprocess import Popen, PIPE, STDOUT
    listofprocesses = [Popen(listofcommands[i], shell=True,stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True) for i in range(len(listofcommands))] 
    #Python calls this list comprehension, it's a way of making a list
    def get_outputs(process): #MultiProcess Thread Pooling require you to map to a function, thus defining a function.
        return process.communicate()[0] #process is object of type subprocess.Popen
    outputs = Pool(len(listofcommands)).map(get_outputs, listofprocesses) #outputs is a list of bytes (which is a type of string)
    listofoutputstrings = []
    for i in range( len(listofcommands) ):
        outputasstring = removeExtraLinesFromString(  outputs[i].decode('utf-8')  ) #.decode('utf-8') converts bytes to string
        listofoutputstrings.append( outputasstring )
    return listofoutputstrings

main.py

from toolbox import runCmd #(cmd)
from toolbox import runParallelCmds #(listofcommands)

listofcommands = []
cmd = "ping -n 2 localhost"
listofcommands.append(cmd)
cmd = "python -c \"import time;time.sleep(5);print('5 seconds have passed')\""
for i in range(12):
    listofcommands.append(cmd) # If 12 processes each sleep 5 seconds, this taking less than 1 minute proves parrallel processing

outputs = runParallelCmds(listofcommands)
print(outputs[0])
print(outputs[1])

output:

Pinging neokylesPC [::1] with 32 bytes of data: Reply from ::1: time<1ms Reply from ::1: time<1ms Ping statistics for ::1: Packets: Sent = 2, Received = 2, Lost = 0 (0% loss), Approximate round trip times in milli-seconds: Minimum = 0ms, Maximum = 0ms, Average = 0ms

5 seconds have passed

查看更多
登录 后发表回答