Parallel Python - too many files

Posted 2019-08-17 02:28

I'm trying to run some code in parallel using Parallel Python that adds numbers together. It all works fine, except that when I run the code in a loop it inevitably stops (on my computer) after 41 iterations with a "too many open files" error. I've looked into this a good bit and found a workaround, but it makes the code much slower than running it serially, which defeats the purpose.

import sys, time
import pp
import numpy
x = numpy.arange(-20.0,20.0,0.5)
k = numpy.arange(50)
grav = []
nswarm = 4
gravity = numpy.zeros([4,1])
print gravity
def function(raw_input,x,grav,k):
    # "raw_input" here is just the 4-element input tuple for this job
    # (it shadows the Python 2 builtin of the same name)
    f = 0
    for i in range(len(x)):
        f+=1
    a=raw_input[0]
    b=raw_input[1]
    c=raw_input[2]
    d=raw_input[3]
    grav.append((a+b+c+d)+f)
    return grav

jobsList = []

for i in range(len(k)):
    # tuple of all parallel python servers to connect with
    ppservers = ()
    #ppservers = ("10.0.0.1",)

    if len(sys.argv) > 1:
        ncpus = int(sys.argv[1])
        # Creates jobserver with ncpus workers
        job_server = pp.Server(ncpus, ppservers=ppservers)
    else:
        # Creates jobserver with automatically detected number of workers
        job_server = pp.Server(ppservers=ppservers)

    #print "Starting pp with", job_server.get_ncpus(), "workers"
    start_time = time.time()

    # The following submits 4 jobs and then retrieves the results
    puts = ([1,2,3,4], [3,2,3,4],[4,2,3,6],[2,3,4,5])

    jobs = [(raw_input, job_server.submit(function,(raw_input,x,grav,k), (), ())) for raw_input in puts]
    for raw_input, job in jobs:
        r = job()
        jobsList.append(r)
        #print "Sum of numbers", raw_input, "is", r
    #print "Time elapsed: ", time.time() - start_time, "s"
    #job_server.print_stats()
    #for job in jobsList:
    #print job

    #print jobsList
    for n in numpy.arange(nswarm):
        gravity[n] = jobsList[n]
    del grav[:]
    del jobsList[:]
    #print gravity,'here' 
    print i
    job_server.destroy()

The problem, I think, is that the loop keeps creating new job servers without properly closing them. Adding job_server.destroy() at the end of each iteration was the fix I found: the code now runs to completion, but it is really slow.

Is there a better way to close the servers down so that the code will be reasonably fast?
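Each pp.Server spawns worker processes and the pipes/sockets to talk to them, so constructing one per iteration leaks file descriptors (and destroy() per iteration pays the worker startup cost 50 times). A sketch of the alternative, creating the worker pool once outside the loop and reusing it for every iteration, using the standard-library multiprocessing module in place of pp (the helper name add_job and the use of multiprocessing are my assumptions, not from the original post):

```python
# Sketch: create the pool ONCE and reuse it across iterations,
# instead of building and destroying a job server inside the loop.
# multiprocessing.Pool stands in for pp.Server here; add_job is an
# illustrative rename of the original "function".
import multiprocessing
import numpy

x = numpy.arange(-20.0, 20.0, 0.5)  # 80 elements

def add_job(vals):
    # Same arithmetic as the original: sum the four inputs, then add
    # len(x) (the original loop increments f once per element of x).
    f = len(x)
    a, b, c, d = vals
    return (a + b + c + d) + f

if __name__ == "__main__":
    puts = [(1, 2, 3, 4), (3, 2, 3, 4), (4, 2, 3, 6), (2, 3, 4, 5)]
    pool = multiprocessing.Pool()      # built once, before the loop
    for i in range(50):                # the outer k-loop
        gravity = pool.map(add_job, puts)
        # gravity == [90, 92, 95, 94] on every iteration
    pool.close()
    pool.join()
```

The same restructuring applies to pp itself: hoist the pp.Server(...) call above the for-loop and call destroy() once after it, so workers are started and torn down a single time rather than per iteration.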
