How to process a list in parallel in Python? [duplicate]

Published 2020-03-30 15:42

Question:

I wrote code like this:

def process(data):
    # create a small file from data (details omitted)
    pass

all = ["data1", "data2", "data3"]

I want to execute the process function on every item of my all list in parallel. The files it creates are small, so I am not concerned about disk writes; it is the processing itself that takes long, and I want to use all of my cores.

How can I do this using only the default modules in Python 2.7?

Answer 1:

You may try a basic example like:

from threading import Thread

def process(data):
    print "processing %s" % data

all = ["data1", "data2", "data3"]

for task in all:
    t = Thread(target=process, args=(task,))
    t.start()

Here's a repl and a brief tutorial showing how to make the caller wait for the threads to join if desired.

As for using all your cores, I don't have specific information on that, but here are some resources that might be helpful: [1], [2], [3]
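Regarding using all cores: CPython threads share the global interpreter lock, so for CPU-bound work a process pool is usually the better fit. A minimal sketch using the stdlib multiprocessing module (the body of process is a hypothetical stand-in for the file-creating work):

```python
from multiprocessing import Pool, cpu_count

def process(data):
    # stand-in for the real CPU-heavy work that creates a file
    return "processed %s" % data

if __name__ == "__main__":
    all_items = ["data1", "data2", "data3"]
    pool = Pool(processes=cpu_count())      # one worker process per core
    results = pool.map(process, all_items)  # blocks until every item is done
    pool.close()
    pool.join()
    print(results)
```

Pool.map preserves input order, so results lines up with all_items; the same code runs on Python 2.7 and 3.x.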



Answer 2:

Here is a template using multiprocessing; I hope it helps. Note that multiprocessing.dummy is a thread-backed Pool with the same API; to spread CPU-bound work across all cores, import Pool from multiprocessing itself instead.

from multiprocessing.dummy import Pool as ThreadPool

def process(data):
    print("processing {}".format(data))

alldata = ["data1", "data2", "data3"]

pool = ThreadPool()

results = pool.map(process, alldata)

pool.close()
pool.join()
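If the list is long, the same pool can hand results back lazily: Pool.imap_unordered yields each result as soon as its worker finishes, instead of building the whole result list first. A sketch with the same thread-backed pool (names are illustrative):

```python
from multiprocessing.dummy import Pool as ThreadPool

def process(data):
    return "processed {}".format(data)

pool = ThreadPool(4)  # four worker threads
# results arrive in completion order, not input order
for result in pool.imap_unordered(process, ["data1", "data2", "data3"]):
    print(result)
pool.close()
pool.join()
```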


Answer 3:

Or:

from threading import Thread

def process(data):
    print("processing {}".format(data))

l = ["data1", "data2", "data3"]

for task in l:
    t = Thread(target=process, args=(task,))
    t.start()

Or, with an f-string (Python 3.6+ only):

from threading import Thread

def process(data):
    print(f"processing {data}")

l = ["data1", "data2", "data3"]

for task in l:
    t = Thread(target=process, args=(task,))
    t.start()
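Both thread versions above fire and forget; if the caller needs to know when all the work is done, keep references to the threads and join them. A minimal sketch:

```python
from threading import Thread

def process(data):
    print("processing {}".format(data))

tasks = ["data1", "data2", "data3"]
threads = [Thread(target=process, args=(task,)) for task in tasks]

for t in threads:
    t.start()
for t in threads:
    t.join()  # block until this worker has finished
```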