How to use classes with Python multiprocessing?

Posted 2019-02-27 18:04

Here's some sample code that reads a file and adds up each line. It is supposed to add up all the numbers from 0 to 19. However, I always get a result of 0.

I can see that intermediate calculations are succeeding, so why is the final result 0?

Is there a better way to do this? I am trying to do more calculations on a larger, more complex input file, and store some statistics as I go.

import multiprocessing
import StringIO

class Total():
    def __init__(self):
        self.total = 0

    def add(self, number):
        self.total += int(number)

    def __str__(self):
        return str(self.total)

total = Total()

def f(input):
    total.add(input)

# Create mock file
mock_file = StringIO.StringIO()
for i in range(20):
    mock_file.write("{}\n".format(i))
mock_file.seek(0)

# Compute
pool = multiprocessing.Pool(processes=4)
pool.map(f, mock_file)

print total

# Cleanup
mock_file.close()

2 Answers
时光不老,我们不散
Reply #2 · 2019-02-27 18:35

Each subprocess that calls f updates its own copy of total, so the main process's total is never affected and stays at 0.
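To see the per-process copies for yourself, here is a minimal sketch written for Python 3 (the question's code is Python 2). The names counter, bump and run_demo are made up for this illustration and do not come from the question.

```python
import multiprocessing
import os

counter = 0  # module-level state; every process gets its own copy


def bump(_):
    """Increment this process's copy of `counter` and report who did it."""
    global counter
    counter += 1
    return os.getpid(), counter


def run_demo(n_tasks=8, processes=2):
    """Run bump() in worker processes, then look at the parent's counter."""
    with multiprocessing.Pool(processes) as pool:
        seen = pool.map(bump, range(n_tasks))
    # `counter` here is the parent's copy, which no worker ever touched.
    return seen, counter


if __name__ == "__main__":
    seen, parent_counter = run_demo()
    print(seen)            # (worker pid, that worker's running count) pairs
    print(parent_counter)  # the parent's copy is unchanged
```

The workers' counts go up while the parent's value does not, which is the same effect that leaves total at 0 in the question.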

You can have each subprocess return the result of its computation (in your mock example, that's just the input, unchanged), and then accumulate it in the main process. E.g.:

def f(input):
  return input

results = pool.map(f, mock_file)
for res in results:
  total.add(res)
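Put together as a complete script, the idea might look like the sketch below. It assumes Python 3 (io.StringIO instead of the StringIO module), and parse_line and parallel_total are hypothetical names chosen for this example rather than anything from the question.

```python
import io
import multiprocessing


def parse_line(line):
    """Worker: convert one line of text to an int and return it."""
    return int(line)


def parallel_total(lines, processes=4):
    """Parse each line in a worker process and sum the results in the parent."""
    with multiprocessing.Pool(processes=processes) as pool:
        # map() sends each worker's return value back to the parent process.
        return sum(pool.map(parse_line, lines))


if __name__ == "__main__":
    # Mock file holding the numbers 0..19, one per line.
    mock_file = io.StringIO("".join("{}\n".format(i) for i in range(20)))
    print(parallel_total(mock_file))
    mock_file.close()
```

Because the workers only return values and never touch shared state, no locking is needed, and the same pattern extends to returning a tuple or dict of statistics per line.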
做个烂人
Reply #3 · 2019-02-27 18:45

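You can accomplish this using shared memory with multiprocessing.Value (this relies on the worker processes inheriting the global total, as they do when they are forked). Just change your Total class to the following: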

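class Total():
    def __init__(self):
        # 'd' is a C double held in shared memory, so the total prints as a float
        self.total = multiprocessing.Value('d', 0)

    def add(self, number):
        # += is a read followed by a write, so hold the Value's lock for both
        with self.total.get_lock():
            self.total.value += int(number)

    def __str__(self):
        return str(self.total.value)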
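If you would rather not depend on a forked global, one option is to hand the shared value to each worker through the pool's initializer. The following is only a sketch under that assumption, written for Python 3; init_worker, add_line and shared_sum are hypothetical names, and it uses the 'i' (signed int) typecode so the result stays an integer.

```python
import multiprocessing

# Filled in inside each worker by init_worker(); hypothetical name.
shared_total = None


def init_worker(value):
    """Runs once per worker: remember the shared counter in a global."""
    global shared_total
    shared_total = value


def add_line(line):
    """Worker: add one parsed line to the shared counter."""
    # "+=" is a read followed by a write, so hold the lock for both steps.
    with shared_total.get_lock():
        shared_total.value += int(line)


def shared_sum(lines, processes=4):
    """Accumulate `lines` into a shared int and return the final value."""
    total = multiprocessing.Value("i", 0)  # "i" = signed C int
    with multiprocessing.Pool(
        processes, initializer=init_worker, initargs=(total,)
    ) as pool:
        pool.map(add_line, lines)
    return total.value


if __name__ == "__main__":
    print(shared_sum("{}\n".format(i) for i in range(20)))
```

Every update takes the lock, so for heavy workloads returning partial results and summing them in the main process is likely to be faster than sharing a single counter.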