distributed python programming

2019-05-27 00:25发布

问题:

I am trying to split the execution of a python program to two different machines. I am wondering if there's a way to call the python interpreter on one machine from another. Not running a script on another machine, but rather split the task of execution to two machines.

Over the course of the next couple of months, I will be teaching my self distributed programming, and I thought this would be a good way to start.

I think the first step is to use one machine to call another machine and send it a piece of the program. Then the next step would be that both machines execute the same program together and communicate to avoid problems. The third step would be three machines, etc.

Advice, tips, and thoughts are all welcome...

回答1:

Disclamer: I am a developer of SCOOP.

Data-based technologies you may want to get acquainted with for distributed processing would be the MPI standard (for multi-computers, using mpi4py [prefered] or pympi) and the standard multiprocessing module allowing remote computation (but awkward, from my point of view).

You should begin with task-based frameworks, though. They provides a simple and user-friendly usage. Both of these were an utmost focus while creating SCOOP. You can try it with pip -U scoop. On Windows, you may wish to install PyZMQ first using their executable installers. You can check the provided examples and play with the various parameters to understand what causes performance degradation or increase with ease. I encourage you to compare it to its alternatives such as Celery for similar work.

Both of these frameworks allow remote launching of Python programs. More importantly, it does the parallel processing for you while you only need to feed them with your tasks.

You may want to check Fabric for an easy way to setup your remote environments or even control or launch scripts remotely.



回答2:

There is MPI version for Python [1] [2].

MPI (Message Passing Interface) is a standardized interface and it is cool because you find it also in C, Java, (Fortran) etc.

It enables you to communicate between your processes that run remote. You use these messages for synchronization and for information passing.

You also have collective operations, like broadcast, gather, reduce



回答3:

Have a look at RPyC, you might find it usefull.



回答4:

Check out Ray, which is a library for writing parallel and distributed Python.

Ray uses the same syntax to parallelize code on a single multicore machine and in the distributed setting.

If you add the @ray.remote decorator to a function, it can be executed asynchronously in parallel (on any machine in the cluster). Remote function invocations return futures, whose values can be retrieved with ray.get.

The same thing can be done with Python classes (instead of functions), see the documentation for actors.

import ray
import time

ray.init()

@ray.remote
def function(x):
    time.sleep(1)
    return x

args = [1, 2, 3, 4]

# Submit 4 tasks in parallel.
result_ids = [function.remote(x) for x in args]

# Retrieve the results. Assuming at least 4 cores,
# this will take 1 second.
results = ray.get(result_ids)

See the Ray documentation for more. Note, I'm one of the Ray developers.