What would be a suitable inter-process communication (IPC) framework/technique with the following requirements:
- Transfer native Python objects between two Python processes
- Efficient in time and CPU (RAM efficiency irrelevant)
- Cross-platform (Windows/Linux)
- Nice to have: works with PyPy
UPDATE 1: the processes are on the same host and use the same versions of Python and other modules
UPDATE 2: the processes are run independently by the user; none of them spawns the others
Native objects don't get shared between processes (due to reference counting).
Instead, you can pickle them and share them using unix domain sockets, mmap, zeromq, or an intermediary such as sqlite3 that is designed for concurrent access.
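As a rough sketch of the pickle-over-a-socket idea, the snippet below sends one pickled object with a 4-byte length prefix so the receiver knows where the message ends. The helper names (send_obj, recv_obj) and the framing format are assumptions made for illustration, and socket.socketpair stands in for a real connection between two independent processes. Only unpickle data from a peer you trust.

```python
import pickle
import socket
import struct

# Assumed framing: a 4-byte big-endian length, followed by the pickle bytes.
HEADER = struct.Struct("!I")


def send_obj(sock, obj):
    """Pickle obj and send it as one length-prefixed message."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer than requested."""
    chunks = []
    while n:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_obj(sock):
    """Receive one length-prefixed message and unpickle it."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return pickle.loads(_recv_exact(sock, length))


# Demo: a connected socket pair stands in for the two processes.
left, right = socket.socketpair()
send_obj(left, {"id": 7, "values": [1.5, 2.5], "tag": ("a", None)})
received = recv_obj(right)
left.close()
right.close()
```

The same two helpers should work unchanged over a TCP socket on localhost or a unix domain socket; only the way the socket is created differs.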
Use multiprocessing to start with.
If you need multiple CPUs, look at celery.
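For processes that are started independently, one part of multiprocessing that may fit is multiprocessing.connection, whose Listener and Client pickle objects and frame the messages for you. The sketch below is only an illustration: the authkey value and the helper names (serve_once, call) are made up, and a thread plays the role of the second process so the example runs in one file.

```python
import threading
from multiprocessing.connection import Client, Listener

# Hypothetical shared secret; both processes would need the same value.
AUTHKEY = b"change-me"


def serve_once(listener, handler):
    """Accept one connection, read one object, and send back handler(obj)."""
    with listener.accept() as conn:
        request = conn.recv()        # unpickles the incoming object
        conn.send(handler(request))  # pickles the reply


def call(address, obj):
    """Connect to the listener, send one object, and return the reply."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(obj)
        return conn.recv()


# Port 0 asks the OS for a free port; listener.address holds the real one.
listener = Listener(("127.0.0.1", 0), authkey=AUTHKEY)
server = threading.Thread(
    target=serve_once,
    args=(listener, lambda req: {"echo": req, "count": len(req)}),
)
server.start()
reply = call(listener.address, [1, "two", 3.0])
server.join()
listener.close()
```

In a real setup the two sides would agree on a fixed address (a TCP port on both platforms, or a unix socket path on Linux) instead of an ephemeral port.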
Both execnet and Pyro mention PyPy <-> CPython communication. Other packages from Python Wiki's Parallel Processing page are probably suitable too.
Parallel Python might be worth a look. It works on Windows, OS X, and Linux (and I seem to recall using it on an UltraSPARC Solaris 10 machine a while back). I don't know if it works with PyPy, but it does seem to work with Psyco.
After some testing, I found that the following approach works on Linux using mmap.
Linux has /dev/shm. If you create a shared memory object using POSIX shm_open, a new file is created in this directory. Python's mmap module does not provide the shm_open function, but we can use a normal open to create a file in /dev/shm; the result is similar, and the file resides in memory. (Use os.unlink to remove it.)
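A minimal sketch of that approach is shown below, under a few assumptions: the file name is made up, the region size is arbitrary, and the code falls back to the system temp directory when /dev/shm does not exist (in which case the file is not guaranteed to live in memory). Both mappings are opened in one process here only to keep the example short; in practice each process would open its own.

```python
import mmap
import os
import tempfile

SIZE = 1024  # arbitrary region size for the demo

# Assumption: fall back to the temp directory on systems without /dev/shm.
shm_dir = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
path = os.path.join(shm_dir, "ipc_demo_%d" % os.getpid())  # hypothetical name


def create_backing_file(path, size):
    """Create the file with a normal open() and give it a fixed size."""
    with open(path, "wb") as f:
        f.truncate(size)


def map_backing_file(path):
    """Map the whole file; the mapping stays valid after the file is closed."""
    with open(path, "r+b") as f:
        return mmap.mmap(f.fileno(), 0)


create_backing_file(path, SIZE)
writer = map_backing_file(path)  # what the first process would hold
reader = map_backing_file(path)  # what the second process would hold

message = b"hello from the writer"
writer[:len(message)] = message
seen = reader[:len(message)]     # the write is visible through the other map

writer.close()
reader.close()
os.unlink(path)                  # remove the file once nobody needs it
```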
Then for IPC, we can use mmap to map that file into each process's virtual memory space. All the processes share that memory. Python can use the memory as a buffer and create objects such as bytes and numpy arrays on top of it, or we can use it through the ctypes interface.
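To illustrate the ctypes route, the sketch below lays a small structure over the mapped memory so fields can be read and written by name. The Sample layout and its field names are invented for the example, and the fallback to the temp directory when /dev/shm is missing is an assumption. Note that the ctypes views have to be dropped before the maps can be closed.

```python
import ctypes
import mmap
import os
import tempfile


class Sample(ctypes.Structure):
    """Hypothetical record layout that both processes agree on."""
    _fields_ = [("seq", ctypes.c_uint32), ("temperature", ctypes.c_double)]


def map_file(path):
    """Map the whole file read/write."""
    with open(path, "r+b") as f:
        return mmap.mmap(f.fileno(), 0)


# Assumption: fall back to the temp directory on systems without /dev/shm.
base = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
fd, path = tempfile.mkstemp(prefix="ipc_struct_", dir=base)
os.ftruncate(fd, ctypes.sizeof(Sample))  # size the file to hold one record
os.close(fd)

writer_map = map_file(path)
reader_map = map_file(path)

# from_buffer() creates a structure that shares the mapped bytes (no copy).
writer = Sample.from_buffer(writer_map)
reader = Sample.from_buffer(reader_map)

writer.seq = 42
writer.temperature = 21.5
seen = (reader.seq, reader.temperature)

# Release the ctypes views first; a map with live exports cannot be closed.
del writer, reader
writer_map.close()
reader_map.close()
os.unlink(path)
```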
Of course, process sync primitives are still needed to avoid race conditions.
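One possible primitive on Linux is an advisory file lock taken with fcntl.flock on a lock file that all the processes agree on. This is a sketch of one option, not the only way to synchronize: the helper names (file_lock, try_lock) and the lock file are assumptions, and the fcntl module is not available on Windows.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on path for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def try_lock(path):
    """Return True if the lock could be taken right now (and release it)."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        return False  # someone else holds the lock
    else:
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)


# Demo: a second descriptor cannot take the lock while the first holds it.
fd, lock_path = tempfile.mkstemp(prefix="ipc_lock_")
os.close(fd)
with file_lock(lock_path):
    busy = try_lock(lock_path)   # attempted while the lock is held
free = try_lock(lock_path)       # attempted after the lock was released
os.unlink(lock_path)
```

A writer would wrap each update of the shared region in file_lock, and readers would take the same lock before reading, so that nobody sees a half-written record.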
See the mmap doc, the ctypes doc, and numpy.load, which has an mmap_mode option.