For an internship on the Python library fluidimage, we are investigating whether it would be a good idea to write an HPC parallel application with a client/server model using the library trio.
For asynchronous programming and I/O, trio is indeed great!
Then, I'm wondering how to
- spawn processes (the servers doing the CPU/GPU-bound work)
- communicate complex Python objects (potentially containing large numpy arrays) between the processes.
I couldn't find the recommended way to do this with trio in its documentation (even though the echo client/server tutorial is a good start).
One obvious way to spawn processes in Python and communicate between them is multiprocessing.
In the HPC context, I think one good solution would be to use MPI (http://mpi4py.readthedocs.io/en/stable/overview.html#dynamic-process-management). For reference, I also have to mention rpyc (https://rpyc.readthedocs.io/en/latest/docs/zerodeploy.html#zerodeploy).
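For concreteness, here is a minimal sketch (my own illustration, not part of the original question) of mpi4py's dynamic process management, following the pattern from the linked documentation; the file names are hypothetical:

```python
# parent.py -- spawn two worker interpreters and broadcast a numpy array to them
import sys
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_SELF.Spawn(sys.executable, args=["worker.py"], maxprocs=2)
data = np.arange(10, dtype="d")
comm.Bcast([data, MPI.DOUBLE], root=MPI.ROOT)  # buffer-based, no pickling
comm.Disconnect()
```

```python
# worker.py -- run by the spawned processes
import numpy as np
from mpi4py import MPI

comm = MPI.Comm.Get_parent()
data = np.empty(10, dtype="d")
comm.Bcast([data, MPI.DOUBLE], root=0)  # receive the parent's array
# ... CPU/GPU-bound work on `data` ...
comm.Disconnect()
```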
I don't know whether such tools can be used together with trio, or what the right way to do this would be.
An interesting related question
Remark: PEP 574
It seems to me that PEP 574 (see https://pypi.org/project/pickle5/) could also be part of a good solution to this problem.
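To illustrate why (my own sketch, not from the question): pickle protocol 5 lets the large buffers travel out of band, so a transport (pipe, socket, MPI) can move them without copying them into the pickle stream.

```python
import pickle  # protocol 5 is in the stdlib from Python 3.8; before that,
               # the pickle5 backport provides the same API (import pickle5 as pickle)
import numpy as np

obj = {"name": "piv_field", "data": np.ones((2000, 2000))}

buffers = []
header = pickle.dumps(obj, protocol=5, buffer_callback=buffers.append)
# `header` is small metadata; `buffers` holds zero-copy views of the array data.

# ... send `header` and the raw `buffers` through any transport ...

restored = pickle.loads(header, buffers=buffers)
```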
As of mid-2018, Trio doesn't provide this (subprocess spawning and inter-process communication) yet. Your best option to date is to use trio_asyncio to leverage asyncio's support for the features which Trio still needs to learn.

Unfortunately, as of today (July 2018), Trio doesn't yet have support for spawning and communicating with subprocesses, or any kind of high-level wrappers for MPI or other high-level inter-process coordination protocols.
This is definitely something we want to get to eventually, and if you want to talk in more detail about what would need to be implemented, then you can hop in our chat, or this issue has an overview of what's needed for core subprocess support. But if your goal is to have something working within a few months for your internship, honestly you might want to consider more mature HPC tools like dask.
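To make the trio_asyncio suggestion above concrete, here is a minimal sketch (not code from either answer; it assumes trio_asyncio's open_loop and aio_as_trio helpers, whose names have changed across releases) that drives asyncio's subprocess support from Trio:

```python
import sys
import asyncio
import trio
import trio_asyncio

async def main():
    # Open an asyncio event loop inside the Trio run, then call
    # asyncio-flavoured coroutines through the aio_as_trio adapter.
    async with trio_asyncio.open_loop():
        proc = await trio_asyncio.aio_as_trio(
            asyncio.create_subprocess_exec(
                sys.executable, "-c", "print('hello from the worker process')",
                stdout=asyncio.subprocess.PIPE,
            )
        )
        out, _ = await trio_asyncio.aio_as_trio(proc.communicate())
        print(out.decode().strip())

trio.run(main)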
I post a very naive example of code using multiprocessing and trio (in the main program and in the server). It seems to work.
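A sketch in that spirit (not the original snippet; it assumes a multiprocessing.Pipe for transport and trio.run_sync_in_worker_thread, which newer trio releases expose as trio.to_thread.run_sync):

```python
import multiprocessing as mp
import numpy as np
import trio

def server(conn):
    """Runs in a separate process: receives arrays, does the heavy work."""
    while True:
        arr = conn.recv()
        if arr is None:          # sentinel: shut down
            break
        conn.send(arr.sum())     # stand-in for the CPU/GPU-bound computation

async def main():
    parent_conn, child_conn = mp.Pipe()
    proc = mp.Process(target=server, args=(child_conn,))
    proc.start()

    arr = np.ones((1000, 1000))
    # Push the blocking Pipe calls into a thread so the Trio loop stays responsive.
    await trio.run_sync_in_worker_thread(parent_conn.send, arr)
    result = await trio.run_sync_in_worker_thread(parent_conn.recv)
    print("result from the server process:", result)

    parent_conn.send(None)
    proc.join()

if __name__ == "__main__":
    trio.run(main)
```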
A simple example with mpi4py... It may be a bad workaround from the trio point of view, but it seems to work.
Communications are done with trio.run_sync_in_worker_thread, so (as written by Nathaniel J. Smith) (1) there is no cancellation (and no control-C support) and (2) they use more memory than trio tasks (but one Python thread does not use that much memory).

But for communications involving large numpy arrays, I would go with buffer-based MPI calls, as sketched below, since communication of buffer-like objects is going to be very efficient with mpi4py.