Which MPI implementations currently have support for fault tolerance, and what is the state of their development?
This question is probably too broad to answer well here, especially since the answer will change over time.
In general, a lot of fault-tolerance work is going on in various MPI implementations, and it is in various states of support.
There are also many other MPI libraries that either implement some form of fault tolerance on top of MPI or tweak the implementation itself. These are just a couple of options.