In MPI, is it possible to add new nodes after an application has started? For example, I have two computers already running a parallel MPI application, and I want to start another instance of this application on a third computer and add it to the existing communicator. All computers are on a local network.
Answer 1:
No, it's not currently possible to add new nodes to the MPI_COMM_WORLD of a running MPI application. MPI is designed so that the size of MPI_COMM_WORLD is fixed when the program starts.
Work is being done (on MPI-3, for example) on handling nodes that go down. Maybe if you can add faulty nodes back, then you can add new ones, but that's the closest thing I can think of. See this answer for more info on approaches to MPI fault tolerance.
Answer 2:
It is possible for an MPI-2 program to spawn new ranks. The function is MPI_Comm_spawn, and it starts the children in a new MPI communicator; that is, the new ranks have a different MPI_COMM_WORLD from the previously running ranks. It should still be possible to make a new communicator that contains all of the currently running ranks, for example by passing the intercommunicator that MPI_Comm_spawn returns to MPI_Intercomm_merge.