Computing usage of independent cores and binding a process to a core

Posted 2019-09-12 04:43

I am working with MPI, and I have a certain hierarchy of operations. For a particular value of a parameter _param, I launch 10 trials, each running as a separate process on a distinct core. For n values of _param, the code runs in the following hierarchy:

driver_file -> launches one process, which checks whether more than 10 processes are available. If so, it launches an instance of coupling_file with a specific _param value passed as an argument

coupling_file -> does some elementary computation, then launches 10 processes using MPI_Comm_spawn(), each corresponding to a trial_file, passing _trial as an argument

trial_file -> computes work, returns values to the coupling_file
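For concreteness, here is a minimal sketch (not my actual code) of the coupling_file step: it spawns the 10 trial processes with MPI_Comm_spawn() and collects one result from each over the resulting intercommunicator. The executable name "trial_file" and the use of a plain MPI_Recv for the results are assumptions.

/* Hypothetical sketch of the coupling_file step. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* ... elementary computation on _param would happen here ... */

    MPI_Comm trial_comm;                       /* intercommunicator to the spawned trials */
    MPI_Comm_spawn("trial_file", MPI_ARGV_NULL, 10, MPI_INFO_NULL,
                   0, MPI_COMM_SELF, &trial_comm, MPI_ERRCODES_IGNORE);

    /* Each trial_file is assumed to MPI_Send one double back to its parent. */
    for (int i = 0; i < 10; i++) {
        double result;
        MPI_Recv(&result, 1, MPI_DOUBLE, i, 0, trial_comm, MPI_STATUS_IGNORE);
        printf("trial %d returned %f\n", i, result);
    }

    MPI_Finalize();
    return 0;
}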

I am facing two dilemmas, namely:

  1. How do I evaluate the required condition for the cores in driver_file? That is, how do I find out how many processes have terminated, so that I can correctly schedule processes on idle cores? I thought about adding a blocking MPI_Recv() and using it to pass a variable that would tell me when a certain process has finished (a rough sketch of this idea follows after this list), but I'm not sure if this is the best solution.

  2. How do I ensure that processes are assigned to different cores? I had thought about using something like mpiexec --bind-to-core --bycore -n 1 coupling_file to launch one coupling_file, followed by something like mpiexec --bind-to-core --bycore -n 10 trial_file launched by the coupling_file. However, if I am binding processes to cores, I don't want two or more processes on the same core. That is, I don't want _trial_1 of coupling_1 to run on core x, and then, when I launch another instance coupling_2, have its _trial_2 bound to core x as well.
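For point 1, here is the rough sketch mentioned above, with one caveat: a single blocking MPI_Recv() only covers one intercommunicator, so if driver_file keeps a separate intercommunicator per spawned coupling run, it would instead post one non-blocking receive per run and wait on any of them. All identifiers below (coupling_comm, active_runs, MAX_RUNS, ...) are placeholders, not real code.

/* In coupling_file, just before it exits: */
MPI_Comm parent;
MPI_Comm_get_parent(&parent);              /* intercommunicator back to driver_file */
int done = 1;
MPI_Send(&done, 1, MPI_INT, 0, 0, parent);

/* In driver_file: one pending receive per active coupling run ... */
int flags[MAX_RUNS];
MPI_Request reqs[MAX_RUNS];
for (int i = 0; i < active_runs; i++)
    MPI_Irecv(&flags[i], 1, MPI_INT, 0, 0, coupling_comm[i], &reqs[i]);

/* ... then block until any of them reports completion. */
int finished;
MPI_Waitany(active_runs, reqs, &finished, MPI_STATUS_IGNORE);
/* The cores used by coupling run 'finished' are idle again and can be reused. */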

Any input would be appreciated. Thanks!

1 Answer

家丑人穷心不美 · 2019-09-12 05:15

If it is an option for you, I'd drop the process spawning altogether and instead start all processes at once. You can then easily partition them into chunks, each working on a single task. For example, a translation of your concept could be:

  • Use one master (rank 0).
  • Partition the remaining processes into groups of 10, creating a new communicator for each group if needed; each group has one leader process known to the master.
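A small sketch of that partitioning with MPI_Comm_split, assuming the job is started with one master rank plus a multiple of 10 worker ranks; the identifiers are illustrative:

#include <mpi.h>
#include <stdio.h>

#define GROUP_SIZE 10

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int world_rank, world_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    /* Rank 0 is the master; ranks 1..N fall into groups of GROUP_SIZE. */
    int color = (world_rank == 0) ? MPI_UNDEFINED
                                  : (world_rank - 1) / GROUP_SIZE;
    MPI_Comm group_comm;
    MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &group_comm);

    if (world_rank != 0) {
        int group_rank;
        MPI_Comm_rank(group_comm, &group_rank);
        /* The leader of group g is world rank 1 + g*GROUP_SIZE, which the
         * master can compute directly when it sends out _param values. */
        if (group_rank == 0)
            printf("world rank %d leads group %d\n", world_rank, color);
        MPI_Comm_free(&group_comm);
    }

    MPI_Finalize();
    return 0;
}

With 21 ranks, for instance, this yields two groups whose leaders are world ranks 1 and 11.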

In your code, you can then do something like:

if master:
    send an initial _param to each group leader (with a non-blocking send)
    loop over your remaining _params:
        use MPI_Waitany or MPI_Waitsome to find a group that is ready
        send the next _param to that group's leader
else:
    if group leader:
        loop endlessly:
            MPI_Recv a _param from the master
            run coupling_file
            MPI_Bcast the data to the group
            process trial_file
    else:
        loop endlessly:
            MPI_Bcast (get the data from the group leader)
            process trial_file

I think following this approach would allow you to solve both of your issues. Availability of process groups is detected via MPI_Wait*, though you might want to change the logic above so that a group leader notifies the master at the end of its task; the master then only sends new data once a group is actually done, rather than while the previous trial is still running and another group might turn out to be faster. Pinning is also resolved: since you have a fixed number of processes, they can be properly pinned during the usual startup.
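To make the "notify the master at the end of the task" variant concrete, here is a sketch of the master-side loop only; leader_of(), params, n_params, n_groups, the tags and MAX_GROUPS are placeholders, and a real implementation would still need a final MPI_Waitall plus a stop message per leader once all _params have been handed out.

MPI_Request reqs[MAX_GROUPS];
double results[MAX_GROUPS];
int next = 0;

for (int g = 0; g < n_groups; g++)
    reqs[g] = MPI_REQUEST_NULL;            /* no outstanding receive yet */

/* Seed every group with one _param and post a receive for its result. */
for (int g = 0; g < n_groups && next < n_params; g++, next++) {
    MPI_Send(&params[next], 1, MPI_DOUBLE, leader_of(g), TAG_PARAM, MPI_COMM_WORLD);
    MPI_Irecv(&results[g], 1, MPI_DOUBLE, leader_of(g), TAG_DONE,
              MPI_COMM_WORLD, &reqs[g]);
}

/* Hand out the remaining _params only to groups that have reported back. */
while (next < n_params) {
    int g;
    MPI_Waitany(n_groups, reqs, &g, MPI_STATUS_IGNORE);   /* group g is idle now */
    MPI_Send(&params[next++], 1, MPI_DOUBLE, leader_of(g), TAG_PARAM, MPI_COMM_WORLD);
    MPI_Irecv(&results[g], 1, MPI_DOUBLE, leader_of(g), TAG_DONE,
              MPI_COMM_WORLD, &reqs[g]);
}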
