Is MPI a paradigm or a set of libraries?

Posted 2019-08-02 17:04

Question:

I am currently taking an introductory course on parallel computing, where the class instructor describes MPI as:

  1. Yet another multi-processing paradigm
  2. A set of libraries
  3. API

I don't understand what exactly MPI is. If it is a set of libraries or an API, then why is it still called a paradigm? Which of the above three terms most accurately explains what MPI is?

Answer 1:

MPI is a way of working with data (typically arrays of plain types like int or double). It is an API (as in interface, not library) which describes functions you can use to transmit and receive data in specific "patterns" among a set of processes (potentially running on separate machines).

It also describes a way to launch programs which are connected to each other in some way which supports the above operations, and a way for each launched process to know how many peers it has.
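
To make the two paragraphs above concrete, here is a toy sketch of the model in plain Python. It is not MPI and does not use any MPI library: threads stand in for the launched processes and one queue per rank stands in for the network, so it runs without an MPI installation. The names `run_spmd`, `sum_to_root`, `send` and `recv` are made up for this illustration. In real MPI the corresponding pieces would be `MPI_Comm_rank` and `MPI_Comm_size` for the rank and peer count, and `MPI_Send` / `MPI_Recv` (or the collective `MPI_Reduce`) for the data movement.

```python
import queue
import threading


def run_spmd(program, size):
    """Launch `size` copies of `program`, one per rank, and collect their results.

    Toy stand-in for an MPI launcher: threads play the role of processes.
    """
    inboxes = [queue.Queue() for _ in range(size)]  # one mailbox per rank
    results = [None] * size

    def send(dest, value):
        # Deliver a message to the mailbox of rank `dest`.
        inboxes[dest].put(value)

    def launch(rank):
        recv = inboxes[rank].get  # blocking receive from this rank's own mailbox
        results[rank] = program(rank, size, send, recv)

    threads = [threading.Thread(target=launch, args=(r,)) for r in range(size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def sum_to_root(rank, size, send, recv):
    """Every rank runs this same function; behaviour differs only by rank."""
    local = (rank + 1) * 10  # each rank's private piece of data
    if rank == 0:
        total = local
        for _ in range(size - 1):  # one message expected from every peer
            total += recv()
        return total
    send(0, local)  # non-root ranks just ship their value to rank 0
    return None
```

Calling `run_spmd(sum_to_root, 4)` returns `[100, None, None, None]`: ranks 1 to 3 send 20, 30 and 40 to rank 0, which adds its own 10. Note that no rank ever reads another rank's variables directly; all sharing happens through explicit messages.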

There are multiple competing implementations of MPI, such as Open MPI and MPICH. If you write your program against the MPI specification, you can use it with any implementation of MPI available on your compute platform. But all the processes in one job must use the same implementation of MPI, because it is an API only, and does not promise interoperability between implementations at runtime.

The reason why MPI might be called a paradigm is that it requires thinking about distributed computing in a specific way which is not familiar to most programmers. Once you have used it for even a single "real" program, you will see that it demands a way of thinking about data structures and algorithms which differs from programming with, say, sockets or message queues.



Answer 2:

At its purest, MPI is a standard that outlines a model (or indeed "paradigm", if you will) for parallel computation via message passing. Said standard also defines an API for certain languages (C and Fortran, for historical reasons). Third parties are then free to write libraries that implement said standard in whatever language they like, using whatever implementation details they like (as long as it doesn't conflict with the standard).

It's a little similar to the distinction between, say, the C++ language and C++ compilers. The language is just a set of rules that specify what the program's behaviour should be given certain inputs. A compiler is any program that can take input and produce all the results that the language specification requires. Likewise, an MPI implementation is any library that you can link to a project and use to produce all the results that the MPI standard prescribes.
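
The same split between a standard and its implementations can be sketched in a few lines of Python. Everything here is hypothetical and only loosely analogous to MPI: `MessagePassing` plays the role of the standard (it fixes names and semantics), while `ListBacked` and `DequeBacked` play the role of two competing libraries with different internals.

```python
from abc import ABC, abstractmethod
from collections import deque


class MessagePassing(ABC):
    """Stand-in for 'the standard': it fixes names and semantics, not code."""

    @abstractmethod
    def send(self, dest, value):
        """Queue `value` for rank `dest`."""

    @abstractmethod
    def recv(self, rank):
        """Return the oldest value queued for `rank` (first in, first out)."""


class ListBacked(MessagePassing):
    """One 'library': keeps each mailbox in a plain list."""

    def __init__(self):
        self.boxes = {}

    def send(self, dest, value):
        self.boxes.setdefault(dest, []).append(value)

    def recv(self, rank):
        return self.boxes[rank].pop(0)


class DequeBacked(MessagePassing):
    """A competing 'library': same behaviour, different internals."""

    def __init__(self):
        self.boxes = {}

    def send(self, dest, value):
        self.boxes.setdefault(dest, deque()).append(value)

    def recv(self, rank):
        return self.boxes[rank].popleft()


def user_program(comm):
    """Written only against the interface, so either library will do."""
    comm.send(1, "first")
    comm.send(1, "second")
    return [comm.recv(1), comm.recv(1)]
```

`user_program(ListBacked())` and `user_program(DequeBacked())` both return `["first", "second"]`, because the program depends only on what the interface promises and not on how a given library delivers it.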



Answer 3:

API

This is the best fit for MPI. Most importantly, MPI is the Message Passing Interface standard. The bulk of MPI is the definition of an (application programming) interface and its semantics. However, MPI goes beyond a pure API; for example, it also specifies the startup of MPI applications.

Multi-processing paradigm

There are a couple of paradigms associated with MPI: message passing as the form of communication, and SPMD (single program, multiple data) as the most commonly used execution scheme, although MPI also supports MPMD (multiple program, multiple data).

A set of libraries

MPI implementations, such as Open MPI and MPICH, are libraries. However, they go beyond simple libraries. For example, they provide compiler wrappers to simplify compilation and linking, as well as a complex startup infrastructure.



Tags: mpi