What is the difference between a Cluster and MPP supercomputer architecture?
The TOP500 list uses a slightly different distinction between an MPP and a cluster, as explained in the Dongarra et al. paper:
Compared to a cluster, a modern MPP (such as the IBM Blue Gene) is more tightly integrated: individual nodes cannot run on their own, and they are connected by a custom network (such as a multidimensional torus). But, as in a cluster, there is no single shared memory spanning all the nodes (note: an MPP might be hierarchical, with shared memory used inside a single node (NUMA) or between a handful of nodes).
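To make the distributed-memory point concrete, here is a minimal sketch using MPI, the message-passing API commonly used on both clusters and MPPs (the `mpicc` compiler wrapper and an MPI launcher are assumed): one rank cannot read another rank's memory directly, so data must travel as an explicit message over the interconnect.

```c
/* Minimal MPI sketch: no shared memory spans the nodes, so rank 0
 * must explicitly send data to rank 1 over the interconnect.
 * Run with at least two ranks, e.g.: mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        double value = 42.0;
        /* Rank 0 cannot simply write into rank 1's memory; it must send. */
        MPI_Send(&value, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double value;
        MPI_Recv(&value, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %f\n", value);
    }

    MPI_Finalize();
    return 0;
}
```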
I'd thus be extremely careful about using the terms SIMD and MIMD in this context, as they usually describe shared-memory architectures (SMP).
Update: Dongarra et al. link
Update: an MPP can have nodes that use shared memory internally, but the memory of the whole MPP is not shared.
In a cluster, each machine is largely independent of the others in terms of memory, disk, etc. They are interconnected using some variation on normal networking. The cluster exists mostly in the mind of the programmer and in how he or she chooses to distribute the work.
In a Massively Parallel Processor, there really is only one machine with thousands of CPUs tightly interconnected. MPPs have exotic memory architectures that allow extremely high-speed exchange of intermediate results with neighboring processors.
The major variants are SIMD (Single Instruction, Multiple Data) and MIMD (Multiple Instruction, Multiple Data). In a SIMD system, every processor executes the same instruction at the same time, only on different pieces of memory. Essentially, there is only one Program Counter. In a MIMD machine, each CPU has its own PC.
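As a rough illustration of the two models (a sketch assuming a C compiler with OpenMP support, built with -fopenmp): the first loop is SIMD in spirit, one instruction stream applied across many data elements, while the second region is MIMD, each thread following its own program counter down a different code path.

```c
/* Illustrative contrast between SIMD and MIMD execution in C,
 * assuming OpenMP support (compile with -fopenmp). */
#include <omp.h>
#include <stdio.h>

#define N 8

int main(void) {
    float a[N], b[N];
    for (int i = 0; i < N; i++) b[i] = (float)i;

    /* SIMD: one instruction stream, many data lanes.
     * Every lane performs the same multiply in lockstep. */
    #pragma omp simd
    for (int i = 0; i < N; i++)
        a[i] = 2.0f * b[i];
    printf("a[3] = %f\n", a[3]);

    /* MIMD: independent instruction streams.
     * Each thread has its own PC and may take a different path. */
    #pragma omp parallel num_threads(2)
    {
        if (omp_get_thread_num() == 0)
            printf("thread 0: doing I/O\n");
        else
            printf("thread %d: doing arithmetic\n", omp_get_thread_num());
    }
    return 0;
}
```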
MPPs can be a bitch to program and are of use only on algorithms that are embarrassingly parallel (that's actually what they call it). However, if you have such a problem, then an MPP can be shockingly fast. They are also incredibly expensive.
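For a concrete taste of "embarrassingly parallel", here is a hedged sketch (again assuming MPI) of a Monte Carlo estimate of pi: every rank samples independently, and the only communication is a single reduction at the end, which is exactly the pattern such machines are good at.

```c
/* Embarrassingly parallel Monte Carlo estimate of pi with MPI:
 * ranks work independently and combine results in one reduction. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long samples = 1000000;      /* per-rank sample count */
    long hits = 0;
    unsigned int seed = 1234u + rank;  /* independent stream per rank */

    for (long i = 0; i < samples; i++) {
        double x = rand_r(&seed) / (double)RAND_MAX;
        double y = rand_r(&seed) / (double)RAND_MAX;
        if (x * x + y * y <= 1.0) hits++;
    }

    long total_hits = 0;
    MPI_Reduce(&hits, &total_hits, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi ~= %f\n", 4.0 * total_hits / (double)(samples * size));

    MPI_Finalize();
    return 0;
}
```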
I've searched through a lot of HPC literature and couldn't find a concrete definition of MPP. There is quite a consensus that a cluster consists of multiple interconnected regular personal computers or workstations, usually coupled with standard technologies (like Ethernet or open-source operating systems). The term MPP is usually applied to more proprietary approaches to building distributed-memory computers, usually relying on proprietary technologies.
For example: Tianhe-2 is considered a cluster because it uses x86-64 nodes and a regular operating system (Kylin Linux). Sunway TaihuLight is considered an MPP because its nodes have their own particular architecture, SW26010, and run their own operating system, called Sunway Raise OS.
The most concrete explanation of this matter that I found was in the Sourcebook of Parallel Computing (Dongarra et al.).
A cluster is a bunch of machines, usually with an Ethernet interconnect (read: a network), each running its own separate copy of an OS, which together happen to serve a single purpose.
An MPP supercomputer usually implies a proprietary, very fast interconnect (e.g. SGI NUMALink) that supports either Distributed Shared Memory (processes running on different MPP nodes use shared memory over the fast interconnect to share data as if they were running on a single computer) or even a Single System Image (a single instance of an operating system, mostly Linux, running on all the nodes at the same time as if on a single machine; e.g. "ps aux" on any node will show you all the processes running on the MPP).
As you can see, the definition is quite fluid; it's more a question of scale than of clear-cut differences.