CUDA: Scatter communication pattern

Posted 2020-07-13 06:49

I am learning CUDA from Udacity's course on parallel programming. In a quiz, they give a problem of sorting by a pre-ranked variable (players' heights). Since there is a one-to-one correspondence between the input and output arrays, shouldn't it be a Map communication pattern instead of a Scatter? [quiz diagram not reproduced]

3 Answers
爷的心禁止访问
#2 · 2020-07-13 07:06

CUDA has no canonical definition of these terms, as far as I know. Therefore my answer is merely a suggestion of how they might be or have been interpreted.

"Since there is a one-to-one correspondence between the input and output arrays"

This statement doesn't appear to be supported by the diagram, which shows gaps in the output array: locations that have no corresponding input point associated with them.

If a smaller set of values is distributed into a larger array (leaving gaps in the output array where no input value corresponds to the location), then a scatter might be used to describe that operation. Both scatters and maps use an index mapping that describes where the input values go, but the instructor may have defined scatter and map so as to differentiate between these two cases, with plausible definitions such as the following:

Scatter: one-to-one relationship from input to output (i.e. a unidirectional relationship). Every input location has a corresponding output location, but not every output location has a corresponding input location.

Map: one-to-one relationship between input and output (i.e. a bidirectional relationship). Every input location has a corresponding output location, and every output location has a corresponding input location.

Gather: one-to-one relationship from output to input (i.e. a unidirectional relationship). Every output location has a corresponding input location, but not every input location has a corresponding output location.

家丑人穷心不美
#3 · 2020-07-13 07:22

The definition of each communication pattern (map, scatter, gather, etc.) varies slightly from one language/environment/context to another, but since I have followed that same Udacity course, I'll try to explain the terms as I understand them in the context of the course:

The Map operation calculates each output element as a function of its corresponding input element, i.e.:

output[tid] = foo(input[tid]);

The Gather pattern calculates each output element as a function of one or more (usually more) input elements, not necessarily the corresponding one (typically these are elements from a neighborhood). For example:

output[tid] = (input[tid-1] + input[tid+1]) / 2;

Lastly, the Scatter operation has each input element contribute to one or more (again, usually more) output elements. For instance,

atomicAdd( &(output[tid-1]), input[tid]);
atomicAdd( &(output[tid]),   input[tid]);
atomicAdd( &(output[tid+1]), input[tid]);

The example given in the question is clearly not a Map, because each output is calculated from an input at a different location.

At first glance it is hard to see how the same example can be a scatter, because each input element only causes one write to the output. But it is indeed a scatter, because each input causes a write to an output whose location is determined by the input.

In other words, each CUDA thread processes the input element at the location associated with its tid (thread ID), and calculates where to write the result. More usually a scatter would write to several places instead of only one, so this is a particular case that might as well be named differently.

欢心
#4 · 2020-07-13 07:30

Each player has 3 properties (name, height, rank). So I think Scatter is correct, because we should consider all three of these to produce the output.

If each player had only one property, like rank, then I think Map would be correct.

reference: Parallel Communication Patterns Recap in this lecture

reference: map/reduce/gather/scatter with image
