Neutral element for min() and max() in OpenCL redu

2020-06-25 05:18发布

问题:

I'm doing a reduction (finding the minimum and maximum) of a float[] array on a GPU through OpenCL.

I'm loading the some elements from global memory into local memory for each workgroup. When the global size isn't a multiple of the workgroup size, I pad the global size, such that it becomes a multiple of the global size. Work-items past the end of the array put the neutral element of the reduction into local memory.

But what should that neutral element be for max() -- the maximum function? The OpenCL documentation gives MAXFLOAT, HUGE_VALF and INFINITY as very large positive (or unsigned) float values. Does it makes sense to have the neutral element to be -INFINITY for example?

Right now I'm using HUGE_VALF as the neutral element for min(), but the docs also say that HUGE_VALF is used as an error value, so maybe that's a bad idea.

Reduction kernel (Code):

#define NEUTRAL_ELEMENT HUGE_VALF
#define REDUCTION_OP min

__kernel void reduce(__global float* weights,
                     __local float* weights_cached
                    )
{
  unsigned int id = get_global_id(0);

  // Load data
  if (id < {{ point_count }}) {
    weights_cached[get_local_id(0)] = weights[id];
  } else {
    weights_cached[get_local_id(0)] = NEUTRAL_ELEMENT;
  }

  barrier(CLK_LOCAL_MEM_FENCE);

  // Reduce
  for(unsigned int stride = get_local_size(0) / 2; stride >= 1; stride /= 2) {
    if (get_local_id(0) < stride) {
      weights_cached[get_local_id(0)] = REDUCTION_OP(weights_cached[get_local_id(0)], weights_cached[get_local_id(0) + stride]);
    barrier(CLK_LOCAL_MEM_FENCE);
  }

  // Save
  weights[get_group_id(0)] = weights_cached[0];
}

Edit: I actually ended up using fmin() and fmax() together with NAN as the neutral element -- this is basically guaranteed to work according to the OpenCL documentation as the numerical value will always be returned (NAN is only returned if two NAN values are given).

回答1:

Quoting the OpenCL standard:

HUGE_VALF evaluates to +infinity.

So there's no real difference (except for implied intent) between using HUGE_VALF and INFINITY; either will work correctly for a min reduction. In terms of clarity, I have a slight preference for INFINITY, as HUGE_VALF is conceptually intended for edge-case returns, which this isn't.

Similarly, use -INFINITY for a max reduction.

MAX_FLOAT will not behave correctly as a neutral element if your array contains infinities.