Addressing vector elements in C / OpenCL

Posted 2019-06-14 10:32

Question:

I'm writing an OpenCL kernel in PyOpenCL, where I want to address vector elements.

In plain C, the result I want to have is:

int i = 0;
float *vec = (float*)malloc(sizeof(float)*4);
for (i=0;i<4;i++)
{
    vec[i]=2*i;
}

In OpenCL, the elements of a vector are accessed with a "pythonic" dot syntax:

float4 vec = (float4)(0);
for (i=0;i<4;i++)
{
    vec.si = 2*i;
/*obviously doesn't work*/
}

So vec[2] becomes vec.s2 in OpenCL, which means the element can no longer be selected with a variable in a straightforward way. How can I still access a vector element using a variable?

Answer 1:

OpenCL supports C for both the host-side components and the device-side components (kernels), so you can write a kernel that is almost exactly the same as your first example, using a float array. The kernel might look like this:

__kernel void vectorAddition(__global float* vec) {
  // Get the global thread id in x dimension(eliminates loop)
  size_t index = get_global_id(0);

  vec[index] = 2.0f * index;
}

Then you can specify the number of threads (work-items, in OpenCL terms) so that the kernel body runs once for each element of the array: launch as many threads as the array has elements.
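To make the one-thread-per-element idea concrete, here is a minimal plain-C sketch that emulates it: the kernel body is called once for each index, as if each call were one thread. The names kernel_body and run_range are hypothetical stand-ins for illustration and are not part of the OpenCL API; a real host program would enqueue the kernel with a global size equal to the array length instead of looping.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel body: one call = one work-item.
   `index` plays the role of get_global_id(0). */
static void kernel_body(float *vec, size_t index)
{
    vec[index] = 2.0f * (float)index;
}

/* Hypothetical emulation of a 1-D launch with `global_size` work-items.
   A real OpenCL runtime may run these in parallel; here they run in sequence. */
static void run_range(float *vec, size_t global_size)
{
    for (size_t index = 0; index < global_size; index++) {
        kernel_body(vec, index);
    }
}
```

Calling run_range(vec, 4) on a four-element array fills it the same way as the loop in the question's plain C example.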

OpenCL does allow access using the dot notation, but this is to access the elements of vector data types. Vector data types can provide improved performance because the same operation can be done to all the elements in the vector data type at the same time.

For example, float4 is a vector data type which stores four 32-bit floats next to each other to make a single 128-bit value. You can then perform an operation on all 4 of the floats at one time.

For example:

float4 v = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
float4 mult_result = v * 2;

On hardware with suitable vector instructions, this can take a single instruction to do the four multiplications simultaneously. mult_result then has the value {2.0f, 4.0f, 6.0f, 8.0f}.

The dot notation can then be used to access the components of the float4 variables, for example:

float a = v.x;           // a = 1.0f
float b = mult_result.y; // b = 4.0f
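For the original question (choosing the component with a run-time variable), one possible workaround is to keep the four components in an array-like layout and index that. The plain-C sketch below only imitates a float4; the type float4_emul and the helpers scale4 and set_component are hypothetical names introduced for illustration, not OpenCL built-ins, and whether an equivalent trick is valid inside a real kernel should be checked against the OpenCL specification for your target.

```c
#include <stddef.h>

/* Hypothetical plain-C stand-in for OpenCL's float4:
   s[0]..s[3] play the role of .s0 to .s3 (also .x, .y, .z, .w). */
typedef struct {
    float s[4];
} float4_emul;

/* Component-wise multiply by a scalar, like `v * 2` on a float4. */
static float4_emul scale4(float4_emul v, float factor)
{
    float4_emul out;
    for (size_t i = 0; i < 4; i++) {
        out.s[i] = v.s[i] * factor;
    }
    return out;
}

/* Write one component chosen at run time; out-of-range indices are ignored. */
static void set_component(float4_emul *v, size_t i, float value)
{
    if (i < 4) {
        v->s[i] = value;
    }
}
```

With this layout, the loop from the question can be written as for (i = 0; i < 4; i++) set_component(&vec, i, 2.0f * i);.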

Here is a summary of the vector data types: Vector Data Type Overview.