My source code is from Heterogeneous Computing with OpenCL, Chapter 4, Basic OpenCL Examples > Image Rotation. The book leaves out several critical details.
My major problem is that I don't know how to initialize the array that I supply to their kernel (they don't tell you how). What I have is:
int W = inImage.width();
int H = inImage.height();
float *myImage = new float[W*H];
for(int row = 0; row < H; row++)
    for(int col = 0; col < W; col++)
        myImage[row*W+col] = col;
which I supply to this kernel:
__kernel void img_rotate(__global float* dest_data, __global float* src_data, int W, int H, float sinTheta, float cosTheta)
{
    const int ix = get_global_id(0);
    const int iy = get_global_id(1);
    float x0 = W/2.0f;
    float y0 = H/2.0f;
    float xoff = ix-x0;
    float yoff = iy-y0;
    int xpos = (int)(xoff*cosTheta + yoff*sinTheta + x0);
    int ypos = (int)(yoff*cosTheta - xoff*sinTheta + y0);
    if(((int)xpos>=0) && ((int)xpos < W) && ((int)ypos>=0) && ((int)ypos<H))
    {
        dest_data[iy*W+ix] = src_data[ypos*W+xpos];
        //dest_data[iy*W+ix] = src_data[iy*W+ix];
    }
}
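One way to check what the kernel should produce is to run the same arithmetic on the CPU and compare the results. The following is only a sketch under assumptions: `rotate_reference` is a hypothetical name, not something from the book, and destination pixels whose source falls outside the image are set to 0 here, whereas the kernel leaves them unwritten.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference mirroring the img_rotate kernel above.
// Each destination pixel (ix, iy) samples the source pixel obtained by
// rotating its offset from the image centre by the given sin/cos terms.
std::vector<float> rotate_reference(const std::vector<float>& src,
                                    int W, int H,
                                    float sinTheta, float cosTheta) {
    // Assumption: out-of-range pixels default to 0 (the kernel skips them).
    std::vector<float> dest(static_cast<std::size_t>(W) * H, 0.0f);
    const float x0 = W / 2.0f;
    const float y0 = H / 2.0f;
    for (int iy = 0; iy < H; ++iy) {
        for (int ix = 0; ix < W; ++ix) {
            const float xoff = ix - x0;
            const float yoff = iy - y0;
            const int xpos = static_cast<int>(xoff * cosTheta + yoff * sinTheta + x0);
            const int ypos = static_cast<int>(yoff * cosTheta - xoff * sinTheta + y0);
            if (xpos >= 0 && xpos < W && ypos >= 0 && ypos < H) {
                dest[iy * W + ix] = src[ypos * W + xpos];
            }
        }
    }
    return dest;
}
```

With `sinTheta = 0` and `cosTheta = 1` (no rotation) the output should match the input, which makes a quick sanity check before trying other angles.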
I'm having trouble finding the right value for theta too. An integer would be an appropriate value for theta, right?
float theta = 45; // 45 degrees, right?
float cos_theta = cos(theta);
float sin_theta = sin(theta);
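A note on the angle: the C and C++ `cos` and `sin` functions take radians, so passing 45 directly computes the cosine and sine of 45 radians, not 45 degrees. A minimal sketch of the conversion, where `deg_to_rad` is a hypothetical helper name chosen for illustration:

```cpp
#include <cmath>

// Hypothetical helper: convert an angle in degrees to radians,
// since std::cos and std::sin expect radians.
inline float deg_to_rad(float degrees) {
    const float kPi = 3.14159265358979323846f;
    return degrees * kPi / 180.0f;
}

// Compute the two rotation terms the kernel expects for an angle in degrees.
inline void rotation_terms(float degrees, float& cos_theta, float& sin_theta) {
    const float theta = deg_to_rad(degrees);
    cos_theta = std::cos(theta);
    sin_theta = std::sin(theta);
}
```

For 45 degrees both terms come out close to 0.7071. The angle itself does not need to be an integer; any float value works.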
When writing my OpenCL code, I always treat each kernel as reading a 3D set of data, regardless of whether the data is 1D, 2D, or 3D:
When calling clEnqueueNDRangeKernel(...), just set the dimension to be:
This lets all of my kernels work easily in all dimensions.
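The snippets for this approach are not shown above, so the following is only a guess at the idea: always launch with three dimensions and set the size of any unused dimension to 1. `make_global_size` is a hypothetical helper, and the `queue` and `kernel` names in the usage comment are assumed to exist in the surrounding host code.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: pad a 1D, 2D, or 3D problem size out to three
// dimensions by giving every unused dimension a size of 1.
inline std::array<std::size_t, 3> make_global_size(std::size_t x,
                                                   std::size_t y = 1,
                                                   std::size_t z = 1) {
    return {x, y, z};
}

// Usage sketch (needs an OpenCL context, so it is left as a comment):
//   auto global = make_global_size(W, H);   // {W, H, 1} for a 2D image
//   clEnqueueNDRangeKernel(queue, kernel, 3, nullptr,
//                          global.data(), nullptr, 0, nullptr, nullptr);
```

With a 2D image, get_global_id(2) is then always 0 inside the kernel, so a kernel written with three indices still works unchanged.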