OpenCL struct values correct on CPU but not on GPU

Posted 2019-07-21 13:11

I have a struct in a file which is included by both the host code and the kernel:

typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int     radius;
} WorklistStruct;

I build this struct in my C++ host code and pass it to the OpenCL kernel via a buffer.

If I choose a CPU device for the computation, this printf call gives the following result:

 printf ( "item:[%f,%f,%f][%f,%f,%f]%d,%d\n", item.x, item.y, item.z, item.dir_x, item.dir_y,
                 item.dir_z , item.radius ,sizeof(float));

Host:

item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4

Device (CPU):

item:[20.169043,7.000000,34.933712][0.000000,-3.000000,0.000000]1,4

And if I choose a GPU device (AMD) for the computation, weird things happen:

Host:

item:[58.406261,57.786015,58.137501][2.000000,2.000000,2.000000]2,4

Device (GPU):

item:[58.406261,2.000000,0.000000][0.000000,0.000000,0.000000]0,0

Notably, even the sizeof(float) value printed on the GPU is garbage (0 instead of 4).

I assume there is a problem with the layout of floats on different devices.

Note: the struct is an element of an array of structs of this type, and every struct in this array is garbage on the GPU.

Does anyone have an idea why this is the case and how I can predict it?
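A layout mismatch like the one suspected above can be ruled in or out on the host side before touching the kernel. The following is a minimal sketch, assuming 4-byte float and int on the host (true on common platforms, but an assumption here); `print_host_layout` is a hypothetical helper name, not something from the original code. Since `sizeof(WorklistStruct)` is also the stride of the array of structs, the same numbers apply to every element of the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Same definition as in the shared header from the question.
typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} WorklistStruct;

// Assumption: float and int are both 4 bytes on the host, so the struct
// is expected to be 7 * 4 = 28 bytes with no padding.
static_assert(sizeof(WorklistStruct) == 28, "unexpected struct size");
static_assert(offsetof(WorklistStruct, radius) == 24, "unexpected radius offset");

// Hypothetical helper: prints the host-side layout so it can be compared
// with the values the kernel reports for the same struct.
inline void print_host_layout() {
    std::printf("sizeof=%zu dir_x@%zu radius@%zu\n",
                sizeof(WorklistStruct),
                offsetof(WorklistStruct, dir_x),
                offsetof(WorklistStruct, radius));
}
```

If the kernel writes its own `sizeof(WorklistStruct)` into an output buffer and that number matches the host value, the struct layout is probably not the culprit.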

EDIT: I added a %d at the end and passed a 1 for it; the result is: 1065353216

EDIT: here are the two structs which I'm using:
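The number 1065353216 is not arbitrary: it equals 0x3F800000, which is the IEEE 754 single-precision bit pattern of 1.0f. That suggests (as an interpretation, not something confirmed in this thread) that the value passed through a float representation somewhere between the argument list and the %d conversion, which points at printf argument handling rather than at the buffer contents. A small host-side sketch to verify the arithmetic, with `float_bits` as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the raw bit pattern of a 32-bit float.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch expects a 32-bit float");
    std::uint32_t bits = 0;
    // memcpy is the well-defined way to reinterpret the bytes in C++.
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Calling `float_bits(1.0f)` yields 1065353216 on a host with IEEE 754 floats.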

typedef struct {
      float x, y, z,//base coordinates 
      dir_x, dir_y, dir_z;//direction
      int     radius;//radius
} WorklistStruct;

typedef struct {
    float base_x, base_y, base_z; //base point 
    float radius;//radius 
    float dir_x, dir_y, dir_z; //initial direction
} ReturnStruct;

I tested some other things, and it looks like a problem with printf. The values seem to be right: I copied the fields into the return struct, read them back, and the values were correct.

I don't want to post all of the related code; that would be a few hundred lines. If no one has an idea, I will condense it a bit.
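The round trip described above can be sketched as a plain copy from one struct to the other. This is a host-side C++ illustration only; the field mapping and the helper name `to_return` are assumptions, since the original kernel code was not posted.

```cpp
#include <cassert>

typedef struct {
    float x, y, z,
          dir_x, dir_y, dir_z;
    int   radius;
} WorklistStruct;

typedef struct {
    float base_x, base_y, base_z;
    float radius;
    float dir_x, dir_y, dir_z;
} ReturnStruct;

// Hypothetical helper mirroring the copy a kernel could perform:
// every input field is written to the output struct so the host can
// read it back and compare, without relying on device-side printf.
inline ReturnStruct to_return(const WorklistStruct& w) {
    ReturnStruct r;
    r.base_x = w.x;
    r.base_y = w.y;
    r.base_z = w.z;
    r.radius = static_cast<float>(w.radius);  // int to float conversion
    r.dir_x  = w.dir_x;
    r.dir_y  = w.dir_y;
    r.dir_z  = w.dir_z;
    return r;
}
```

Comparing the read-back values with the host's originals checks the data path independently of how printf formats its arguments.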

Ah, and for printing I'm using #pragma OPENCL EXTENSION cl_amd_printf : enable.

Edit: It really does look like a problem with printf. I simply don't use it anymore.

2 answers
Fickle 薄情
#2 · 2019-07-21 13:36

There is a simple method to check what happens:

1 - Create host-side data & initialize it:

int num_points = 128;

std::vector<WorklistStruct> works(num_points);
std::vector<ReturnStruct> returns(num_points);

for(WorklistStruct &work : works){
    work = InitializeItSomehow();
    std::cout << work.x << " " << work.y << " " << work.z << std::endl;
    std::cout << work.radius << std::endl;
}

// Same stuff with returns
...

2 - Create device-side buffers using the CL_MEM_COPY_HOST_PTR flag, map them, and check data consistency:

cl::Buffer dev_works(..., CL_MEM_COPY_HOST_PTR, (void*)&works[0]);
cl::Buffer dev_rets(..., CL_MEM_COPY_HOST_PTR, (void*)&returns[0]);

// Then map it to check data (pseudocode; in the C++ bindings, mapping
// is done through cl::CommandQueue::enqueueMapBuffer)
WorklistStruct *mapped_works = dev_works.Map(...);
ReturnStruct *mapped_rets = dev_rets.Map(...);

// Output values & unmap buffers
...

3 - Check data consistency on the device side as you did previously.

Also, make sure that the code (presumably a header) included by both the kernel and the host code is pure OpenCL C (the AMD compiler can sometimes "swallow" errors), and that you pass the include search directory with the "-I" option when building the OpenCL kernel (in the options string of clBuildProgram).

Edited: At every step, collect the return codes (or catch exceptions). Besides that, the "-Werror" option at the clBuildProgram stage can also be helpful.
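Collecting return codes at every step can be reduced to one small helper. The sketch below avoids the OpenCL headers so it stands alone; it assumes the status is the cl_int returned by an OpenCL call (a 32-bit signed integer, with CL_SUCCESS defined as 0). The name `check_status` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: turns a non-zero OpenCL status into an exception.
// Assumption: `status` is the cl_int result of an OpenCL call, and
// CL_SUCCESS is 0, so any other value indicates an error.
inline void check_status(std::int32_t status, const char* what) {
    if (status != 0) {
        throw std::runtime_error(std::string(what) +
                                 " failed with OpenCL error " +
                                 std::to_string(status));
    }
}
```

In real host code, each call's result would be passed through it, for example the value returned by clBuildProgram, so that a failed build cannot go unnoticed before the kernel runs.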

The star\"
#3 · 2019-07-21 13:43

It looks like I used the wrong OpenCL headers for compiling. If I run the code on the Intel platform (OpenCL 1.2), everything is fine, but on my AMD platform (OpenCL 1.1) I get weird values.

I will try other headers.
