Using “String” in an OpenCL Kernel

Posted 2019-08-03 01:45

I have a question about OpenCL programming. The scenario is: I have a list of words of different lengths, read from a file, and I have to pass this list to an OpenCL kernel. I've tried using a struct composed of a char array that holds the word and an int that holds its length. But this doesn't work because, in the kernel, I must create a new array with the size given in the struct, and the kernel doesn't accept arrays of variable size. Is there a way to implement this (I mean creating one array per thread, each with a different size)? If not, what can I do instead? Thank you :)

EDIT: This is the example code. I hope it clarifies things.

typedef struct word{
    char word[100];
    int len;
}word_t;
__kernel void test(__global word_t *data,  __global res_t *result)
{
   size_t id=get_global_id(0);
   int size=0;
   size=data[id].len;
   char word[size];
   //working with the word

}

But clBuildProgram reports that I cannot have an array of variable size.

Tags: c opencl
1 answer
叛逆
#2 · 2019-08-03 02:25

You can't use variable-length arrays like this in OpenCL. If you can put an upper bound on your word length, the easy solution is a fixed maximum-length array:

#define MAX_WORD_LEN 100
#define WORK_GROUP_SIZE 128

typedef struct word{
    char word[MAX_WORD_LEN];
    int len;
}word_t;

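__kernel void test(__global word_t *data,  __global res_t *result)
{
   size_t id=get_global_id(0);
   size_t lid=get_local_id(0);
   int size=data[id].len;
   // one MAX_WORD_LEN slot per work-item; the work-group size must not exceed WORK_GROUP_SIZE
   __local char word[MAX_WORD_LEN * WORK_GROUP_SIZE];
   // index with the local id: __local memory is per work-group, so the global id would run past the array
   __local char* pThreadWord = &word[ MAX_WORD_LEN * lid];
   //working with the word
}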

I've put the array in local memory because if you declare a private char word[MAX_WORD_LEN] in every work-item you will almost certainly use up all your registers and spill (i.e. it will be very slow).
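On the host side, the fixed-size layout means every word is copied into a padded slot before the buffer is uploaded. The following is only a sketch in plain C of how that packing might look. pack_word and pack_words are hypothetical helpers, not part of the OpenCL API, and a real host program would declare len as cl_int from the OpenCL headers so the layout is guaranteed to match the kernel's struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD_LEN 100

/* Host-side mirror of the kernel's word_t; field order and sizes must match.
   Assumption: plain int is 32 bits here (real host code would use cl_int). */
typedef struct word {
    char word[MAX_WORD_LEN];
    int  len;
} word_t;

/* Hypothetical helper: copy one C string into a fixed-size slot.
   Returns 0 on success, -1 if the word is longer than MAX_WORD_LEN. */
static int pack_word(word_t *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    if (n > MAX_WORD_LEN)
        return -1;
    memset(dst->word, 0, MAX_WORD_LEN);  /* zero the padding bytes */
    memcpy(dst->word, src, n);           /* no terminator is stored; len carries the size */
    dst->len = (int)n;
    return 0;
}

/* Hypothetical helper: pack `count` strings, skipping any that do not fit.
   Returns the number of slots actually filled. */
static size_t pack_words(word_t *dst, const char *const *src, size_t count)
{
    size_t packed = 0;
    for (size_t i = 0; i < count; ++i) {
        if (pack_word(&dst[packed], src[i]) == 0)
            ++packed;
    }
    return packed;
}
```

The filled word_t array can then be handed to the kernel as the data buffer, with one work-item per packed word.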

If you do have to cope with very long variable-length words, then you will have to dynamically "allocate" space in local memory for each thread with an atomic operation:

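// BUFFER_SIZE is assumed to be defined elsewhere, large enough for all words in one work-group
__kernel void test(__global word_t *data,  __global res_t *result)
{
   size_t id=get_global_id(0);
   int size=data[id].len;
   // local memory "allocator"
   __local int offset;
   volatile __local int* pOffset = &offset;
   __local char wordBuffer[BUFFER_SIZE];
   // __local variables cannot have initializers, so one work-item resets the offset
   if (get_local_id(0) == 0)
      offset = 0;
   barrier(CLK_LOCAL_MEM_FENCE);
   // "allocate" memory: atomic_add returns the old offset, which becomes this thread's start
   int myOffset = atomic_add( pOffset, size );
   //working with the word
   __local char* pThreadWord = &wordBuffer[ myOffset ];
}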
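The offset arithmetic of this "allocator" can be checked on the CPU. The sketch below is plain C11, not OpenCL: atomic_fetch_add stands in for atomic_add (both return the value before the addition), and bump_t, bump_reset, bump_alloc and the BUFFER_SIZE value are names and numbers assumed for illustration. It also adds a capacity check that the kernel above does not have.

```c
#include <assert.h>
#include <stdatomic.h>

#define BUFFER_SIZE 256  /* assumed capacity, for illustration only */

/* CPU stand-in for the work-group's __local offset and wordBuffer. */
typedef struct {
    atomic_int offset;
    char       buffer[BUFFER_SIZE];
} bump_t;

/* Mirrors the "one work-item sets offset to 0" step. */
static void bump_reset(bump_t *b)
{
    atomic_init(&b->offset, 0);
}

/* Reserve `size` bytes. Returns the start offset of the reserved range,
   or -1 if the range would run past the end of the buffer.
   Note: a failed request still advances the offset, as it would in the kernel. */
static int bump_alloc(bump_t *b, int size)
{
    assert(size >= 0);
    int start = atomic_fetch_add(&b->offset, size);  /* returns the old offset */
    if (start + size > BUFFER_SIZE)
        return -1;
    return start;
}
```

Each caller gets a disjoint range, because the atomic add hands out the old offset exactly once. The order in which threads receive their ranges is not deterministic, so nothing should depend on a particular word landing at a particular offset.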