Dynamic Allocation of Constant memory in CUDA

Posted 2019-04-25 11:27

I'm trying to take advantage of constant memory, but I'm having a hard time figuring out how to nest arrays. I have an array of entries, each holding counts for its internal arrays, and those counts differ per entry. Based on the simplified code below, I have two problems. First, I don't know how to allocate the data pointed to by the members of my data structure. Second, since I can't use cudaGetSymbolAddress for constant memory, I'm not sure whether I can just pass the global pointer to the kernel (which you cannot do with plain __device__ memory).


struct __align__(16) data{
int nFiles;
int nNames;
int* files;
int* names;
};

__device__ __constant__ data *mydata;

__host__ void initMemory(...)
{
    cudaMalloc( (void **) &(mydata), sizeof(data)*dynamicsize );
    for(int i = 0; i < dynamicsize; i++)
    {
        cudaMemcpyToSymbol(mydata, &(nFiles[i]), sizeof(int), sizeof(data)*i, cudaMemcpyHostToDevice);
        //...
        //Problem 1: Allocate & Set mydata[i].files
    }
}

__global__ void myKernel(data *constDataPtr)
{
    //Problem 2: Access constDataPtr[n].files, etc
}

int main()
{
    //...
    myKernel<<<grid, threads>>>(mydata);
}

Thanks for any help offered. :-)

4 Answers
Animai°情兽
#2 · 2019-04-25 12:09

Why don't you just use a so-called "packed" data representation? This approach lets you place all the data you need into a single one-dimensional byte array. For example, if you need to store

struct data
{
    int nFiles;
    int nNames;
    int* files;
    int* names;
};

You can just store this data in the array this way:

[struct data (7*4=28 bytes)
    [int nFiles=3 (4 bytes)]
    [int nNames=2 (4 bytes)]
    [file0 (4 bytes)]
    [file1 (4 bytes)]
    [file2 (4 bytes)]
    [name0 (4 bytes)]
    [name1 (4 bytes)]
]
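
A minimal sketch of how the packed layout above might be used with constant memory. The names `packed`, `entryOffset`, `packEntry`, `PACKED_CAPACITY` and `MAX_ENTRIES` are made up for illustration, and the offset table is an extra assumption of mine (it is not part of the layout above): because entries have different lengths, the kernel needs some way to find where entry n starts.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical fixed capacities; constant memory must be sized at compile time.
#define PACKED_CAPACITY 4096   // ints (16 KB of the 64 KB constant space)
#define MAX_ENTRIES     256

__constant__ int packed[PACKED_CAPACITY];   // [nFiles][nNames][files...][names...] per entry
__constant__ int entryOffset[MAX_ENTRIES];  // start index of each entry inside packed[]

// Host side: append one entry to the buffer and return its starting offset.
int packEntry(std::vector<int>& buf,
              const std::vector<int>& files,
              const std::vector<int>& names)
{
    int start = (int)buf.size();
    buf.push_back((int)files.size());
    buf.push_back((int)names.size());
    buf.insert(buf.end(), files.begin(), files.end());
    buf.insert(buf.end(), names.begin(), names.end());
    return start;
}

// Device side: each thread sums the files of one entry, reading constant memory directly.
__global__ void sumFiles(int nEntries, int* out)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n >= nEntries) return;
    const int* entry = packed + entryOffset[n];
    int nFiles = entry[0];
    const int* files = entry + 2;   // files start right after the two counts
    int sum = 0;
    for (int i = 0; i < nFiles; ++i) sum += files[i];
    out[n] = sum;
}

int main()
{
    std::vector<int> buf, offsets;
    offsets.push_back(packEntry(buf, {10, 20, 30}, {7, 8}));  // nFiles=3, nNames=2
    offsets.push_back(packEntry(buf, {5}, {1, 2, 3}));        // nFiles=1, nNames=3
    if (buf.size() > PACKED_CAPACITY || offsets.size() > MAX_ENTRIES) return 1;

    // Contents are uploaded at run time; only the capacity is fixed at compile time.
    cudaMemcpyToSymbol(packed, buf.data(), buf.size() * sizeof(int));
    cudaMemcpyToSymbol(entryOffset, offsets.data(), offsets.size() * sizeof(int));

    int nEntries = (int)offsets.size();
    int* dOut = nullptr;
    cudaMalloc(&dOut, nEntries * sizeof(int));
    sumFiles<<<1, 32>>>(nEntries, dOut);

    std::vector<int> hOut(nEntries);
    cudaError_t err = cudaMemcpy(hOut.data(), dOut, nEntries * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) { printf("CUDA error: %s\n", cudaGetErrorString(err)); return 1; }
    for (int i = 0; i < nEntries; ++i) printf("entry %d: sum of files = %d\n", i, hOut[i]);
    return 0;
}
```

The point of the design is that no device pointers are stored inside the data at all, so there is nothing to allocate per entry; everything is an index into one flat array.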
老娘就宠你
#4 · 2019-04-25 12:17

I think constant memory is limited to 64 KB, and you cannot allocate it dynamically using cudaMalloc. It has to be declared statically as constant, say,

__device__ __constant__ data mydata[100];

Similarly, you don't need to free it. Also, you shouldn't pass a reference to it via a pointer; just access it as a global variable. I tried doing a similar thing and it gave me a segfault (in device emulation mode).
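
A rough sketch of the static approach, assuming the nested arrays can be given a fixed upper bound. `MAX_FILES`, `MAX_NAMES` and the kernel `firstFile` are hypothetical names, and replacing the `int*` members with fixed-size arrays is my assumption, not something from the question. The array is filled at run time with cudaMemcpyToSymbol and read in the kernel as a global, with no pointer argument.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define MAX_ENTRIES 100   // matches the mydata[100] declaration above
#define MAX_FILES   8     // hypothetical upper bounds for the nested arrays
#define MAX_NAMES   8

struct data {
    int nFiles;
    int nNames;
    int files[MAX_FILES];  // fixed-size arrays instead of int* members
    int names[MAX_NAMES];
};

// Statically sized constant array: 100 * 72 bytes = 7200 bytes, well under 64 KB.
__constant__ data mydata[MAX_ENTRIES];

// The kernel reads mydata as a global symbol; it is not passed as an argument.
__global__ void firstFile(int nEntries, int* out)
{
    int n = blockIdx.x * blockDim.x + threadIdx.x;
    if (n < nEntries)
        out[n] = (mydata[n].nFiles > 0) ? mydata[n].files[0] : -1;
}

int main()
{
    // Build the entries on the host, then upload them in one call.
    data host[2] = {};
    host[0].nFiles = 3; host[0].files[0] = 10; host[0].files[1] = 20; host[0].files[2] = 30;
    host[0].nNames = 2; host[0].names[0] = 7;  host[0].names[1] = 8;
    host[1].nFiles = 0;                        // entry with no files
    host[1].nNames = 1; host[1].names[0] = 4;

    cudaMemcpyToSymbol(mydata, host, sizeof(host));  // no cudaMalloc, no cudaFree

    int* dOut = nullptr;
    cudaMalloc(&dOut, 2 * sizeof(int));
    firstFile<<<1, 32>>>(2, dOut);

    int hOut[2] = {0, 0};
    cudaError_t err = cudaMemcpy(hOut, dOut, sizeof(hOut), cudaMemcpyDeviceToHost);
    cudaFree(dOut);
    if (err != cudaSuccess) { printf("CUDA error: %s\n", cudaGetErrorString(err)); return 1; }
    printf("first file of entry 0: %d, entry 1: %d\n", hOut[0], hOut[1]);
    return 0;
}
```

The trade-off is wasted space whenever an entry has fewer files or names than the bound, so this only makes sense if the counts are small and reasonably uniform.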

"骚年 ilove
#5 · 2019-04-25 12:31

No, you can't do that.

Constant memory (64 KB max) must have its size fixed at compile time; only its contents can be set at run time (with cudaMemcpyToSymbol).

However, you can bind texture memory on the fly, and texture reads are also cached on the device.
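
One way the texture route might look, as a sketch only. It uses the texture object API (which, as far as I know, needs a device of compute capability 3.0 or newer) over an ordinary cudaMalloc buffer, so the size can be chosen at run time. `makeLinearIntResource` and `copyThroughTexture` are hypothetical names for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Describe a linear device buffer of ints as a texture resource.
cudaResourceDesc makeLinearIntResource(int* devPtr, size_t count)
{
    cudaResourceDesc res;
    memset(&res, 0, sizeof(res));
    res.resType = cudaResourceTypeLinear;
    res.res.linear.devPtr = devPtr;
    res.res.linear.desc = cudaCreateChannelDesc<int>();
    res.res.linear.sizeInBytes = count * sizeof(int);
    return res;
}

// Reads go through the texture cache via tex1Dfetch.
__global__ void copyThroughTexture(cudaTextureObject_t tex, int n, int* out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = tex1Dfetch<int>(tex, i);
}

int main(int argc, char** argv)
{
    // The element count is only known at run time, unlike a __constant__ array.
    int n = (argc > 1) ? atoi(argv[1]) : 256;
    if (n <= 0) return 1;

    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) host[i] = 2 * i;

    int *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(int));
    cudaMalloc(&dOut, n * sizeof(int));
    cudaMemcpy(dIn, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    cudaResourceDesc resDesc = makeLinearIntResource(dIn, n);
    cudaTextureDesc texDesc;
    memset(&texDesc, 0, sizeof(texDesc));
    texDesc.readMode = cudaReadModeElementType;  // return the raw int values

    cudaTextureObject_t tex = 0;
    cudaError_t err = cudaCreateTextureObject(&tex, &resDesc, &texDesc, NULL);
    if (err != cudaSuccess) { printf("CUDA error: %s\n", cudaGetErrorString(err)); return 1; }

    copyThroughTexture<<<(n + 127) / 128, 128>>>(tex, n, dOut);

    std::vector<int> result(n);
    err = cudaMemcpy(result.data(), dOut, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaDestroyTextureObject(tex);
    cudaFree(dIn);
    cudaFree(dOut);
    if (err != cudaSuccess) { printf("CUDA error: %s\n", cudaGetErrorString(err)); return 1; }

    printf("last element read through texture: %d\n", result[n - 1]);
    return 0;
}
```

Unlike constant memory, the texture cache does not broadcast a single address to a whole warp, so whether this is actually faster depends on the access pattern.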
