Can someone give a clear explanation of how the new and delete keywords would behave if called from __device__ or __global__ code in CUDA 4.2?
Where does the memory get allocated? If it is on the device, is it local or global?
For context, I am trying to create neural networks on the GPU, and I want a linked representation (like a linked list, but each neuron stores a linked list of connections that hold weights and pointers to the other neurons). I know I could allocate using cudaMalloc before the kernel launch, but I want the kernel to control how and when the networks are created.
Thanks!
C++ new and delete operate on device heap memory. The device allows for a portion of the global (i.e. on-board) memory to be allocated in this fashion. new and delete work in a similar fashion to device malloc and free.
You can adjust the amount of device global memory available for the heap using a runtime API call.
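As a rough illustration of the points above, here is a minimal sketch that sizes the device heap with cudaDeviceSetLimit and cudaLimitMallocHeapSize, then builds and frees a small linked list inside a kernel using new and delete. The Node type, the build_list kernel, and the run_build_list helper are hypothetical names made up for this example, and the null check assumes that a failed device-side new yields a null pointer, as device malloc does.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical node type for illustration; not from the original post.
struct Node {
    float weight;
    Node* next;
};

// Single-thread kernel: builds an n-node list with device-side new,
// sums the weights, then releases every node with device-side delete.
__global__ void build_list(int n, float* sum_out, int* alloc_failed) {
    Node* head = NULL;
    for (int i = 0; i < n; ++i) {
        Node* node = new Node;   // comes from the device heap (global memory)
        if (node == NULL) {      // assumption: null on heap exhaustion, like device malloc
            *alloc_failed = 1;
            break;
        }
        node->weight = 1.0f;
        node->next = head;
        head = node;
    }
    float sum = 0.0f;
    while (head != NULL) {       // walk the list, freeing each node as we go
        Node* next = head->next;
        sum += head->weight;
        delete head;
        head = next;
    }
    *sum_out = sum;
}

// Hypothetical host helper: returns the weight sum, or -1.0f on any error.
float run_build_list(int n, size_t heap_bytes) {
    // The heap size has to be set before any kernel that uses the device heap runs.
    if (cudaDeviceSetLimit(cudaLimitMallocHeapSize, heap_bytes) != cudaSuccess)
        return -1.0f;

    float* d_sum = NULL;
    int* d_failed = NULL;
    if (cudaMalloc((void**)&d_sum, sizeof(float)) != cudaSuccess)
        return -1.0f;
    if (cudaMalloc((void**)&d_failed, sizeof(int)) != cudaSuccess) {
        cudaFree(d_sum);
        return -1.0f;
    }
    cudaMemset(d_failed, 0, sizeof(int));

    build_list<<<1, 1>>>(n, d_sum, d_failed);

    // cudaMemcpy waits for the kernel and reports any launch or execution error.
    float sum = -1.0f;
    int failed = 1;
    cudaError_t err = cudaMemcpy(&sum, d_sum, sizeof(float), cudaMemcpyDeviceToHost);
    if (err == cudaSuccess)
        err = cudaMemcpy(&failed, d_failed, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_sum);
    cudaFree(d_failed);
    return (err == cudaSuccess && failed == 0) ? sum : -1.0f;
}
```

Calling run_build_list(1000, 16 * 1024 * 1024) from main should hand back the sum of the 1000 unit weights if everything succeeds. Note that the nodes live on the device heap, so they persist across kernel launches until deleted on the device, and they cannot be released with cudaFree from the host. On CUDA 4.2 this would also need to be compiled for a CC 2.0 target, for example with nvcc -arch=sm_20.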
You may also be interested in the C++ new/delete sample code.
Compute capability (CC) 2.0 or greater is required for these capabilities.
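If you want to check the compute capability requirement at run time before launching such a kernel, something along these lines should work. The helper name supports_device_new is made up for this sketch; cudaGetDeviceProperties and the major field of cudaDeviceProp are the standard runtime API.

```cuda
#include <cuda_runtime.h>

// Hypothetical helper: true if device `dev` reports compute capability 2.0 or higher.
bool supports_device_new(int dev) {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess)
        return false;            // invalid device index or no usable device
    return prop.major >= 2;      // CC 2.0+ is needed for device-side new/delete
}
```

A host program could call supports_device_new(0) and fall back to preallocating with cudaMalloc when it returns false.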