I have created a numpy array of float32s with shape (64, 128), and I want to send it to the GPU. How do I do that? What arguments should my kernel function accept? float** myArray?
I have tried directly sending the array as it is to the GPU, but pycuda complains that objects are being accessed...
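Two-dimensional arrays in numpy/PyCUDA are stored in linear memory in row-major order by default. So the kernel does not need a float** argument: it only needs a flat float* pointer together with the array dimensions, and it can compute the offset of element (row, col) as row * width + col to access a numpy ndarray or PyCUDA gpuarray passed by reference to the kernel from Python.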
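A kernel of the kind described above might look like the following sketch. It is an illustration rather than the answer's original listing: the kernel name `scale` and the helpers `flat_index` and `scale_on_gpu` are hypothetical, and the code assumes a C-contiguous float32 array, PyCUDA installed, and a CUDA-capable device.

```python
import numpy as np

# CUDA C source. The kernel takes a flat float* plus the array dimensions,
# not a float**. The kernel name `scale` is made up for this example.
KERNEL_SOURCE = """
__global__ void scale(float *a, int rows, int cols, float factor)
{
    int col = threadIdx.x + blockIdx.x * blockDim.x;
    int row = threadIdx.y + blockIdx.y * blockDim.y;
    if (row < rows && col < cols) {
        // Row-major layout: element (row, col) sits at row * cols + col.
        a[row * cols + col] *= factor;
    }
}
"""


def flat_index(row, col, cols):
    """Offset of element (row, col) in a C-contiguous 2D array."""
    return row * cols + col


def scale_on_gpu(host_array, factor):
    """Multiply every element by `factor` on the GPU and return the result.

    Requires PyCUDA and a CUDA device, so the imports are kept local.
    """
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import)
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    # Make sure the data is float32 and C-contiguous before copying it over.
    host_array = np.ascontiguousarray(host_array, dtype=np.float32)
    rows, cols = host_array.shape

    kernel = SourceModule(KERNEL_SOURCE).get_function("scale")
    a_gpu = gpuarray.to_gpu(host_array)

    # One thread per element; round the grid up so every element is covered.
    block = (16, 16, 1)
    grid = ((cols + block[0] - 1) // block[0],
            (rows + block[1] - 1) // block[1])

    # Scalars must be passed as sized numpy types, not plain Python numbers.
    kernel(a_gpu, np.int32(rows), np.int32(cols), np.float32(factor),
           block=block, grid=grid)
    return a_gpu.get()
```

On a machine with a GPU, calling `scale_on_gpu(a, 2.0)` with `a = np.random.rand(64, 128).astype(np.float32)` should return an array equal to `2 * a`. The `flat_index` helper mirrors the kernel's index arithmetic on the host, which is a cheap way to check the layout assumption with `a.ravel()[flat_index(r, c, a.shape[1])] == a[r, c]`.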