CUDA - Implementing Device Hash Map?

2019-02-05 03:01发布

Does anyone have any experience implementing a hash map on a CUDA Device? Specifically, I'm wondering how one might go about allocating memory on the Device and copying the result back to the Host, or whether there are any useful libraries that can facilitate this task.

It seems like I would need to know the maximum size of the hash map a priori in order to allocate Device memory. All my previous CUDA endeavors have used arrays and memcpys and therefore been fairly straightforward.

Any insight into this problem are appreciated. Thanks.

3条回答
beautiful°
2楼-- · 2019-02-05 03:04

AFAIK, the hash table given in "Cuda by Example" does not perform too well. Currently, I believe, the fastest hash table on CUDA is given in Dan Alcantara's PhD dissertation. Look at chapter 6.

查看更多
Juvenile、少年°
3楼-- · 2019-02-05 03:08

I recall someone developed a straightforward hash map implementation on top of thrust. There is some code for it here, although whether it works with current thrust releases is something I don't know. It might at least give you some ideas.

查看更多
放荡不羁爱自由
4楼-- · 2019-02-05 03:12

There is a GPU Hash Table implementation presented in "CUDA by example", from Jason Sanders and Edward Kandrot.

Fortunately, you can get information on this book and download the examples source code freely on this page:
http://developer.nvidia.com/object/cuda-by-example.html

In this implementation, the table is pre-allocated on CPU and safe multithreaded access is ensured by a lock function based upon the atomic function atomicCAS (Compare And Swap).

Moreover, newer hardware generation (from 2.0) combined with CUDA >= 4.0 are supposed to be able to use directly new/delete operators on the GPU ( http://developer.nvidia.com/object/cuda_4_0_RC_downloads.html?utm_source=http://forums.nvidia.com&utm_medium=http://forums.nvidia.com&utm_term=Developers&utm_content=Developers&utm_campaign=CUDA4 ), which could serve your implementation. I haven't tested these features yet.

查看更多
登录 后发表回答