pyCUDA vs C performance differences?

I'm new to CUDA programming and I was wondering how the performance of pyCUDA is compared to programs implemented in plain C. Will the performance be roughly the same? Are there any bottle necks that I should be aware of?

EDIT: I obviously tried to google this issue first, and was surprised to not find any information. i.e. I would have excepted that the pyCUDA people have this question answered in their FAQ.

标签： c cuda pycuda

4条回答

走好不送

2楼-- · 2020-05-20 08:21

If you're using CUDA -- whether directly through C or with pyCUDA -- all the heavy numerical work you're doing is done in kernels that execute on the gpu and are written in CUDA C (directly by you, or indirectly with elementwise kernels). So there should be no real difference in performance in those parts of your code.

Now, the initialization of arrays, and any post-work analysis, will be done in python (probably with numpy) if you use pyCUDA, and that generally will be significantly slower than doing it directly in a compiled language (though if you've built your numpy/scipy in such a way that it links directly to high-performance libraries, then those calls at least would perform the same in either language). But hopefully, your initialization and finalization are small fractions of the total amount of work you have to do, so that even if there is significant overhead there, it still hopefully won't have a huge impact on overall runtime.

And in fact if it turns out that the python parts of the computation does hurt your application's performance, starting out doing your development in pyCUDA may still be an excellent way to get started, as the development is significantly easier, and you can always re-implement those parts of the code that are too slow in Python in straight C, and call those from python, gaining some of the best of both worlds.

0人赞添加讨论(0) 举报

\"骚年 ilove

3楼-- · 2020-05-20 08:27

I've been using pyCUDA for a little while an I like prototyping with it because it speeds up the process of turning an idea into working code.

With pyCUDA you will be writing the CUDA kernels using C++, and it's CUDA, so there shouldn't be a difference in performance of running that code. But there will be a difference in the performance of the code you write in Python to setup or use the results of the pyCUDA kernel vs the one you write in C.

0人赞添加讨论(0) 举报

太酷不给撩

4楼-- · 2020-05-20 08:37

Make sure you're using -O3 optimizations there and use nvprof/nvvp to profile your kernels if you're using PyCUDA and you want to get high performance. If you want to use Cuda from Python, PyCUDA is probably THE choice. Because interfacing C++/Cuda code via Python is just hell otherwise. You have to write a hell lot of ugly wrappers. And for numpy integration even more hardcore wrap-up code would be necessary.

0人赞添加讨论(0) 举报

我命由我不由天

5楼-- · 2020-05-20 08:39

If you're wondering about performance differences by using pyCUDA in different ways, see SimpleSpeedTest.py included in the pyCUDA Wiki examples. It benchmarks the same task completed by a CUDA C kernel encapsulated in pyCUDA, and by several abstractions created by pyCUDA's designer. There's a performance difference.

0人赞添加讨论(0) 举报

pyCUDA vs C performance differences?

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间