Interpreting the verbose output of ptxas, part II

Posted 2019-08-03 04:28

This question is a continuation of "Interpreting the verbose output of ptxas, part I".

When we compile a kernel .ptx file with ptxas -v, or compile it from a .cu file with -ptxas-options=-v, we get a few lines of output such as:

ptxas info    : Compiling entry function 'searchkernel(octree, int*, double, int, double*, double*, double*)' for 'sm_20'
ptxas info    : Function properties for searchkernel(octree, int*, double, int, double*, double*, double*)
    72 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 46 registers, 176 bytes cmem[0], 16 bytes cmem[14]

(This is the same example as in the linked question, but with demangled names.)

This question concerns the last line. A few more examples from other kernels:

ptxas info    : Used 19 registers, 336 bytes cmem[0], 4 bytes cmem[2]
...
ptxas info    : Used 19 registers, 336 bytes cmem[0]
... 
ptxas info    : Used 6 registers, 16 bytes smem, 328 bytes cmem[0]

How do we interpret the information on this line, other than the number of registers used? Specifically:

  • Is cmem short for constant memory?
  • Why are there different categories of cmem, i.e. cmem[0], cmem[2], cmem[14]?
  • smem probably stands for shared memory; is it only static shared memory?
  • Under which conditions does each kind of entry appear on this line?

2 answers
Explosion°爆炸
Reply #2 · 2019-08-03 04:43

Collected and reformatted...

Resources on the last ptxas info line:

  • registers - Registers, in the register file on every SM (multiprocessor)
  • gmem - Global memory
  • smem - Static shared memory
  • cmem[N] - Constant memory bank with index N.
    • cmem[0] - Bank reserved for kernel arguments and statically-sized constant values
    • cmem[2] - ???
    • cmem[4] - ???
    • cmem[14] - ???

A category appears on the line only if the kernel uses that kind of memory (the register count is probably always shown). Since cmem[0] holds the kernel arguments, it is no surprise that all the examples show some cmem[0] usage.
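As an illustration only, here is a minimal Python sketch that splits a "Used ..." line into the categories listed above. The function name parse_used_line and the regular expression are assumptions made for this example; they are based solely on the sample lines quoted in the question, and other ptxas versions may print additional or differently worded fields.

```python
import re

# One item on the line looks like "46 registers", "16 bytes smem"
# or "176 bytes cmem[0]" (pattern inferred from the sample lines only).
_ITEM = re.compile(r"(\d+)\s+(?:bytes\s+)?([a-z]+)(?:\[(\d+)\])?")


def parse_used_line(line):
    """Return the resource usage reported on a 'ptxas info : Used ...' line.

    Constant memory is returned as a dict mapping bank index to bytes;
    every other category maps directly to its number.
    """
    _, found, tail = line.partition("Used ")
    if not found:
        raise ValueError("not a ptxas 'Used ...' line")

    usage = {"cmem": {}}
    for amount, kind, bank in _ITEM.findall(tail):
        if kind == "cmem" and bank:
            usage["cmem"][int(bank)] = int(amount)
        else:
            usage[kind] = int(amount)
    return usage


if __name__ == "__main__":
    samples = [
        "ptxas info    : Used 46 registers, 176 bytes cmem[0], 16 bytes cmem[14]",
        "ptxas info    : Used 6 registers, 16 bytes smem, 328 bytes cmem[0]",
    ]
    for sample in samples:
        print(parse_used_line(sample))
```

Categories that the kernel does not use are simply absent from the returned dict, mirroring the way they are absent from the line itself.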

You can read a bit more on the CUDA memory hierarchy in Section 2.3 of the Programming Guide and the links there. Also, there's this blog post about static vs dynamic shared memory.

\"骚年 ilove
Reply #3 · 2019-08-03 04:46

Is cmem short for constant memory?

Yes.

Why are there different categories of cmem, i.e. cmem[0], cmem[2], cmem[14]?

They represent different constant memory banks. cmem[0] is the reserved bank for kernel arguments and statically sized constant values.

smem probably stands for shared memory; is it only static shared memory?

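It is, and it could not be otherwise: the size of dynamic shared memory is only specified when the kernel is launched, so ptxas cannot know it at compile time.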

Under which conditions does each kind of entry appear on this line?

Mostly answered here.
