Approximate cost to access various caches and main

2019-01-01 08:01发布

Can anyone give me the approximate time (in nanoseconds) to access L1, L2 and L3 caches, as well as main memory on Intel i7 processors?

While this isn't specifically a programming question, knowing these kinds of speed details is neccessary for some low-latency programming challenges.

109条回答
只靠听说
2楼-- · 2019-01-01 08:24
步步皆殇っ
3楼-- · 2019-01-01 08:24
<800> warps ~~ 24000 + 3200 threads ~~ 27200 threads [!!] | ^^^^^^^^|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ [!!] | smREGs________________________________________ penalty +400 ~ +800 [GPU_CLKs] latency ( maskable by 400~800 WARPs ) on <Compile-time>-designed spillover(s) to locMEM__ | +350 ~ +700 [ns] @1147 MHz FERMI ^^^^^^^^ | | ^^^^^^^^ | +5 [ns] @ 200 MHz FPGA. . . . . . Xilinx/Zync Z7020/FPGA massive-parallel streamline-computing mode ev. PicoBlazer softCPU | | ^^^^^^^^ | ~ +20 [ns] @1147 MHz FERMI ^^^^^^^^ | SM-REGISTERs/thread: max 63 for CC-2.x -with only about +22 [GPU_CLKs] latency ( maskable by 22-WARPs ) to hide on [REGISTER DEPENDENCY] when arithmetic result is to be served from previous [INSTR] [G]:10.4, Page-46 | max 63 for CC-3.0 - about +11 [GPU_CLKs] latency ( maskable by 44-WARPs ) [B]:5.2.3, Page-73 | max 128 for CC-1.x PAR --
查看更多
美炸的是我
4楼-- · 2019-01-01 08:25
查无此人
5楼-- · 2019-01-01 08:25
与风俱净
6楼-- · 2019-01-01 08:25
只若初见
7楼-- · 2019-01-01 08:26
登录 后发表回答