Why does Linux perf use event l1d.replacement for

Posted 2019-07-01 16:49

On Intel x86, Linux uses the event l1d.replacement to implement its L1-dcache-load-misses event.

This event is defined as follows:

Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.

Perhaps naively, I would have expected perf to use something like mem_load_retired.l1_miss, which supports PEBS and is defined as:

Counts retired load instructions with at least one uop that missed in the L1 cache. (Supports PEBS)

The two event counts are usually not very close, and sometimes they differ wildly. For example:

$ ocperf stat -e mem_inst_retired.all_loads,l1d.replacement,mem_load_retired.l1_hit,mem_load_retired.l1_miss,mem_load_retired.fb_hit head -c100M /dev/urandom > /dev/null

 Performance counter stats for 'head -c100M /dev/urandom':

       445,662,315      mem_inst_retired_all_loads                                   
            92,968      l1d_replacement                                             
       443,864,439      mem_load_retired_l1_hit                                     
         1,694,671      mem_load_retired_l1_miss                                    
            28,080      mem_load_retired_fb_hit                                     

Here mem_load_retired.l1_miss reports about 18 times as many "L1 misses" as l1d.replacement (1,694,671 vs. 92,968). Conversely, you can also find examples where l1d.replacement is much higher than the mem_load_retired counters.
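As a quick sanity check on the numbers above, here is a small Python sketch that recomputes the ratio between the two "miss" counters and checks how well l1_hit + l1_miss adds up to the total load count. The values are copied from the ocperf output shown earlier; the variable names are only for illustration, and the script makes no claim about what each event actually counts.

```python
# Counter values copied from the ocperf run above (head -c100M /dev/urandom).
counts = {
    "mem_inst_retired.all_loads": 445_662_315,
    "l1d.replacement": 92_968,
    "mem_load_retired.l1_hit": 443_864_439,
    "mem_load_retired.l1_miss": 1_694_671,
    "mem_load_retired.fb_hit": 28_080,
}

# How many times larger is the retired-load miss count than the replacement count?
ratio = counts["mem_load_retired.l1_miss"] / counts["l1d.replacement"]

# Loads that are neither an l1_hit nor an l1_miss according to these counters.
hit_plus_miss = counts["mem_load_retired.l1_hit"] + counts["mem_load_retired.l1_miss"]
unaccounted = counts["mem_inst_retired.all_loads"] - hit_plus_miss

# Miss rate implied by each candidate "L1 miss" event, relative to all retired loads.
miss_rate_retired = counts["mem_load_retired.l1_miss"] / counts["mem_inst_retired.all_loads"]
miss_rate_replacement = counts["l1d.replacement"] / counts["mem_inst_retired.all_loads"]

print(f"l1_miss / l1d.replacement  = {ratio:.1f}x")
print(f"all_loads - (hit + miss)   = {unaccounted:,}")
print(f"miss rate (l1_miss)        = {miss_rate_retired:.3%}")
print(f"miss rate (l1d.replacement) = {miss_rate_replacement:.3%}")
```

For this run, the ratio comes out at roughly 18x, and about 103 thousand loads are not covered by l1_hit + l1_miss, so the choice of event changes the apparent L1 miss rate by more than an order of magnitude.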

What exactly is l1d.replacement measuring, why was it chosen in the kernel, and is it a better proxy for L1 d-cache misses than mem_load_retired.l1_miss?
