Intel CPU Cache Policy

Posted 2019-02-18 17:29

I have a laptop with an Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz processor. I'm on Ubuntu 12.04 (x86_64) and I'm trying to find some info about my processor.

I was able to find most of the information I was looking for using

cat /proc/cpuinfo

and

lscpu

What I also want to find out is the cache policy used at each cache level. Is it write-back or write-through?

Is there any tool that I can use to find out such info?

Thanks in advance.

2 answers
太酷不给撩
#2 · 2019-02-18 17:49

This is not something you can query from CPUID or similar, nor can you configure your CPU to do one or the other, so there is no tool for querying it. What you can query is the cache associativity, the cache line size, and the cache size, for example via the files under /sys/devices/system/cpu/cpu0/cache/ (/proc/cpuinfo itself reports only a cache size and alignment).
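To make the "what you can query" part concrete, here is a minimal sketch that reads the per-level cache geometry Linux exports in sysfs. It assumes a Linux kernel that populates /sys/devices/system/cpu/cpu0/cache/index*/; the function name read_cache_info is made up for this example, and the files it reads describe geometry only, not the write policy.

```python
import os

def read_cache_info(cache_dir="/sys/devices/system/cpu/cpu0/cache"):
    """Return one dict per cache (index0, index1, ...) found under cache_dir."""
    fields = ("level", "type", "size", "coherency_line_size",
              "ways_of_associativity", "number_of_sets")
    caches = []
    if not os.path.isdir(cache_dir):
        return caches  # not Linux, or the kernel does not export cache info
    for entry in sorted(os.listdir(cache_dir)):
        if not entry.startswith("index"):
            continue
        info = {}
        for field in fields:
            path = os.path.join(cache_dir, entry, field)
            try:
                with open(path) as f:
                    info[field] = f.read().strip()
            except OSError:
                info[field] = None  # attribute not exported on this system
        caches.append(info)
    return caches

if __name__ == "__main__":
    for cache in read_cache_info():
        print(cache)
```

On a typical x86 machine this lists L1d, L1i, L2 and L3 for CPU 0, each with its size, line size and associativity as strings.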

All Intel-compatible CPUs of the last one or two decades have used a write-back strategy for caches (which presumes fetching a cache line first to allow partial writes). Of course that's the theory; reality is slightly more complex than that.

Virtually all processors (your model included) have one or several forms of write combining (or fill buffers, as Intel has called them since Merom), and all but the most antique Intel-compatible CPUs support uncached writes from SSE registers (which again use a form of write combining). And then of course, there are things like on-chip cache coherence protocols, snoop filtering, and other mechanisms to ensure cache coherency both between cores of one processor and between different processors in a multi-processor system.
Nevertheless -- the general cache policy is still write-back.
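To illustrate what the two policies mean in practice, here is a toy model that counts how many writes reach the next level of the hierarchy under each policy. It is a sketch under stated assumptions, not a description of any real CPU: the cache is direct-mapped with four 64-byte lines, both policies are write-allocate (real write-through caches are often no-write-allocate), only stores are modelled, and count_memory_writes is a name invented for this example.

```python
def count_memory_writes(addresses, policy, num_lines=4, line_size=64):
    """Count writes that reach memory for a sequence of store addresses."""
    if policy not in ("write-back", "write-through"):
        raise ValueError("unknown policy: %r" % (policy,))
    tags = [None] * num_lines    # which memory line each cache slot holds
    dirty = [False] * num_lines  # slot modified since it was filled?
    memory_writes = 0
    for addr in addresses:
        line = addr // line_size
        slot = line % num_lines
        if tags[slot] != line:
            # Miss: evict the old line, writing it out only if it was modified.
            if dirty[slot]:
                memory_writes += 1
            tags[slot] = line    # write-allocate: bring the line in first
            dirty[slot] = False
        if policy == "write-through":
            memory_writes += 1   # every store is forwarded immediately
        else:
            dirty[slot] = True   # defer the write until eviction
    # Flush whatever is still dirty at the end of the run.
    memory_writes += sum(dirty)
    return memory_writes
```

Storing to the same address 1000 times costs 1000 memory writes under write-through but only one (the final flush) under write-back, which is why repeatedly rewriting a small buffer is cheap with a write-back cache.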

冷血范
#3 · 2019-02-18 17:59

David Kanter's very nice Intel Sandy Bridge writeup covers the memory subsystem and cache architecture: L1D is the usual-for-Intel write-back, and the per-core L2 is also write-back. So is L3 (which is a large inclusive cache shared by all cores on the chip).


AMD's Bulldozer family takes a very different approach: its L1 cache is write-through, but with a tiny 4 KiB write-combining cache. Constantly rewriting a buffer larger than 4 KiB on these CPUs will bottleneck on the (slow) L2 instead of L1.

One of the posters in a thread on Agner's blog claims that Bulldozer's L2 is also write-through, but Paul Clayton's comments on this answer disagree. (I'm inclined to believe Paul.)

AMD Ryzen fortunately uses a normal write-back 32kiB 8-way L1D, with private write-back 512kiB L2. L3 is a shared 8MB victim cache. It's write-back, but victim-cache means data only enters it when evicted from L1/L2, not directly for loads / prefetches. Each core-cluster (CCX module) of 4 cores has its own 8MB L3, and latency/bandwidth between cores in different clusters is bad.

There's much more to say about a cache hierarchy than just write-back vs. write-through, although most of the differences don't matter for single-threaded programs. (Unless the OS's process scheduler moves them between clusters on Ryzen, in which case it's bad.)
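If scheduler migration between clusters is a concern, one option is to pin the process to CPUs that share an L3. The following is a minimal sketch, assuming Linux, where the kernel exposes an L3 entry with a shared_cpu_list file under each CPU's cache directory in sysfs. The helper names (parse_cpu_list, l3_siblings, pin_to_l3_domain) are invented for this example; os.sched_getaffinity and os.sched_setaffinity are the standard-library calls and exist only on platforms that support them.

```python
import os

def parse_cpu_list(text):
    """Parse a Linux CPU list such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

def l3_siblings(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the CPUs sharing an L3 with `cpu`, or an empty set if unknown."""
    cache_dir = os.path.join(sysfs_root, "cpu%d" % cpu, "cache")
    try:
        entries = os.listdir(cache_dir)
    except OSError:
        return set()
    for entry in entries:
        try:
            with open(os.path.join(cache_dir, entry, "level")) as f:
                if f.read().strip() != "3":
                    continue  # not the L3 entry
            with open(os.path.join(cache_dir, entry, "shared_cpu_list")) as f:
                return parse_cpu_list(f.read())
        except OSError:
            continue  # entry lacks the expected files
    return set()

def pin_to_l3_domain(cpu=0):
    """Restrict the calling process to the CPUs sharing an L3 with `cpu`."""
    allowed = l3_siblings(cpu) & os.sched_getaffinity(0)
    if allowed:
        os.sched_setaffinity(0, allowed)
    return allowed
```

Calling pin_to_l3_domain(0) early in a program keeps its threads inside one L3 domain; it returns an empty set and changes nothing when the topology cannot be read.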


On my SnB system:

sudo dmidecode

produces output which includes:

Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Configuration: Enabled, Not Socketed, Level 1
        Operational Mode: Write Back
        Location: Internal
        Installed Size: 32 kB
        Maximum Size: 32 kB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: None
        System Type: Unified
        Associativity: 8-way Set-associative

So the fact that the cache is Write-Back is at least in the BIOS, if that's trustworthy. I'm curious what it shows on an AMD CPU, or if BIOS writers tend to just "make something up" and sometimes put the wrong value there.

As this question points out, info for L2 is kinda bogus: it totals the private 256k-per-core L2:

Handle 0x0006, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L2-Cache
        Configuration: Enabled, Not Socketed, Level 2
        Operational Mode: Varies With Memory Address
        Location: Internal
        Installed Size: 1024 kB
        Maximum Size: 1024 kB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: None
        System Type: Unified
        Associativity: 8-way Set-associative

Handle 0x0007, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L3-Cache
        Configuration: Enabled, Not Socketed, Level 3
        Operational Mode: Unknown
        Location: Internal
        Installed Size: 6144 kB
        Maximum Size: 6144 kB
        Supported SRAM Types:
                Other
        Installed SRAM Type: Other
        Speed: Unknown
        Error Correction Type: None
        System Type: Unified
        Associativity: Other

This is on an i5-2500K (quad-core SnB with 6MiB of L3).
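For comparing machines, the interesting field can be pulled out of that output with a few lines of Python. This is a rough sketch that assumes the text layout shown above, with each Cache Information block listing "Socket Designation" before "Operational Mode"; parse_cache_modes is a name made up for this example.

```python
def parse_cache_modes(dmidecode_text):
    """Map each cache's Socket Designation to its Operational Mode."""
    modes = {}
    name = None
    for line in dmidecode_text.splitlines():
        line = line.strip()
        if line.startswith("Socket Designation:"):
            # Remember which cache the following fields belong to.
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Operational Mode:") and name is not None:
            modes[name] = line.split(":", 1)[1].strip()
            name = None
    return modes
```

The input can be the saved output of sudo dmidecode, or of sudo dmidecode -t cache to limit it to the cache records. As noted above, the result is only as trustworthy as what the BIOS vendor chose to report.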
