On the Intel x86 platform running Linux, in C/C++, how can I tell the OS and the hardware to store a value (such as a uint32) in L1/L2 cache, and not in system memory? For example, let's say either for security or performance reasons, I don't want to store a 32-bit key (a 32-bit unsigned int) in DRAM, and instead I would like to store it only in the processor's cache. How can I do this? I'm using Fedora 16 (Linux 3.1 and gcc 4.6.2) on an Intel Xeon processor.
Many thanks in advance for your help!
I don't think you can force a variable to be stored in the processor's cache, but you can use the register
keyword to suggest to the compiler that a given variable should be kept in a CPU register. It is only a hint, and the compiler is free to ignore it. Declare the variable like this:
register int i;
There are no CPU instructions on x86 (or indeed on any platform that I'm aware of) that let you force the CPU to keep something in the L1/L2 cache, let alone any way of exposing such an extremely low-level detail to higher-level languages like C/C++.
Saying you need to do this for "performance" is meaningless without more context about what sort of performance you're looking at. Why is your program so tightly dependent on having access to data in cache alone? Saying you need this for security just seems like bad security design. In either case, you have to provide a lot more detail about what exactly you're trying to do here.
Short answer: you can't. That is not what those caches are for. They are fed from main memory to speed up access and to keep the processor's pipeline supplied with instructions and data.
There are ways to encourage the caches to be used for certain data, but that data will still reside in RAM. In a pre-emptive multitasking operating system, you also cannot guarantee that your cache contents will not be blown away by a context switch between any two instructions, except by 'stopping the world' or by using low-level atomic operations. Those are generally meant for very, very short sequences of instructions that simply cannot be interrupted, like the fetch-and-increment used for spinlocks, not for processing cryptographic algorithms in one go.
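To make the "very short atomic sequence" concrete, here is a minimal sketch of an atomic fetch-and-increment using C11 <stdatomic.h>. The names next_ticket and take_ticket are made up for illustration. Note that the operation is atomic with respect to other cores, but it says nothing about where the value lives: it is still ordinary memory that happens to be cached.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter, for illustration only. */
static atomic_uint_fast32_t next_ticket = 0;

/* Atomically return the current ticket and advance the counter.
   The read-modify-write cannot be split by an interrupt or by
   another core; on x86 compilers typically emit a single
   LOCK-prefixed instruction for it. */
uint32_t take_ticket(void)
{
    return (uint32_t)atomic_fetch_add(&next_ticket, 1);
}
```

Older compilers such as the gcc 4.6 mentioned in the question predate <stdatomic.h>; there the __sync_fetch_and_add builtin should serve the same purpose.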
You can't use the cache directly, but you can use hardware registers for integers, and they are faster.
If you really want performance, a variable is better off in a CPU register.
If you cannot use a register, for example because you need to share the same value across different threads or cores (multicore is getting common now!), you need to store the variable in memory.
As already mentioned, you cannot force some memory into the cache using a call or keyword.
However, caches aren't entirely stupid: if your memory block is used often enough, it should have no trouble staying in the cache.
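The closest thing to a "call" that I know of is a prefetch hint, and it is only that: a hint. Below is a rough sketch using GCC's __builtin_prefetch. The helper name sum_with_prefetch and the look-ahead distance of 16 elements (one 64-byte line of uint32_t, assuming 64-byte cache lines) are my own assumptions. The CPU may ignore the hint, and the line can still be evicted at any time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sums n values while hinting upcoming reads. */
uint64_t sum_with_prefetch(const uint32_t *data, size_t n)
{
    assert(data != NULL || n == 0);

    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Hint only: request the element 16 slots ahead.
           Second argument 0 = prefetch for reading;
           third argument 3 = high temporal locality. */
        if (i + 16 < n)
            __builtin_prefetch(&data[i + 16], 0, 3);
        total += data[i];
    }
    return total;
}
```

For a simple linear scan like this the hardware prefetcher usually does the job on its own, so whether the hint helps at all is something to measure rather than assume.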
Keep in mind that if you happen to write to this memory location a lot from different cores, you're going to strain the cache-coherency logic in the processor, because it needs to make sure that all the caches and the actual memory below are kept in sync.
Put simply, this would reduce overall performance of the CPU.
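One common way to reduce that coherency traffic is to give each thread its own variable on its own cache line, and combine the values only when needed. The sketch below shows just the memory layout, using C11 alignas. The 64-byte line size, the MAX_THREADS value and the function names are assumptions for illustration, and thread creation is left out.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; typical for x86, not guaranteed */
#define MAX_THREADS 4   /* hypothetical number of worker threads */

/* One counter per thread. The alignment pads each element to a full
   cache line, so a write by one core does not invalidate the line
   that holds another thread's counter. */
struct padded_counter {
    alignas(CACHE_LINE) uint64_t value;
};

static struct padded_counter counters[MAX_THREADS];

/* Each thread increments only its own slot. */
void bump(size_t thread_id)
{
    assert(thread_id < MAX_THREADS);
    counters[thread_id].value++;
}

/* Combine the per-thread values, e.g. after the workers have finished. */
uint64_t sum_counters(void)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < MAX_THREADS; i++)
        sum += counters[i].value;
    return sum;
}
```

The cost is memory: each 8-byte counter now occupies 64 bytes, which is usually a fair trade for data that is written constantly from several cores.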
Note that the opposite (do not cache) does exist as a property you can assign to parts of your heap memory.