The C# spec states in section 5.5 that reads and writes on certain types (namely bool, char, byte, sbyte, short, ushort, uint, int, float, and reference types) are guaranteed to be atomic.
This has piqued my interest. How can you do that? In my limited experience, the only way to make reads and writes look atomic was to lock variables or use barriers, and that would be a performance killer if it had to be done for every single read/write. And yet C# does something with a similar effect.
Perhaps other languages (like Java) do it. I seriously don't know. My question isn't really intended to be language-specific, it's just that I know C# does it.
I understand that it might have to deal with certain specific processor instructions, and may not be usable in C/C++. However, I'd still like to know how it works.
[EDIT] To tell the truth, I believed that reads and writes could be non-atomic under certain conditions, for example when one CPU accesses a memory location while another CPU is writing there. Does this only happen when the CPU can't handle the whole object at once, because it's too big or because the memory is not aligned on the proper boundary?
Those types have guaranteed atomicity because they are all 32 bits or smaller. Since .NET only runs on 32-bit and 64-bit operating systems, the processor can read and write the entire value in a single operation, provided it is properly aligned. This is in contrast to, say, an Int64 on a 32-bit platform, which must be read and written using two 32-bit operations.
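To make the two-operation case concrete, here is a minimal, single-threaded Java sketch that simulates the interleaving by hand. The class `TornReadDemo` and its methods are hypothetical names invented for this illustration; it models a 64-bit value as two 32-bit words and is not how any real runtime stores an Int64.

```java
// Simulates a 64-bit value kept as two 32-bit words, the way a 32-bit CPU would store it.
// All names here are hypothetical and exist only for this illustration.
final class TornReadDemo {
    private int high; // upper 32 bits
    private int low;  // lower 32 bits

    void writeHighHalf(long value) { high = (int) (value >>> 32); }
    void writeLowHalf(long value)  { low = (int) value; }

    // Reassembles the 64-bit value from the two 32-bit words.
    long read() {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    // Replays one unlucky interleaving: the reader runs between the writer's two stores.
    static long tornValue(long oldValue, long newValue) {
        TornReadDemo cell = new TornReadDemo();
        cell.writeHighHalf(oldValue);
        cell.writeLowHalf(oldValue);

        cell.writeHighHalf(newValue);  // writer: first 32-bit store
        long observed = cell.read();   // reader: sees new high half, old low half
        cell.writeLowHalf(newValue);   // writer: second 32-bit store
        return observed;
    }

    public static void main(String[] args) {
        long observed = tornValue(0x00000000FFFFFFFFL, 0x0000000100000000L);
        System.out.println(Long.toHexString(observed)); // prints 1ffffffff
    }
}
```

The observed value is neither the old nor the new one. A 32-bit value written with a single store has no such intermediate state, which is the whole basis of the guarantee.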
I'm not really a hardware guy so I apologize if my terminology makes me sound like a buffoon but it's the basic idea.
It is fairly cheap to implement the atomicity guarantee on x86 and x64 cores since the CLR only promises atomicity for variables that are 32-bit or smaller. All that's required is that the variable is properly aligned and doesn't straddle a cache line. The JIT compiler ensures this by allocating local variables on a 4-byte aligned stack offset. The GC heap manager does the same for heap allocations.
Note that the CLR guarantee is not a very strong one. The 4-byte alignment promise is not good enough to write code that's consistently performant for arrays of doubles, since an 8-byte double can end up misaligned. This is very nicely demonstrated in this thread. Interop with machine code that uses SIMD instructions is also very difficult for this reason.
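The alignment argument can be checked with a little arithmetic. The following Java sketch assumes a 64-byte cache line (common on x86, but an assumption here) and uses hypothetical helper names; it only does address math and does not inspect real memory.

```java
// Address arithmetic only; the 64-byte line size and all names are assumptions.
final class AlignmentCheck {
    static final int CACHE_LINE_BYTES = 64;

    // True if a value of sizeBytes starting at address spans two cache lines.
    static boolean straddlesCacheLine(long address, int sizeBytes) {
        long firstLine = address / CACHE_LINE_BYTES;
        long lastLine = (address + sizeBytes - 1) / CACHE_LINE_BYTES;
        return firstLine != lastLine;
    }

    // Tries every address with the given alignment across a few cache lines.
    static boolean anyStraddle(int sizeBytes, int alignment) {
        for (long address = 0; address < 4L * CACHE_LINE_BYTES; address += alignment) {
            if (straddlesCacheLine(address, sizeBytes)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(anyStraddle(4, 4)); // 4-byte value, 4-byte aligned: false
        System.out.println(anyStraddle(8, 4)); // 8-byte value, 4-byte aligned: true
        System.out.println(anyStraddle(8, 8)); // 8-byte value, 8-byte aligned: false
    }
}
```

A 4-byte value on a 4-byte boundary can never cross a line, while an 8-byte double on a 4-byte boundary can (for example at offset 60), which is why the guarantee stops at 32 bits.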
On x86, aligned reads and writes of values up to the native word size are atomic anyway. It's supported at the hardware level. This however does not mean that operations like addition and multiplication are atomic; they require a load, a compute, then a store, which means two threads can interfere with each other. That's where the lock prefix comes in.
You mentioned locking and memory barriers; they don't have anything to do with reads and writes being atomic. There is no way on x86, with or without memory barriers, that you're going to see a half-written aligned 32-bit value.
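The difference between an atomic store and an atomic read-modify-write is easy to see in Java, where the same distinction applies. This is a rough sketch with hypothetical names: `plainCounter++` is a load, an add, and a store, while `AtomicInteger.incrementAndGet()` is a single atomic read-modify-write (on x86 the JIT typically emits a lock-prefixed instruction for it, though that is an implementation detail).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: several threads increment a plain int and an AtomicInteger.
final class IncrementDemo {
    static int plainCounter; // deliberately unsynchronized
    static final AtomicInteger atomicCounter = new AtomicInteger();

    static void run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    plainCounter++;                  // load, add, store: updates can be lost
                    atomicCounter.incrementAndGet(); // one atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("atomic: " + atomicCounter.get()); // always 400000
        System.out.println("plain:  " + plainCounter);        // often less than 400000
    }
}
```

The plain counter never holds a half-written value; it just loses whole increments when two threads load the same old value.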
Yes, C# and Java guarantee that loads and stores of some primitive types are atomic, like you say. This is cheap because the processors capable of running .NET or the JVM do guarantee that loads and stores of suitably aligned primitive types are atomic.
What neither C# nor Java nor the processors they run on do by default, because it is expensive, is issue memory barriers so that those variables can be used for synchronization in a multi-threaded program. However, in Java and C# you can mark your variable with the "volatile" modifier, in which case the compiler takes care of issuing the appropriate memory barriers.
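As an illustration of what "volatile" buys you beyond atomicity, here is a small Java sketch (class and field names are hypothetical). The writer stores a plain field and then sets a volatile flag; under the Java memory model, a reader that sees the flag set is also guaranteed to see the earlier plain store.

```java
// Hypothetical demo of publishing data through a volatile flag.
final class VolatileFlagDemo {
    static int payload;            // plain field, atomic but with no ordering guarantee of its own
    static volatile boolean ready; // volatile: accesses come with the needed barriers

    static int publishAndRead() {
        payload = 0;
        ready = false;
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read, so the loop cannot spin on a stale value
                Thread.yield();
            }
            seen[0] = payload;     // ordered after the volatile read of ready
        });
        reader.start();

        payload = 42;              // plain write
        ready = true;              // volatile write publishes the payload

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints 42
    }
}
```

Without `volatile` on `ready`, each individual read and write would still be atomic, but the reader could spin forever or see the flag without the payload.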
You can't. Even going all the way down to assembly language you have to use special LOCK opcodes in order to guarantee that another core or even process isn't going to come around and wipe out all your hard work.