Is Interlocked.Increment(ref x) faster or slower than x++ for ints and longs on various platforms?
It will always be slower because it has to perform a CPU bus lock instead of just updating a register. However, modern CPUs achieve near-register performance, so the cost is negligible even in real-time processing.
It is slower since it forces the action to occur atomically and it acts as a memory barrier, eliminating the processor's ability to re-order memory accesses around the instruction.
You should be using Interlocked.Increment when you want the action to be atomic on state that can be shared between threads - it's not intended to be a full replacement for x++.
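To make the atomicity point concrete, here is a minimal sketch (the class and method names are made up for illustration) that bumps one shared counter from several threads, once with Interlocked.Increment and once with plain ++. The interlocked total always equals threads * perThread; the plain total may come up short because concurrent read-modify-write updates can be lost.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo type; not part of any library.
public static class IncrementDemo
{
    // Every increment is atomic, so the result is always threads * perThread.
    public static long CountWithInterlocked(int threads, int perThread)
    {
        long counter = 0;
        Parallel.For(0, threads, _ =>
        {
            for (int i = 0; i < perThread; i++)
                Interlocked.Increment(ref counter);
        });
        return counter;
    }

    // counter++ is a separate read, add, and write; two threads can read the
    // same value and one update is lost, so the result may be too small.
    public static long CountWithPlainIncrement(int threads, int perThread)
    {
        long counter = 0;
        Parallel.For(0, threads, _ =>
        {
            for (int i = 0; i < perThread; i++)
                counter++;
        });
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine("interlocked: " + CountWithInterlocked(4, 1000000));
        Console.WriteLine("plain ++:    " + CountWithPlainIncrement(4, 1000000));
    }
}
```

If the variable is only ever touched by one thread, the plain version is both correct and cheaper, which is why Interlocked.Increment is not a drop-in replacement for x++.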
Think about it for a moment, and you'll realize an Increment call cannot be any faster than a simple application of the increment operator. If it were, then the compiler's implementation of the increment operator would call Increment internally, and they'd perform the same. But, as you can see by testing it for yourself, they don't perform the same.
The two options have different purposes. Use the increment operator generally. Use Increment when you need the operation to be atomic and you're sure all other users of that variable are also using interlocked operations. (If they're not all cooperating, then it doesn't really help.)
In our experience, InterlockedIncrement() and its relatives on Windows have a quite significant performance impact. In one sample case we were able to eliminate the interlock and use ++/-- instead. This alone reduced run time from 140 seconds to 110 seconds. My analysis is that the interlock forces a memory round trip (otherwise how could other cores see it?). An L1 cache read/write is around 10 clock cycles, but a memory read/write is more like 100.
In this sample case, I estimated the number of increment/decrement operations at about 1 billion. So on a 2 GHz CPU this is something like 5 seconds for the ++/--, and 50 seconds for the interlock. Spread the difference across several threads, and it's close to 30 seconds.
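Spelling out that back-of-the-envelope estimate, using the rough figures above (about 10 cycles per cached increment, about 100 cycles per interlocked one, 10^9 operations, 2 GHz clock):

\[
t_{\text{++}} \approx \frac{10^{9} \times 10\ \text{cycles}}{2 \times 10^{9}\ \text{cycles/s}} = 5\ \text{s},
\qquad
t_{\text{interlocked}} \approx \frac{10^{9} \times 100\ \text{cycles}}{2 \times 10^{9}\ \text{cycles/s}} = 50\ \text{s}
\]

That is a difference of about 45 seconds of single-core work. The cycle counts are order-of-magnitude guesses, not measurements.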
It's slower. However, it's the most performant general way I know of for achieving thread safety on scalar variables.
My performance test:
volatile: 65,174,400
lock: 62,428,600
interlocked: 113,248,900
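The code behind those numbers wasn't posted, so the following is only a guess at how such a comparison might be set up, not the original test. It is a single-threaded sketch (all names hypothetical) that times the same number of increments on a volatile field, inside a lock, and through Interlocked.Increment, and reports Stopwatch ticks. The delegate call adds the same overhead to each variant, so only the relative ordering is meaningful.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical timing harness; not the original poster's test.
public static class IncrementTimings
{
    private static volatile int _volatileCounter;
    private static int _lockedCounter;
    private static int _interlockedCounter;
    private static readonly object Gate = new object();

    // Runs body the given number of times and returns elapsed Stopwatch ticks.
    private static long Time(int iterations, Action body)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) body();
        sw.Stop();
        return sw.ElapsedTicks;
    }

    // Times all three variants and returns the final counter values,
    // in the order volatile, lock, interlocked.
    public static int[] RunAll(int iterations)
    {
        _volatileCounter = 0;
        _lockedCounter = 0;
        _interlockedCounter = 0;

        long tVolatile = Time(iterations, () => { _volatileCounter++; });
        long tLock = Time(iterations, () => { lock (Gate) { _lockedCounter++; } });
        long tInterlocked = Time(iterations, () => { Interlocked.Increment(ref _interlockedCounter); });

        Console.WriteLine("volatile:    " + tVolatile + " ticks");
        Console.WriteLine("lock:        " + tLock + " ticks");
        Console.WriteLine("interlocked: " + tInterlocked + " ticks");

        return new[] { _volatileCounter, _lockedCounter, _interlockedCounter };
    }

    public static void Main()
    {
        RunAll(10000000);
    }
}
```

Note that an uncontended, single-threaded run like this says little about behavior under contention, and that volatile alone does not make ++ atomic.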