When different variables sit in the same cache line, you can experience *false sharing*: even if two threads (running on different cores) are accessing two different variables, you take a performance hit whenever those variables reside in the same cache line, because every write triggers cache-coherence traffic.
Now suppose those variables are atomic (by atomic I mean variables that introduce a memory fence, such as C++'s `std::atomic<T>`). Does false sharing still matter, or does it make no difference whether atomic variables share a cache line, since they supposedly trigger cache-coherence traffic anyway? In other words, will putting atomic variables in the same cache line make the application slower than keeping them in separate cache lines?
A clarification: for negative consequences, at least some accesses to the "falsely shared" variables need to be writes. If writes are rare, the performance impact of false sharing is fairly negligible; the more writes (and thus cache-line invalidation messages), the worse the performance.
Even with atomics, cache-line sharing (either false or true) still matters. Look for some evidence here: http://www.1024cores.net/home/lock-free-algorithms/first-things-first. Thus, the answer is: yes, placing atomic variables used by different threads on the same cache line may make the application slower than placing them on two different lines. However, I think the effect will mostly go unnoticed, unless the app spends a significant portion of its time updating these atomic variables.
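The usual remedy is to pad each atomic onto its own cache line. A minimal sketch (the names `kCacheLine`, `PaddedCounters`, `hits`, and `misses` are my own illustrations, not from the answer; the 64-byte fallback is an assumption about common x86-64 hardware):

```cpp
#include <atomic>
#include <cstddef>
#include <new>  // std::hardware_destructive_interference_size (C++17, optional)

#ifdef __cpp_lib_hardware_interference_size
constexpr std::size_t kCacheLine = std::hardware_destructive_interference_size;
#else
constexpr std::size_t kCacheLine = 64;  // assumed line size, common on x86-64
#endif

// Aligning each atomic to its own cache line prevents false sharing between
// threads that update `hits` and `misses` independently.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> hits{0};
    alignas(kCacheLine) std::atomic<long> misses{0};
};

static_assert(sizeof(PaddedCounters) >= 2 * kCacheLine,
              "hits and misses occupy separate cache lines");
```

The trade-off is memory: each counter now occupies a full line instead of 8 bytes, which is why this is worth doing only for heavily written variables.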
If you use atomic variables with the strongest consistency requirements (a full memory barrier), the effect of false sharing will probably not be noticeable. For such an access, the performance of the atomic operation is basically limited by the memory-access latency. Things are slow anyhow, so I don't think they would get much slower in the presence of false sharing.
With other, less intrusive memory orderings, the performance cost of the atomics themselves may be lower, so the impact of false sharing might be significant.
Overall, I would first look at the performance of the atomic operation itself before worrying about false sharing for such operations.
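To make the ordering distinction concrete, here is a sketch (the `counter` and `record_event` names are illustrative, not from the answer):

```cpp
#include <atomic>

std::atomic<long> counter{0};

void record_event() {
    // Default ordering is memory_order_seq_cst: on x86-64 this compiles to a
    // LOCKed read-modify-write, whose latency usually dwarfs everything else.
    counter.fetch_add(1);

    // memory_order_relaxed guarantees atomicity but imposes no ordering; on
    // weakly ordered CPUs (ARM, POWER) it can be noticeably cheaper, which
    // makes any extra coherence traffic from false sharing a larger share of
    // the total cost.
    counter.fetch_add(1, std::memory_order_relaxed);
}
```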
> will putting atomic variables in the same cache line make application slower than not putting them in the same cache line?
False sharing of "atomic" variables could lead to performance problems (whether or not it will lead to such problems depends on a lot of things).
Let's say you have two cores, `A` and `B`, and each operates on its own variable. Let's call these variables `a` and `b` respectively. `A` has `a` in its cache, and `B` has `b` in its cache.

Consider what happens when `A` increments `a`:

- if `a` and `b` share a cache line, `B`'s copy of `b` will get invalidated, and its next access to `b` will incur a cache miss.
- if `a` and `b` don't share a cache line, there's no impact on `B` as far as its cached copy of `b` is concerned.

This happens regardless of whether `a` and `b` are "atomic".
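The two-core scenario above can be sketched as a runnable demo; the `Unpadded` struct and `run_demo` function are hypothetical names I've introduced, and the packed layout is assumed to land both atomics on one line:

```cpp
#include <atomic>
#include <thread>

// Packed layout: `a` and `b` sit next to each other, so they almost certainly
// share a cache line.
struct Unpadded {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

long run_demo(int iters) {
    Unpadded s;
    auto bump = [iters](std::atomic<long>& v) {
        for (int i = 0; i < iters; ++i)
            // Every write invalidates the other core's copy of the shared line.
            v.fetch_add(1, std::memory_order_relaxed);
    };
    // Each thread touches only its own variable, yet the writes still
    // ping-pong the cache line between cores: false sharing.
    std::thread t1(bump, std::ref(s.a));
    std::thread t2(bump, std::ref(s.b));
    t1.join();
    t2.join();
    // Results are identical with or without padding; only the runtime differs.
    return s.a.load() + s.b.load();
}
```

Adding `alignas` padding between `a` and `b` would not change the computed result, only how long the loops take when the threads run on different cores.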