What is the impact of SFENCE and LFENCE on the caches of neighboring cores?

Posted 2019-02-19 03:34

From Herb Sutter's talk, the figure on page 2 of the slides: https://skydrive.live.com/view.aspx?resid=4E86B0CF20EF15AD!24884&app=WordPdf&wdo=2&authkey=!AMtj_EflYn2507c

The figure shows the L1 cache and the Store Buffer (SB) as separate units.

1. On Intel x86 processors, are the L1 cache and the Store Buffer the same thing?

And the next slide:

As the next slide shows, only the following reordering is possible on x86. Before:

MOV eax, [memory1] // read
MOV [memory2], edx // write
... // MOV, MFENCE, ADD ... any other code

After:

MOV [memory2], edx // write
MOV eax, [memory1] // read
... // MOV, MFENCE, ADD ... any other code

This is due to out-of-order execution in the processor pipeline.

2. Can you show another example like this one, showing how the Store Buffer affects reordering?

3. And the main question: how do LFENCE and SFENCE affect the caches of neighboring cores?

Is it correct to say that:

  1. SFENCE does a "push", i.e. it flushes Store Buffer->L1 and then sends the changes from the Core0 L1/L2 caches to the L1/L2 caches of all other cores (Core1/2/3...)?
  2. LFENCE does a "pull", i.e. it fetches changes from the L1/L2 caches (and Store Buffers?) of all other cores (Core1/2/3...) into the Core0 L1/L2 caches?

1 Answer
做个烂人
#2 · 2019-02-19 03:42
  1. The store buffer is not a cache; it is an ordering queue that holds pending stores. The cache, by contrast, can be thought of as a logical part of memory: everything in any of the caches is visible to all other agents, and the caches must answer snoops correctly.

  2. Stores are not reordered. That would break memory ordering, because stores become visible to other agents immediately (unlike loads, which only affect internal state).

  3. Fences do not act on caches and have nothing to do with other cores. Caches are already fully visible and kept in sync. Fences only constrain execution order (in case it is done out of order internally), and therefore apply only to the current context.
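The one reordering x86 does allow, a later load completing while an earlier store to a different address still sits in the store buffer, can be probed with the classic store-buffer litmus test. The following is only a minimal sketch in portable C++ rather than raw assembly; the function name `count_both_zero` and the per-iteration thread setup are assumptions made for illustration, not something from the slides. With the full fence between the store and the load in both threads, the C++ memory model rules out the outcome where both loads see 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Store-buffer (SB) litmus test.
// Returns how many iterations ended with both loads observing 0, i.e. each
// thread's load took effect before the other thread's store became visible.
int count_both_zero(int iterations, bool use_fence) {
    assert(iterations > 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0};
        int r1 = -1, r2 = -1;

        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);            // store to x
            if (use_fence)                                    // full barrier
                std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = y.load(std::memory_order_relaxed);           // load from y
        });
        std::thread t2([&] {
            y.store(1, std::memory_order_relaxed);            // store to y
            if (use_fence)
                std::atomic_thread_fence(std::memory_order_seq_cst);
            r2 = x.load(std::memory_order_relaxed);           // load from x
        });
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Calling `count_both_zero(n, true)` must return 0. Without the fence a nonzero count is allowed but not guaranteed; because this sketch starts fresh threads on every iteration, the two threads rarely overlap closely enough to show it, so a run that reports 0 without the fence proves nothing.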

Is it correct to say that:

  1. SFENCE does a "push", i.e. it flushes Store Buffer->L1 and then sends the changes from the Core0 L1/L2 caches to the L1/L2 caches of all other cores (Core1/2/3...)?
  2. LFENCE does a "pull", i.e. it fetches changes from the L1/L2 caches (and Store Buffers?) of all other cores (Core1/2/3...) into the Core0 L1/L2 caches?

sfence/mfence would flush the store buffer, since they do not allow pending speculative stores to remain (that is why they are fences). However, as I said, once the changes are in L1 they are already observable by everyone; they do not have to be flushed anywhere further.

In the same sense, lfence doesn't "pull" anything. It just stalls the execution of all younger loads until the older ones (and the fence itself) have finished and committed. This affects performance by serializing the loads, but does not otherwise protect you against any operation on other cores, unless you have another way to make sure that any store you depend on has been performed by then (and, in that case, that the load picks up the updated value in time).
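One common "other way" to be sure that the store you need has been performed is a flag that the reader waits on. The sketch below, in portable C++, assumes a hypothetical function `receive_via_flag` with one producer and one consumer thread. No explicit push or pull appears anywhere: the release store and the acquire load only order each thread's own accesses, and on x86 compilers typically emit plain MOV instructions for both.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Message passing through a flag: the consumer reads the payload only after
// it has observed the producer's flag, so it always sees the written value.
int receive_via_flag() {
    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready{false};     // flag published by the producer
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // write the data
        ready.store(true, std::memory_order_release);   // then publish the flag
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {
            // spin until the producer's flag store is observed
        }
        seen = payload;                                 // ordered after the flag load
    });

    producer.join();
    consumer.join();
    assert(seen == 42);   // guaranteed by the release/acquire pairing
    return seen;
}
```

The waiting loop, not a fence, is what makes the consumer see the data: cache coherence delivers the producer's stores once they leave its store buffer, and the ordering only ensures that the payload store is not observed after the flag store.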
