Does SFENCE prevent the store buffer from hiding changes?

Posted 2019-02-15 11:29

Question:

Suppose a core writes to a cache line that is not present in its L1, so the write goes into the store buffer. Another core then requests that cache line; MESI cannot see the update sitting in the store buffer and returns the unmodified cache line. The store buffer is flushed shortly afterwards, but the second core is already using the older value.

I don't see how an SFENCE solves this problem. Yes, the cache line will be updated sooner, but the first core still has to wait before it can write the value to L1, and during that time the second core can issue its read request.

Answer 1:

No, it doesn't prevent the core from "hiding" the stores from MESI (perhaps better called something like the cache-coherent domain). In fact, as pointed out in the comments to the OP, SFENCE has no effect on normal x86 stores, which are already strongly ordered. It is only useful as a fence between stores when at least one of them is an NT (non-temporal) store, a store to WC (write-combining) memory, etc.
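As a rough sketch of the one case where SFENCE does matter, the C++ below publishes a value with a non-temporal store and then sets a flag with a normal store. It is x86-only and uses the real intrinsics `_mm_stream_si32` and `_mm_sfence` from `<immintrin.h>`; the names `payload`, `ready`, `publish`, `consume` and `publish_and_consume` are made up for this illustration and do not come from the answer.

```cpp
#include <atomic>
#include <cassert>
#include <immintrin.h>  // _mm_stream_si32, _mm_sfence (x86 only)
#include <thread>

// Hypothetical shared state for this sketch.
alignas(64) static int payload = 0;
static std::atomic<bool> ready{false};

// Producer: NT store to the payload, SFENCE, then a normal store to the flag.
void publish(int value) {
    _mm_stream_si32(&payload, value);  // non-temporal store, weakly ordered
    _mm_sfence();                      // make the NT store visible before later stores
    ready.store(true, std::memory_order_release);  // ordinary store on x86
}

// Consumer: spin until the flag is set, then read the payload.
int consume() {
    while (!ready.load(std::memory_order_acquire)) {
        // busy-wait
    }
    return payload;
}

// Single-use helper: runs one producer and one consumer thread.
int publish_and_consume(int value) {
    assert(!ready.load(std::memory_order_relaxed));  // flag must start cleared
    int seen = -1;
    std::thread consumer([&seen] { seen = consume(); });
    std::thread producer([value] { publish(value); });
    producer.join();
    consumer.join();
    return seen;
}
```

If `payload` were written with an ordinary store instead, the `_mm_sfence()` call would add nothing on x86, which is the point made above: normal stores already become visible in program order.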

The "hiding" here isn't really problematic. x86 has a total store order (TSO): there is a single global order of stores that is observed by most operations. This order is basically the order in which stores leave the store buffer (are committed to L1). It is not the order in which they enter the buffer, or even the order in which the stores retire. So while a store is still in the store buffer, it effectively hasn't occurred yet in the total store order, and it is invisible in the cache-coherent domain.

The only reordering this causes (on x86) is that later loads can appear to pass earlier stores: a later load reads from the "global order" when it executes (e.g., hits in L1), while an earlier store may still be sitting in the store buffer, which (as above) means it hasn't become part of the global order yet. Preventing that reordering would be prohibitively expensive, but all the other reorderings are prevented simply by keeping things in order (load-load and store-store), plus some other mechanism that ensures later stores aren't committed until earlier loads have completed.

If you want to "solve" the store buffer problem, then MFENCE is your solution. It effectively drains the store buffer before any later load is allowed to proceed.
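The store-then-load reordering, and the effect of a full fence on it, can be sketched with the classic store-buffer litmus test. This is a minimal illustration in portable C++ rather than raw assembly: `std::atomic_thread_fence(std::memory_order_seq_cst)` stands in for MFENCE (x86 compilers typically emit MFENCE or a locked instruction for it), and the function name `count_both_zero` is made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Runs the store-buffer litmus test `iterations` times and returns how often
// BOTH threads read 0, i.e. each load ran ahead of the other thread's store.
int count_both_zero(int iterations, bool use_fence) {
    assert(iterations > 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0}, y{0}, arrived{0};
        int r1 = -1, r2 = -1;

        auto body = [&](std::atomic<int>& mine, std::atomic<int>& other, int& out) {
            arrived.fetch_add(1);
            while (arrived.load() < 2) {
                // line both threads up so the accesses overlap
            }
            mine.store(1, std::memory_order_relaxed);   // may sit in the store buffer
            if (use_fence) {
                // Full barrier: the store above is globally visible
                // before the load below executes.
                std::atomic_thread_fence(std::memory_order_seq_cst);
            }
            out = other.load(std::memory_order_relaxed);
        };

        std::thread t1(body, std::ref(x), std::ref(y), std::ref(r1));
        std::thread t2(body, std::ref(y), std::ref(x), std::ref(r2));
        t1.join();
        t2.join();

        if (r1 == 0 && r2 == 0) {
            ++both_zero;
        }
    }
    return both_zero;
}
```

With `use_fence` set to `true` the function must return 0, because at least one thread is guaranteed to see the other's store. With `use_fence` set to `false` a non-zero count is allowed but not guaranteed; how often it shows up depends on the hardware, the compiler, and timing.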



Answer 2:

As stated in the previous answers, stores will (eventually) become globally visible to other cores in the order they were issued (program order). 'Eventually' is the key word: SFENCE places a literal fence at the point where the store buffer is drained and the writes in the buffer are made 'globally visible'.

So, yes, SFENCE instructions cause the data in the store buffer to be drained to the cache. This is explained in Section 11.10 of the Software Developer's Manual (SDM).

The SFENCE instruction is also described as:

This serializing operation guarantees that every store instruction that precedes the SFENCE instruction in program order becomes globally visible before any store instruction that follows the SFENCE instruction.

Reads passing writes in the buffer is irrelevant in this context.