Can someone explain it in a language that mere mortals understand?
Question:
Answer 1:
[[carries_dependency]] is used to allow dependencies to be carried across function calls. This potentially allows the compiler to generate better code when used with std::memory_order_consume for transferring values between threads on platforms with weakly-ordered architectures such as IBM's POWER architecture.
In particular, if a value read with memory_order_consume is passed in to a function, then without [[carries_dependency]] the compiler may have to issue a memory fence instruction to guarantee that the appropriate memory ordering semantics are upheld. If the parameter is annotated with [[carries_dependency]], the compiler can assume that the function body will correctly carry the dependency, and this fence may no longer be necessary.
Similarly, if a function returns a value loaded with memory_order_consume, or derived from such a value, then without [[carries_dependency]] the compiler may be required to insert a fence instruction to guarantee that the appropriate memory ordering semantics are upheld. With the [[carries_dependency]] annotation, this fence may no longer be necessary, as the caller is now responsible for maintaining the dependency chain.
e.g.
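#include <atomic>
#include <iostream>

void print(int* val)
{
    std::cout << *val << std::endl;
}

// The attribute applies to the parameter, so it follows the parameter name.
void print2(int* val [[carries_dependency]])
{
    std::cout << *val << std::endl;
}

std::atomic<int*> p;

void reader()
{
    int* local = p.load(std::memory_order_consume);
    if (local)
        std::cout << *local << std::endl; // 1
    if (local)
        print(local); // 2
    if (local)
        print2(local); // 3
}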
In line (1), the dependency is explicit, so the compiler knows that local is dereferenced, and that it must ensure that the dependency chain is preserved in order to avoid a fence on POWER.
In line (2), the definition of print is opaque (assuming it isn't inlined), so the compiler must issue a fence in order to ensure that dereferencing the pointer inside print returns the correct value.
In line (3), the compiler can assume that, although print2 is also opaque, the dependency from the parameter to the dereferenced value is preserved in the instruction stream, and no fence is necessary on POWER. Obviously, the definition of print2 must actually preserve this dependency, so the attribute will also impact the generated code for print2.
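To see both the parameter case and the return-value case in one place, here is a minimal sketch of a producer and a consumer. The names Message, slot, take, read_value and run_demo are hypothetical and chosen only for illustration. Note also that, as far as I know, mainstream compilers currently implement memory_order_consume as memory_order_acquire, so the attribute may not change the generated code in practice.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Message { int value; };          // hypothetical payload type

std::atomic<Message*> slot{nullptr};    // hypothetical publication point

// Return-value case: the dependency from the consume load is carried
// out of the function, so the caller continues the dependency chain.
[[carries_dependency]] Message* take()
{
    Message* m;
    while ((m = slot.load(std::memory_order_consume)) == nullptr)
        std::this_thread::yield();      // spin until something is published
    return m;
}

// Parameter case: the dependency is carried into the function,
// and the dereference below is part of the same chain.
int read_value(Message* m [[carries_dependency]])
{
    assert(m != nullptr);               // precondition of this sketch
    return m->value;
}

int run_demo()
{
    Message msg{0};
    slot.store(nullptr, std::memory_order_relaxed);   // reset for repeated calls

    std::thread producer([&msg] {
        msg.value = 42;                               // write the payload first
        slot.store(&msg, std::memory_order_release);  // then publish the pointer
    });

    int seen = read_value(take());      // load, pass across two calls, dereference
    producer.join();                    // msg must outlive the producer thread
    return seen;
}
```

Calling run_demo() from main should return 42, because the write to msg.value happens before the release store and the consumer's read is dependency-ordered after the matching consume load.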
Answer 2:
In short, I think that if the carries_dependency attribute is present, the generated code for the function should be optimized for the case where the actual argument really does come from another thread and carries a dependency. The same applies to a return value. Performance may suffer if that assumption does not hold (for example, in a single-threaded program), but the absence of [[carries_dependency]] may hurt performance in the opposite case. The attribute should have no effect other than on performance.
For example, the result of dereferencing a pointer depends on how the pointer was obtained. If the value of the pointer p comes from another thread (through a "consume" operation), the values that the other thread assigned to *p beforehand are taken into account and are visible. There may be another pointer q that is equal to p (q == p), but because its value does not come from that other thread, the value of *q may appear different from *p. In fact, reading *q may be a sort of "undefined behavior", because it accesses the memory location without coordinating with the thread that made the assignment.
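The difference between p and q can be sketched as follows. This is only an illustration under assumptions: Node, storage, published and run_provenance_demo are hypothetical names, and the unordered read through q is left commented out so that the program itself stays free of data races.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Node { int payload; };            // hypothetical payload type

Node storage{0};                         // hypothetical shared object
std::atomic<Node*> published{nullptr};   // hypothetical publication point

int run_provenance_demo()
{
    storage.payload = 0;
    published.store(nullptr, std::memory_order_relaxed);  // reset for repeated calls

    std::thread writer([] {
        storage.payload = 7;                                  // assign *p ...
        published.store(&storage, std::memory_order_release); // ... then publish p
    });

    Node* p;
    while ((p = published.load(std::memory_order_consume)) == nullptr)
        std::this_thread::yield();

    Node* q = &storage;        // equal to p, but not obtained from the consume load
    assert(p == q);

    int via_p = p->payload;    // carries a dependency from the load: sees 7
    // int racy = q->payload;  // NOT ordered by the consume load: formally a data race

    writer.join();             // join() synchronizes with the end of the writer
    int via_q = q->payload;    // safe only because of the join above

    return (via_p == via_q) ? via_p : -1;
}
```

Calling run_provenance_demo() should return 7. The read through p is ordered by the dependency on the consume load, while the read through q becomes safe only after join() provides separate synchronization.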
Really, it seems there is some big bug in how memory (and the mind) works in certain engineering cases... >:-)