Does x86-SSE-instructions have an automatic releas

As we know from from C11-memory_order: http://en.cppreference.com/w/c/atomic/memory_order

And the same from C++11-std::memory_order: http://en.cppreference.com/w/cpp/atomic/memory_order

On strongly-ordered systems (x86, SPARC, IBM mainframe), release-acquire ordering is automatic. No additional CPU instructions are issued for this synchronization mode, only certain compiler optimizations are affected (e.g. the compiler is prohibited from moving non-atomic stores past the atomic store-release or perform non-atomic loads earlier than the atomic load-acquire)

But is this true for x86-SSE-instructions (except of [NT] - non-temporal, where we always must use L/S/MFENCE)?

Here said, that "sse instructions ... is no requirement on backwards compatibility and memory order is undefined". It is believed that the strict orderability left for compatibility with older versions of processors x86, when it was needed, but new commands, namely SSE(except of [NT]) - deprived automatically release-acquire of order, is it?

标签： c++11 x86 sse c11 memory-barriers

2条回答

别忘想泡老子

2楼-- · 2019-01-27 19:40

Here is an excerpt from Intel's Software Developers Manual, volume 3, section 8.2.2 (the edition 325384-052US of September 2014):

Reads are not reordered with other reads.

Writes are not reordered with older reads.

Writes to memory are not reordered with other writes, with the following exceptions:

writes executed with the CLFLUSH instruction;

streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and

string operations (see Section 8.2.4.1).

Reads may be reordered with older writes to different locations but not with older writes to the same location.

Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.

Reads cannot pass earlier LFENCE and MFENCE instructions.

Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.

LFENCE instructions cannot pass earlier reads.

SFENCE instructions cannot pass earlier writes.

MFENCE instructions cannot pass earlier reads or writes.

The first three bullets describe the release-acquire ordering, and the exceptions are explicitly listed there. As you might see, only cacheability control instructions (MOVNT*) are in the exception list, while the rest of SSE/SSE2 and other vector instructions obey to the general memory ordering rules, and do not require use of [LSM]FENCE.

0人赞添加讨论(0) 举报

beautiful°

3楼-- · 2019-01-27 19:58

It is true that normal¹ SSE load and store instructions, as well the implied load when using a memory source operand, have the same acquire and release behavior in terms of ordering as normal loads and stores of GP registers.

They are not, however, generally useful directly to implement std::memory_order_acquire or std::memory_order_release operations on std::atomic objects larger than 8 bytes because there is no guarantee of atomicity for SSE or AVX loads and stores of larger than 8 bytes. The missing guarantee isn't just theoretical: there are several implementations (including brand new ones like AMD's Ryzen) that split large loads or stores up into two smaller ones.

¹ I.e., those not listed in the exception list in the accepted answer: NT stores, clflush and string operations.

0人赞添加讨论(0) 举报

Does x86-SSE-instructions have an automatic releas

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间