I tried the following code as a naive attempt to implement swapping of R
and B
bytes in an ABGR
word
#include <stdio.h>
#include <stdint.h>
uint32_t ABGR_to_ARGB(uint32_t abgr)
{
return ((abgr ^= (abgr >> 16) & 0xFF) ^= (abgr & 0xFF) << 16) ^= (abgr >> 16) & 0xFF;
}
int main()
{
uint32_t tmp = 0x11223344;
printf("%x %x\n", tmp, ABGR_to_ARGB(tmp));
}
To my surprise this code "worked" in GCC in C++17 mode - the bytes were swapped
http://coliru.stacked-crooked.com/a/43d0fc47f5539746
But it is not supposed to swap bytes! C++17 clearly states that the RHS of assignment is supposed to be [fully] sequenced before the LHS, which applies to compound assignment as well. This means that in the above expression each RHS of each ^=
is supposed to use the original value of abgr
. Hence the ultimate result in abgr
should simply have B
byte XORed by R
byte. This is what Clang appears to produce (amusingly, with a sequencing warning)
http://coliru.stacked-crooked.com/a/eb9bdc8ced1b5f13
A quick look at GCC assembly
https://godbolt.org/g/1hsW5a
reveals that it seems to sequence it backwards: LHS before RHS. Is this a bug? Or is this some sort of conscious decision on GCC's part? Or am I misunderstanding something?
The exact same behavior is exhibited by int a = 1; (a += a) += a;
, for which GCC calculates a == 4
afterwards and clang a == 3
.
The underlying ambiguity arises from this part of the standard (from working draft N4762):
[expr.ass]: 7.6.18 Assignment and compound assignment operators
Paragraph 1: The assignment operator (=) and the compound assignment operators all group right-to-left. All require a
modifiable lvalue as their left operand; their result is an lvalue referring to the left operand. The result in all
cases is a bit-field if the left operand is a bit-field. In all cases, the assignment is sequenced after the value
computation of the right and left operands, and before the value computation of the assignment expression.
The right operand is sequenced before the left operand. With respect to an indeterminately-sequenced
function call, the operation of a compound assignment is a single evaluation.
Paragraph 7: The behavior of an expression of the form E1 op = E2 is equivalent to E1 = E1 op E2 except that E1 is
evaluated only once. In += and -=, E1 shall either have arithmetic type or be a pointer to a possibly
cv-qualified completely-defined object type. In all other cases, E1 shall have arithmetic type.
GCC seems to be using this rule to internally transfrom (a += a) += a
to (a = a + a) += a
to a = (a = a + a) + a
(since a = a + a
has to be evaluated only once) - and for this expression the sequencing rules are correctly applied.
Clang however seems to do that last transformation step differently: auto temp = a + a; temp = temp + a; a = temp;
Both compilers give a warning about this, though (from the original code):
So the compiler writers know about this ambiguity and decided to prioritize differently (GCC: Paragraph 7 > Paragraph 1; clang: Paragraph 1 > Paragraph 7).
This seems to be a defect in the standard.
Do not make things more complicated than necessary. You can swap the 2 components in a fairly straightforward way without painting yourself into dark corners of the language:
uint32_t ABGR_to_ARGB(uint32_t abgr) {
constexpr uint32_t mask = 0xff00ff00;
uint32_t grab = abgr >> 16 | abgr << 16;
return (abgr & mask) | (grab & ~mask);
}
It also generates much better assembly than the original version. On x86 it uses single rol
instruction for the 3 bitwise operators to produce grab
:
ABGR_to_ARGB(unsigned int):
mov eax, edi
and edi, -16711936
rol eax, 16
and eax, 16711935
or eax, edi
ret