Why does optimisation kill this function?

Posted 2019-01-01 15:27

We recently had a lecture at university about programming peculiarities in several languages.

The lecturer wrote down the following function:

inline u64 Swap_64(u64 x)
{
    u64 tmp;
    (*(u32*)&tmp)       = Swap_32(*(((u32*)&x)+1));
    (*(((u32*)&tmp)+1)) = Swap_32(*(u32*) &x);

    return tmp;
}

While I fully understand that this is also really bad style in terms of readability, his main point was that this code worked fine in production until they enabled a high optimization level. Then the code would just do nothing.

He said that all the assignments to the variable tmp would be optimized out by the compiler. But why would this happen?

I understand that there are circumstances where variables need to be declared volatile so that the compiler doesn't touch them even if it thinks that they are never read or written, but I don't see why this would happen here.

3 answers
春风洒进眼中
#2 · 2019-01-01 15:51

This code violates the strict aliasing rules, which make it illegal to access an object through a pointer of a different type, although access through a char * is allowed. The compiler is allowed to assume that pointers of different types do not point to the same memory and to optimize accordingly. It also means the code invokes undefined behavior and could really do anything.
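To illustrate the character-type exception, here is a minimal sketch of a 64-bit byte swap that only ever touches the object through unsigned char *. The names swap64_bytes and check_swap64_bytes are hypothetical and not from the question; the sketch assumes the intent of Swap_64 is a full byte reversal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the question): reverses the bytes of a
// 64-bit value. Both objects are accessed only through unsigned char*,
// which the aliasing rules explicitly permit.
std::uint64_t swap64_bytes(std::uint64_t x)
{
    std::uint64_t result = 0;
    const unsigned char* src = reinterpret_cast<const unsigned char*>(&x);
    unsigned char* dst = reinterpret_cast<unsigned char*>(&result);

    // Copy byte i of the source into the mirrored position of the result.
    for (std::size_t i = 0; i < sizeof x; ++i) {
        dst[i] = src[sizeof x - 1 - i];
    }
    return result;
}

// Small sanity check; reversing the byte order is independent of endianness.
void check_swap64_bytes()
{
    assert(swap64_bytes(0x0102030405060708ULL) == 0x0807060504030201ULL);
}
```

Because no u32 * is ever formed, the optimizer has no aliasing assumption to exploit here.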

One of the best references for this topic is Understanding Strict Aliasing and we can see the first example is in a similar vein to the OP's code:

uint32_t swap_words( uint32_t arg )
{
  uint16_t* const sp = (uint16_t*)&arg;
  uint16_t        hi = sp[0];
  uint16_t        lo = sp[1];

  sp[1] = hi;
  sp[0] = lo;

  return (arg);
}

The article explains that this code violates the strict aliasing rules, since sp is an alias of arg but the two have different types, and says that although it will compile, arg is likely to be unchanged after swap_words returns. With simple tests I was unable to reproduce that result with either the code above or the OP's code, but that does not mean anything: this is undefined behavior and therefore not predictable.

The article goes on to talk about many different cases and presents several working solutions, including type-punning through a union, which is well-defined in C99 (see footnote 1) and may be undefined in C++ but in practice is supported by most major compilers; for example, here is gcc's reference on type-punning. The previous thread Purpose of Unions in C and C++ goes into the gory details. Although there are many threads on this topic, this one seems to do the best job.

The code for that solution is as follows:

typedef union
{
  uint32_t u32;
  uint16_t u16[2];
} U32;

uint32_t swap_words( uint32_t arg )
{
  U32      in;
  uint16_t lo;
  uint16_t hi;

  in.u32    = arg;
  hi        = in.u16[0];
  lo        = in.u16[1];
  in.u16[0] = lo;
  in.u16[1] = hi;

  return (in.u32);
}
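Since the union approach above may be undefined in C++, another option commonly used there is to copy the bytes with std::memcpy. This is not taken from the article or the thread; it is a sketch under the assumption that Swap_32 reverses the four bytes of its argument. swap32, swap64_memcpy and check_swap64_memcpy are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for Swap_32, which the question does not show:
// assumed to reverse the four bytes of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Same structure as the OP's Swap_64, but the two halves are moved with
// memcpy instead of being read and written through casted u32 pointers.
inline std::uint64_t swap64_memcpy(std::uint64_t x)
{
    std::uint32_t halves[2];
    static_assert(sizeof halves == sizeof x, "two u32 halves must cover a u64");
    std::memcpy(halves, &x, sizeof x);      // copy the bytes out of x

    const std::uint32_t first  = swap32(halves[1]);
    const std::uint32_t second = swap32(halves[0]);
    halves[0] = first;
    halves[1] = second;

    std::uint64_t tmp;
    std::memcpy(&tmp, halves, sizeof tmp);  // copy the bytes into the result
    return tmp;
}

void check_swap64_memcpy()
{
    assert(swap64_memcpy(0x0102030405060708ULL) == 0x0807060504030201ULL);
}
```

Compilers generally recognize small fixed-size memcpy calls like these and do not emit an actual library call, so this usually costs nothing compared with the pointer casts.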

For reference the relevant section from the C99 draft standard on strict aliasing is 6.5 Expressions paragraph 7 which says:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types (footnote 76):

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.

and footnote 76 says:

The intent of this list is to specify those circumstances in which an object may or may not be aliased.

and the relevant section from the C++ draft standard is 3.10 Lvalues and rvalues, paragraph 10.

The article Type-punning and strict-aliasing gives a gentler but less complete introduction to the topic and C99 revisited gives a deep analysis of C99 and aliasing and is not light reading. This answer to Accessing inactive union member - undefined? goes over the muddy details of type-punning through a union in C++ and is not light reading either.


Footnotes:

  1. Quoting a comment by Pascal Cuoq: [...] C99 that was initially clumsily worded, appearing to make type-punning through unions undefined. In reality, type-punning through unions is legal in C89, legal in C11, and it was legal in C99 all along, although it took until 2004 for the committee to fix the incorrect wording, and the subsequent release of TC3. open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm
君临天下
#3 · 2019-01-01 16:03

In C++, pointer arguments are assumed not to alias (except char*) if they point to fundamentally different types ("strict aliasing" rules). This allows some optimizations.

Here, u64 tmp is never modified through an lvalue of type u64.
The stores go through a u32*, which the compiler may treat as unrelated to u64 tmp, so as far as tmp is concerned they may be seen as no-ops.
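A minimal sketch of the kind of assumption involved, using hypothetical names (store_both, call_with_distinct_objects) that are not from the question. The call shown passes two genuinely distinct objects, so it is well-defined; what a given compiler actually emits is not guaranteed.

```cpp
#include <cassert>

// Hypothetical function: int and float are unrelated types, so under
// strict aliasing the compiler may assume a and b point to different objects.
int store_both(int* a, float* b)
{
    *a = 1;
    *b = 2.0f;   // assumed not to modify *a
    return *a;   // an optimizer may therefore return the constant 1 without reloading
}

// Well-defined use: the two pointers really do refer to distinct objects.
int call_with_distinct_objects()
{
    int i = 0;
    float f = 0.0f;
    const int r = store_both(&i, &f);
    assert(f == 2.0f);
    return r;
}
```

In the OP's function the roles are played by u64 tmp and the u32* stores: the compiler may reason about the write through one type without considering that it changes the object of the other type.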

宁负流年不负卿
#4 · 2019-01-01 16:05

g++ (Ubuntu/Linaro 4.8.1-10ubuntu9) 4.8.1:

> g++ -Wall -std=c++11 -O0 -o sample sample.cpp

> g++ -Wall -std=c++11 -O3 -o sample sample.cpp
sample.cpp: In function ‘uint64_t Swap_64(uint64_t)’:
sample.cpp:10:19: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     (*(uint32_t*)&tmp)       = Swap_32(*(((uint32_t*)&x)+1));
                   ^
sample.cpp:11:54: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
     (*(((uint32_t*)&tmp)+1)) = Swap_32(*(uint32_t*) &x);
                                                      ^

Clang 3.4 doesn't warn at any optimization level, which is curious...
