C++ strict aliasing when not using pointer returne

2019-03-27 12:34发布

Can this potentially cause undefined behaviour?

uint8_t storage[4];

// We assume storage is properly aligned here.
int32_t* intPtr = new((void*)storage) int32_t(4);

// I know this is ok:
int32_t value1 = *intPtr;
*intPtr = 5;

// But can one of the following cause UB?
int32_t value2 = reinterpret_cast<int32_t*>(storage)[0];
reinterpret_cast<int32_t*>(storage)[0] = 5;

char has special rules for strict-aliasing. If I use char instead of uint8_t is it still Undefined Behavior? What else changes?

As member DeadMG pointed out, reinterpret_cast is implementation dependent. If I use a C-style cast (int32_t*)storage instead, what would change?

2条回答
放我归山
2楼-- · 2019-03-27 12:57

Your version using the usual placement new is indeed fine.

There is an interpretation1 of §§ 3.8/1 and 3.8/4 where objects of trivial types are able to ‘vanish’ and ‘appear’ on demand. This not a free pass that allows disregarding aliasing rules, so notice:

std::uint16_t storage[2];
static_assert( /* std::uint16_t is not a character type */ );
static_assert( /* storage is properly aligned for our purposes */ );

auto read = *reinterpret_cast<std::uint32_t*>(&storage);
// At this point either we’re attempting to read the value of an
// std::uint16_t object through an std::uint32_t glvalue, a clear
// strict aliasing violation;
// or we’re reading the indeterminate value of a new std::uint32_t
// object freshly constructed in the same storage without effort
// on our part

If on the other hand you swapped the casts around in your second snippet (i.e. reinterpret and write first), you’re not entirely safe either. While under the interpretation you can justify the write to happen on a new std::uint32_t object that reuses the storage implicitly, the subsequent read is of the form

auto value2 = *reinterpret_cast<int32_t*>(storage);

and §3.8/5 says (emphasis mine and extremely relevant):

[…] after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. […] such a pointer refers to allocated storage (3.7.4.2), and using the pointer as if the pointer were of type void*, is well-defined.

§3.8/6 is the same but in reference/glvalue form (arguably more relevant since we’re reusing a name and not a pointer here, but the paragraph is imo harder to understand out of context). Also see §3.8/7, which gives some limited leeway that I don’t think applies in your case.

To make things simpler, the remaining problem is this:

T object;
object.~T();
new (&object) U_thats_really_different_from_T;
&object;                     // Is this allowed? What does it mean?
static_cast<void*>(&object); // Is this?

As it so happens if the type of the storage happens to involve a plain or unsigned character type (e.g. your storage really has type unsigned char[4]) then I’d say you have a basis to justify forming a pointer/reference to the storage of the new object (possibly to be reinterpreted later). See e.g. ¶¶ 5 and 6 again, which have an explicit escape clause for forming a pointer/reference/glvalue and §1.8 The C++ object model that describes how an object involves a constituent array of bytes. The rules governing the pointer conversions should be straightforward and uncontroversial (at least by comparison…).


1: it’s hard to gauge how well this interpretation is received in the community — I’ve seen it on the Boost mailing list, where there was some scepticism towards it

查看更多
可以哭但决不认输i
3楼-- · 2019-03-27 13:05

The pointer returned by placement new can be just as UB-causing as any other pointer when aliasing considerations are brought into it. It's your responsibility to ensure that the memory you placed the object into isn't aliased by anything it shouldn't be.

In this case, you cannot assume that uint8_t is an alias for char and therefore has the special aliasing rules applied. In addition, it would be fairly pointless to use an array of uint8_t rather than char because sizeof() is in terms of char, not uint8_t. You'd have to compute the size yourself.

In addition, reinterpret_cast's effect is entirely implementation-defined, so the code certainly does not have a well-defined meaning.

To implement low-level unpleasant memory hacks, the original memory needs to be only aliased by char*, void*, and T*, where T is the final destination type- in this case int, plus whatever else you can get from a T*, such as if T is a derived class and you convert that derived class pointer to a pointer to base. Anything else violates strict aliasing and hello nasal demons.

查看更多
登录 后发表回答