When is char* safe for strict pointer aliasing?

Posted 2019-01-14 11:30

I've been trying to understand the strict aliasing rules as they apply to the char pointer.

Here this is stated:

It is always presumed that a char* may refer to an alias of any object.

Ok so in the context of socket code, I can do this:

struct SocketMsg
{
   int a;
   int b;
};

int main(int argc, char** argv)
{
   // Some code...
   SocketMsg msgToSend;
   msgToSend.a = 0;
   msgToSend.b = 1;
   send(socket, (char*)(&msgToSend), sizeof(msgToSend), 0);
}

But then there's this statement

The converse is not true. Casting a char* to a pointer of any type other than a char* and dereferencing it is usually in violation of the strict aliasing rule.

Does this mean that when I recv a char array, I can't reinterpret cast to a struct when I know the structure of the message:

struct SocketMsgToRecv
{
    int a;
    int b;
};

int main()
{
    SocketMsgToRecv* pointerToMsg;
    char msgBuff[100];
    ...
    recv(socket, msgBuff, sizeof(msgBuff), 0);
    // Omitting the check that we have a complete message from the stream,
    // but let's assume msgBuff[0] starts a complete msg, and let's interpret the msg

    // SAFE!?!?!?
    pointerToMsg = reinterpret_cast<SocketMsgToRecv*>(&msgBuff[0]);

    printf("Got Msg: a: %i, b: %i", pointerToMsg->a, pointerToMsg->b);
}

Will this second example not work because the base type is a char array and I'm casting it to a struct? How do you handle this situation in a strictly aliased world?

2 Answers
男人必须洒脱
#2 · 2019-01-14 12:07

Correct, the second example violates the strict aliasing rules, so if you compile with the -fstrict-aliasing flag (which GCC enables by default at -O2 and above), you may get incorrect object code. One way to avoid the cast is to use a union here:

union
{
  SocketMsgToRecv msg;
  char msgBuff[100];
};

recv(socket, msgBuff, sizeof(msgBuff), 0);

printf("Got Msg: a: %i, b: %i", msg.a, msg.b);
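An alternative that neither answer in this thread spells out is to copy the bytes into a real struct object with std::memcpy, which avoids both the pointer cast and any dependence on how the buffer is aligned. A minimal sketch, assuming the sender and receiver agree on the struct layout and byte order; decodeMsg is a hypothetical helper name, not part of any socket API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Same layout as the message struct in the question.
struct SocketMsgToRecv
{
    int a;
    int b;
};

// Hypothetical helper: copies the first sizeof(SocketMsgToRecv) bytes of buf
// into out. Returns false if fewer than that many bytes are available.
bool decodeMsg(const char* buf, std::size_t len, SocketMsgToRecv& out)
{
    assert(buf != nullptr);
    if (len < sizeof(SocketMsgToRecv))
        return false;
    // Copying through memcpy never dereferences buf as a SocketMsgToRecv*,
    // so it works for any buffer address, aligned or not.
    std::memcpy(&out, buf, sizeof(SocketMsgToRecv));
    return true;
}
```

After recv fills msgBuff, a call such as decodeMsg(msgBuff, bytesReceived, msg) would populate msg, where bytesReceived stands for the value recv returned. Compilers typically turn a small fixed-size memcpy like this into plain loads and stores.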
时光不老,我们不散
#3 · 2019-01-14 12:12

Re @Adam Rosenfield: The union will achieve alignment so long as the supplier of the char* started out doing something similar.

It may be useful to stand back and figure out what this is all about.

The basis for the aliasing rule is the fact that compilers may place values of different simple types on different memory boundaries to improve access, and that hardware in some cases may require such alignment to be able to use the pointer at all. This can also show up in structs that contain members of different sizes. The struct itself may start on a good boundary, but the compiler may still insert slack (padding) bytes in the interior of the struct to properly align the members that require it.
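The padding described above can be observed directly with sizeof, alignof and offsetof. The following is only a sketch: Mixed and paddingBytes are hypothetical names made up for illustration, and the exact amount of padding depends on the platform and compiler options, so the code asserts only the properties that must hold everywhere.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct mixing a 1-byte member with a wider one.
struct Mixed
{
    char tag;
    int  value;
};

// The int member must start at an offset that suits its own alignment.
static_assert(offsetof(Mixed, value) % alignof(int) == 0,
              "value is placed on an int boundary");

// The struct as a whole is at least as strictly aligned as its members.
static_assert(alignof(Mixed) >= alignof(int),
              "struct alignment covers the most demanding member");

// The size is rounded up to a multiple of the alignment so arrays work.
static_assert(sizeof(Mixed) % alignof(Mixed) == 0,
              "size is a multiple of the alignment");

// Bytes in Mixed that belong to no member (interior plus trailing padding).
constexpr std::size_t paddingBytes()
{
    return sizeof(Mixed) - (sizeof(char) + sizeof(int));
}
```

On many common platforms with a 4-byte int, paddingBytes() would come out as 3, but that figure is an assumption about the target rather than something the language guarantees.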

Considering that compilers often have options for controlling how all of this is handled, or not, you can see that there are many ways that surprises can occur. This is particularly important to be aware of when passing pointers to structs (cast as char* or not) into libraries that were compiled to expect different alignment conventions.

What about char*?

The presumption about char* is that sizeof(char) == 1 (relative to the sizes of all other sizable data) and that char* pointers don't have any alignment requirement. So a genuine char* can always be safely passed around and used successfully without concern for alignment, and that goes for any element of a char[] array, performing ++ and -- on the pointers, and so on. (Oddly, void* is not quite the same.)

Now you should be able to see that if you transfer some sort of structure data into a char[] array that was not itself aligned appropriately, attempting to cast back to a pointer type that does require alignment can be a serious problem.
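To make the alignment concern concrete, here is a small sketch of a check one could run on a buffer address before treating it as a struct. isAlignedFor is a hypothetical helper, and it assumes the usual mapping from pointers to integers found on mainstream platforms. It only addresses alignment; passing the check says nothing about whether the cast is acceptable under the aliasing rules.

```cpp
#include <cassert>
#include <cstdint>

// Same layout as the message struct in the question.
struct SocketMsgToRecv
{
    int a;
    int b;
};

// Hypothetical helper: true if address p is a multiple of alignof(T).
// Assumes converting a pointer to std::uintptr_t yields its numeric address.
template <typename T>
bool isAlignedFor(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A plain char msgBuff[100] carries no alignment guarantee beyond that of char, so isAlignedFor<SocketMsgToRecv>(msgBuff) may or may not hold. Declaring the buffer as alignas(SocketMsgToRecv) char msgBuff[100]; asks the compiler to place it on a suitable boundary.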

If you make a union of a char[] array and a struct, the most-demanding alignment (i.e., that of the struct) will be honored by the compiler. This will work if the supplier and the consumer are effectively using matching unions so that casting of the struct* to char* and back works just fine.
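The alignment claim about the union can be checked at compile time. This is a sketch only: MsgStorage is a hypothetical name for the union shown in the earlier answer, and the assertions cover layout alone, not whether reading msg after writing msgBuff is sanctioned by the language standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same layout as the message struct in the question.
struct SocketMsgToRecv
{
    int a;
    int b;
};

// Hypothetical named version of the union from the earlier answer.
union MsgStorage
{
    SocketMsgToRecv msg;
    char msgBuff[100];
};

// The union takes on the most demanding alignment among its members.
static_assert(alignof(MsgStorage) >= alignof(SocketMsgToRecv),
              "union is aligned suitably for the struct");

// It is also large enough to hold the biggest member.
static_assert(sizeof(MsgStorage) >= sizeof(char[100]),
              "union can hold the whole char buffer");

// char itself has size 1 and no alignment requirement beyond 1.
static_assert(sizeof(char) == 1 && alignof(char) == 1,
              "char is the unit of size and alignment");
```

Because both members start at the same address, msgBuff inside a MsgStorage object always begins on a boundary that suits SocketMsgToRecv.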

In that case, I would hope that the data was created in a similar union before the pointer to it was cast to char* or it was transferred any other way as an array of sizeof(char) bytes. It is also important to make sure any compiler options are compatible between the libraries relied upon and your own code.
