I'm trying to fix two warnings when compiling a specific program using GCC. The warnings are:
warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
and the two culprits are:
unsigned int received_size = ntohl (*((unsigned int*)dcc->incoming_buf));
and
*((unsigned int*)dcc->outgoing_buf) = htonl (dcc->file_confirm_offset);
incoming_buf and outgoing_buf are defined as follows:
char incoming_buf[LIBIRC_DCC_BUFFER_SIZE];
char outgoing_buf[LIBIRC_DCC_BUFFER_SIZE];
This seems subtly different than the other examples of that warning I've been examining. I would prefer to fix the problem rather than disable strict-aliasing checks.
There have been many suggestions to use a union - what might be a suitable union for this case?
Simplified explanation:
1. The C++ standard states that you should attempt to align data yourself; g++ goes the extra mile and generates warnings on the subject.
2. You should only attempt it if you completely understand the data alignment on your architecture/system and inside your code (for example, the code above is a sure thing on Intel 32/64; alignment 1; Win/Linux/BSD/Mac).
3. The only practical reason to use the code above is to avoid compiler warnings, when and if you know what you are doing.
First off, let's examine why you get the aliasing violation warnings.
Aliasing rules simply say that you can only access an object through its own type, its signed / unsigned variant type, or through a character type (char, signed char, unsigned char).
C says violating aliasing rules invokes undefined behavior (so don't!).
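To make the rule summary concrete, here is a small sketch of the two kinds of access that are permitted. The function names are made up for illustration; the forbidden case is only shown in a comment, since executing it is undefined behavior.

```c
/* Allowed: any object may be inspected through a character type. */
unsigned char first_byte(const unsigned int *value)
{
    return *(const unsigned char *)value;
}

/* Allowed: an int may be accessed through its unsigned variant type. */
unsigned int as_unsigned(const int *value)
{
    return *(const unsigned int *)value;
}

/*
 * NOT allowed: accessing an array of char through an unsigned int lvalue.
 *
 *     char buf[16];
 *     unsigned int bad = *(unsigned int *)buf;   // aliasing violation
 */
```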
In this line of your program:
unsigned int received_size = ntohl (*((unsigned int*)dcc->incoming_buf));
although the elements of the incoming_buf array are of type char, you are accessing them as unsigned int. Indeed the result of the dereference operator in the expression *((unsigned int*)dcc->incoming_buf) is of unsigned int type.
This is a violation of the aliasing rules, because you only have the right to access elements of the incoming_buf array through (see rules summary above!) char, signed char or unsigned char.
Notice you have exactly the same aliasing issue in your second culprit:
*((unsigned int*)dcc->outgoing_buf) = htonl (dcc->file_confirm_offset);
You access the char elements of outgoing_buf through unsigned int, so it's an aliasing violation.
Proposed solution
To fix your issue, you could try to have the elements of your arrays directly defined in the type you want to access:
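A minimal sketch of what such declarations might look like. The struct name and the value 1024 are placeholders of mine (the real LIBIRC_DCC_BUFFER_SIZE comes from the library headers), and the sketch assumes the buffer size is a multiple of sizeof(unsigned int):

```c
/* Placeholder value; the real constant is defined by the library. */
#define LIBIRC_DCC_BUFFER_SIZE 1024

/* Hypothetical stand-in for the DCC session structure; only the buffers are shown. */
struct dcc_buffers_sketch {
    /* Same total size in bytes as before, but the element type is now unsigned int. */
    unsigned int incoming_buf[LIBIRC_DCC_BUFFER_SIZE / sizeof(unsigned int)];
    unsigned int outgoing_buf[LIBIRC_DCC_BUFFER_SIZE / sizeof(unsigned int)];
};
```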
(By the way, the width of unsigned int is implementation-defined, so you should consider using uint32_t if your program assumes unsigned int is 32-bit.)
This way you can store unsigned int objects in your array and, when you need byte access, still reach an element through the type char without violating the aliasing rules.
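As a rough illustration of both kinds of access on a buffer whose elements are unsigned int (the file-scope array, its size and the helper names are placeholders for this sketch, not part of the original code):

```c
#include <stddef.h>

/* Hypothetical buffer with unsigned int elements; 256 is a placeholder size. */
static unsigned int outgoing_buf[256];

/* Store a whole word: the lvalue type matches the element type. */
void store_word(size_t index, unsigned int value)
{
    outgoing_buf[index] = value;
}

/* Write a single byte through a char lvalue; character types may alias any object. */
void store_byte(size_t offset, char value)
{
    ((char *)outgoing_buf)[offset] = value;
}

/* Read a single byte back through a char lvalue. */
char load_byte(size_t offset)
{
    return ((char *)outgoing_buf)[offset];
}
```

With buffers declared this way, the first culprit could presumably be written as ntohl (dcc->incoming_buf[0]) and the second as dcc->outgoing_buf[0] = htonl (dcc->file_confirm_offset), with no cast at all.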
EDIT:
I've entirely reworked my answer, in particular I explain why the program gets the aliasing warnings from the compiler.
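Cast the pointer to an integer type and then back to a pointer. Use uintptr_t (from <stdint.h>) rather than unsigned for the intermediate cast, since a plain unsigned can truncate the pointer on 64-bit platforms. Note that this only hides the warning: the buffer is still accessed through an unsigned int lvalue.
unsigned int received_size = ntohl (*((unsigned int *)((uintptr_t) dcc->incoming_buf)));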
If I may, IMHO, for this case the problem is the design of the ntohl and htonl and related function APIs. They should not have been written as taking a numeric argument and returning a numeric value (and yes, I understand the macro optimization point). They should have been designed with the 'n' side being a pointer to a buffer. When this is done, the whole problem goes away and the routine is accurate whatever the endianness of the host. For example (with no attempt to optimize):
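One possible shape for such buffer-based routines is sketched below. The names net_load32 and net_store32 are invented for this example and are not part of any standard API; the code only assumes 8-bit bytes and network (big-endian) byte order in the buffer.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit network-order value from a byte buffer. */
uint32_t net_load32(const void *net)
{
    const unsigned char *p = net;   /* byte access is always permitted */
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Hypothetical helper: encode a host value into a byte buffer in network order. */
void net_store32(void *net, uint32_t host)
{
    unsigned char *p = net;
    p[0] = (unsigned char)(host >> 24);
    p[1] = (unsigned char)(host >> 16);
    p[2] = (unsigned char)(host >> 8);
    p[3] = (unsigned char)host;
}
```

Applied to the question, this would read roughly as received_size = net_load32 (dcc->incoming_buf) and net_store32 (dcc->outgoing_buf, dcc->file_confirm_offset). Because the bytes are combined with shifts, the same code gives the same result on little-endian and big-endian hosts.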
To fix the problem, don't pun and alias! The only "correct" way to read a type T is to allocate a type T and populate its representation if needed:
In short: If you want an integer, you need to make an integer. There's no way to cheat around that in a language-condoned way.
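The "allocate a T and populate its representation" approach could look something like the sketch below, which uses memcpy to fill a real uint32_t object. The function names are made up for illustration, and the sketch assumes a POSIX system (for <arpa/inet.h>) and a 32-bit size field on the wire.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl, htonl (POSIX) */

/* Hypothetical helper: read a network-order 32-bit value from a char buffer. */
uint32_t read_received_size(const char *incoming_buf)
{
    uint32_t n;                          /* a real integer object */
    memcpy(&n, incoming_buf, sizeof n);  /* populate its representation */
    return ntohl(n);
}

/* Hypothetical helper: write a host value to a char buffer in network order. */
void write_confirm_offset(char *outgoing_buf, uint32_t offset)
{
    uint32_t n = htonl(offset);
    memcpy(outgoing_buf, &n, sizeof n);  /* copy the bytes out, no pointer punning */
}
```

The buffers can stay declared as char arrays here, and compilers typically turn a fixed-size memcpy like this into a plain load or store.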
The only pointer conversion which you are allowed (for purposes of I/O, generally) is to treat the address of an existing variable of type T as a char*, or rather, as the pointer to the first element of an array of chars of size sizeof(T).