In ANSI C, offsetof is defined as below.
#define offsetof(st, m) \
((size_t) ( (char *)&((st *)(0))->m - (char *)0 ))
Why doesn't this cause a segmentation fault, since we are dereferencing a NULL pointer? Or is this some sort of compiler hack, where the compiler sees that only the address of the member is taken and so calculates it statically without actually dereferencing anything? Also, is this code portable?
In ANSI C, offsetof is NOT defined like that. One of the reasons it's not defined like that is that some environments will indeed throw null pointer exceptions, or crash in other ways. Hence, ANSI C leaves the implementation of offsetof() open to compiler builders.
The code shown above is typical for compilers/environments that do not actively check for NULL pointers, but fail only when bytes are read from a NULL pointer.
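Since the implementation is left to the compiler, portable code simply uses the macro that <stddef.h> provides. A minimal sketch, assuming a made-up struct packet (the struct and the function names are mine, purely for illustration):

```c
#include <assert.h>
#include <stddef.h>  /* standard home of offsetof and size_t */

/* Hypothetical struct, used only for illustration. */
struct packet {
    char   tag;
    int    length;
    double payload;
};

/* Byte offset of `length`, obtained from the standard macro.
   How offsetof is implemented is the compiler's business. */
size_t packet_length_offset(void)
{
    return offsetof(struct packet, length);
}

/* Byte offset of `payload`, obtained the same way. */
size_t packet_payload_offset(void)
{
    return offsetof(struct packet, payload);
}

/* Sanity check: the first member of a struct always sits at offset 0. */
void check_packet_layout(void)
{
    assert(offsetof(struct packet, tag) == 0);
}
```

The exact values of the other offsets depend on the platform's padding rules, so nothing here assumes specific numbers.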
To answer the last part of the question, the code is not portable.
The result of subtracting two pointers is defined and portable only if the two pointers point to objects in the same array or point to one past the last object of the array (7.6.2 Additive Operators, H&S Fifth Edition).
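For contrast, here is a sketch of a subtraction that stays inside that rule: both pointers are derived from one real object, viewed as an array of char. The struct record and the function names are hypothetical, chosen just for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only for illustration. */
struct record {
    char flag;
    long count;
};

/* Computes the offset of `count` from an actual object. Both pointers
   point into the bytes of the same object `r`, so the subtraction does
   not involve a null pointer or two unrelated objects. */
size_t count_offset_from_object(void)
{
    struct record r = { 0, 0 };
    const char *base   = (const char *)&r;
    const char *member = (const char *)&r.count;
    return (size_t)(member - base);
}

/* The result should agree with the standard macro. */
void check_count_offset(void)
{
    assert(count_offset_from_object() == offsetof(struct record, count));
}
```

The price is that you need an object to measure, which is exactly what the null-pointer version of the macro tries to avoid.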
At no point in the above code is anything dereferenced. A dereference occurs when the * or -> is used on an address value to find the referenced value. The only use of * above is in a type declaration for the purpose of casting.
The -> operator is used above, but it's not used to access the value. Instead it's used to grab the address of the value. Here is a non-macro code sample that should make it a bit clearer:
The second line does not actually cause a dereference (implementation dependent). It simply returns the address of SomeIntMember within the pSomeType value.
What you see is a lot of casting between arbitrary types and char pointers. The reason for char is that it's one of the only types (perhaps the only type) in the C89 standard which has an explicit size. The size is 1. By ensuring the size is one, the above code can do the evil magic of calculating the true offset of the value.
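The idea of taking a member's address without reading the member can be sketched roughly like this. Only the names SomeType, pSomeType and SomeIntMember come from the answer; the struct layout, SomeCharMember and both function names are assumptions made for illustration, and this is not necessarily the sample the answer had in mind.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout; the answer only names SomeIntMember. */
typedef struct SomeType {
    char SomeCharMember;
    int  SomeIntMember;
} SomeType;

/* Returns the address of the member. The int stored there is never read;
   the expression only computes where it lives. */
int *AddressOfSomeIntMember(SomeType *pSomeType)
{
    assert(pSomeType != NULL);
    return &(pSomeType->SomeIntMember);
}

/* Measures the distance from the start of the struct in char units.
   sizeof(char) is 1 by definition, so the result is a byte count. */
size_t MemberDistanceInBytes(SomeType *pSomeType)
{
    return (size_t)((char *)AddressOfSomeIntMember(pSomeType)
                    - (char *)pSomeType);
}
```

Here pSomeType points at a real object, so nothing questionable happens; the offsetof macro in the question plays the same trick with a null pointer instead.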
Listing 1: A representative set of offsetof() macro definitions
The various operators within the macro are evaluated in an order such that the following steps are performed:
1. ((s *)0) takes the integer zero and casts it as a pointer to s.
2. ((s *)0)->m dereferences that pointer to point to structure member m.
3. &(((s *)0)->m) computes the address of m.
4. (size_t)&(((s *)0)->m) casts the result to an appropriate data type.
By definition, the structure itself resides at address 0. It follows that the address of the field pointed to (Step 3 above) must be the offset, in bytes, from the start of the structure.
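The four steps above can be put together and compared against the compiler's own offsetof. This is only a sketch: NAIVE_OFFSETOF, struct sample and the function names are hypothetical, and the macro is not strictly conforming C. It assumes a compiler that folds the expression into a constant without ever touching address 0, which is common but not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro mirroring steps 1 to 4 in the text.
   Assumption: the compiler computes the address arithmetically
   and never accesses memory at address 0. */
#define NAIVE_OFFSETOF(s, m) ((size_t)&(((s *)0)->m))

/* Hypothetical struct, used only for illustration. */
struct sample {
    char   a;
    short  b;
    double c;
};

/* Offset of `c` according to the hand-rolled macro. */
size_t naive_offset_of_c(void)
{
    return NAIVE_OFFSETOF(struct sample, c);
}

/* Offset of `c` according to the standard macro from <stddef.h>. */
size_t standard_offset_of_c(void)
{
    return offsetof(struct sample, c);
}

/* On implementations where the naive macro works, both must agree. */
void check_offsets_agree(void)
{
    assert(naive_offset_of_c() == standard_offset_of_c());
}
```

In real code there is no reason to prefer the hand-rolled macro; it is shown here only to make the evaluation order concrete.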
Although that is a typical implementation of offsetof, it is not mandated by the standard, which just says:
Read P J Plauger's "The Standard C Library" for a discussion of it and the other items in <stddef.h> which are all border-line features that could (should?) be in the language proper, and which might require special compiler support.
It's of historic interest only, but I used an early ANSI C compiler on 386/IX (see, I told you of historic interest, circa 1990) that crashed on that version of offsetof but worked when I revised it to:
That was a compiler bug of sorts, not least because the header was distributed with the compiler and didn't work.
It doesn't segfault because you're not dereferencing it. The pointer address is being used as a number that's subtracted from another number, not used to address memory operations.