Example from MSVC's implementation:
```cpp
#define offsetof(s,m) \
    (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
//                                                   ^^^^^^^^^^^
```
As can be seen, it dereferences a null pointer, which normally invokes undefined behaviour. Is this an exception to the rule or what is going on?
When the C Standard specifies that certain actions invoke Undefined Behavior, that has not generally meant that such actions are forbidden, but rather that implementations are free to specify the consequent behaviors, or not, as they see fit. Consequently, implementations are free to perform such actions in cases where the Standard requires defined behavior, if and only if they can guarantee that the resulting behavior will be consistent with what the Standard requires. Consider, for example, the following implementation of `strcpy`:
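A minimal sketch of the kind of implementation being described, under stated assumptions: the name `offset_strcpy` is hypothetical, and the pointer subtraction is only defined by the Standard when both pointers refer to the same array, which is why the usage helper keeps source and destination inside one buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical name; a sketch, not any vendor's actual strcpy.
// Copies by walking only `src` and reaching the destination through a
// fixed pointer difference. The subtraction below is Undefined Behavior
// per the Standard unless both pointers refer to the same array.
char* offset_strcpy(char* dest, const char* src) {
    const std::ptrdiff_t diff = dest - src;
    char ch;
    do {
        ch = *src;
        // src + diff lands on the matching position in the destination.
        *(const_cast<char*>(src) + diff) = ch;
        ++src;
    } while (ch != '\0');  // the terminator is copied before the loop ends
    return dest;
}

// Usage that stays within defined behavior: source and destination are
// non-overlapping regions of one array.
bool demo_same_array() {
    char buf[32] = "hello";
    offset_strcpy(buf + 16, buf);
    return std::strcmp(buf + 16, "hello") == 0;
}
```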
If `src` and `dest` are unrelated pointers, the computation of `dest-src` would yield Undefined Behavior. On some platforms, however, the relation between `char*` and `ptrdiff_t` is such that given any `char* p1, p2`, the computation `p1 + (p2-p1)` will always equal `p2`. On platforms which make that guarantee, the above implementation of `strcpy` would be legitimate (and on some such platforms might be faster than any plausible alternative). On some other platforms, however, such a function might always fail except when both strings are part of the same allocated object.
The same principle applies to the `offsetof` macro. There is no requirement that compilers offer any way to get behavior equivalent to `offsetof` (other than by actually using that macro). If a compiler's model for pointer arithmetic makes it possible to get the required `offsetof` behavior by using the `->` operator on a null pointer, then its `offsetof` macro can do that. If a compiler wouldn't support any efforts to use `->` on something other than a legitimate pointer to an instance of the type, then it may need to define an intrinsic which can compute a field offset and define the `offsetof` macro to use that. What's important is not that the Standard define the behaviors of actions performed using standard-library macros and functions, but rather that the implementation ensures that the behaviors of such macros and functions match requirements.
This is basically equivalent to asking whether this is UB:
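A sketch of the kind of snippet meant here, with hypothetical names (`S`, `address_of_m`, `offset_via_object`): a reference `r` to `const volatile char` is bound to `s->m` and only its address is taken. The helper is exercised with a real object, where it is well defined; passing a null pointer instead is the case under discussion and is deliberately not executed.

```cpp
#include <cassert>
#include <cstddef>

struct S {      // hypothetical type, for illustration only
    int pad;
    int m;
};

// Binds a reference to const volatile char to s->m and returns its address.
// The member's value is never read. With a null `s` this is exactly the
// construct whose status is being asked about.
const volatile char* address_of_m(S* s) {
    const volatile char& r = reinterpret_cast<const volatile char&>(s->m);
    return &r;
}

// With a real object, the distance from the start of the object to the
// member matches what the standard offsetof macro reports.
std::size_t offset_via_object() {
    S obj = {0, 0};
    const volatile char* base = reinterpret_cast<const volatile char*>(&obj);
    return static_cast<std::size_t>(address_of_m(&obj) - base);
}
```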
Clearly no memory access is generated to the target of `r`, because it's `volatile` and the compiler is prohibited from generating spurious accesses to `volatile` variables. But `*s` is not volatile, so the compiler could possibly generate an access to it. Neither the address-of operator nor casting to reference type creates an unevaluated context according to the standard.
So, I don't see any reason for the `volatile`, and I agree with the others that this is undefined behavior according to the standard. Of course, any compiler is permitted to define behavior where the standard leaves it implementation-specified or undefined.
Finally, a note in section `[dcl.ref]` says
The notion of "undefined behavior" is not applicable to the implementation of the Standard Library, regardless of whether it is a macro, a function or anything else.
In the general case, the Standard Library should not be seen as implemented in the C++ (or C) language. That applies to standard header files as well. The Standard Library should conform to its external specification, but everything else is an implementation detail, exempt from any other requirements of the language. The Standard Library should always be thought of as implemented in some "internal" language, which might closely resemble C++ or C, but still is not C++ or C.
In other words, the macro you quoted does not produce undefined behavior, as long as it is specifically the `offsetof` macro defined in the Standard Library. But if you do exactly the same thing in your code (like define your own macro in the very same way), it will indeed result in undefined behavior. "Quod licet Jovi, non licet bovi".
Where the language standard says "undefined behavior", any given compiler can define the behavior. Implementation code in the standard library typically relies on that. So there are two questions:
(1) Is the code UB with respect to the C++ standard?
That's a really hard question, because it's a well-known almost-defect that the C++98/03 standard never says right out in normative text that in general it's UB to dereference a null pointer. It is implied by the exception for `typeid`, where it's not UB.
What you can say decidedly is that it's UB to use `offsetof` with a non-POD type.
(2) Is the code UB with respect to the compiler that it's written for?
No, of course not.
A compiler vendor's code for a given compiler can use any feature of that compiler.
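As one illustration of a vendor relying on its own compiler, GCC and Clang provide a `__builtin_offsetof` intrinsic. The sketch below wraps it in a hypothetical macro `SKETCH_OFFSETOF` (not a real library name) and falls back to the standard macro on other compilers; the `Packet` type is likewise made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical macro name, for illustration only. User code should simply
// use the standard offsetof from <cstddef>.
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_OFFSETOF(type, member) __builtin_offsetof(type, member)
#else
#  define SKETCH_OFFSETOF(type, member) offsetof(type, member)
#endif

struct Packet {     // hypothetical standard-layout type
    char   tag;
    int    length;
    double payload;
};

// Point (1) above: offsetof is only guaranteed to work for POD types
// (standard-layout types in C++11 and later), so check that first.
static_assert(std::is_standard_layout<Packet>::value,
              "offsetof is only guaranteed for standard-layout types");

// The intrinsic yields a compile-time constant, with no null pointer
// appearing anywhere in the source.
static_assert(SKETCH_OFFSETOF(Packet, tag) == 0,
              "the first member of a standard-layout struct is at offset 0");
static_assert(SKETCH_OFFSETOF(Packet, length) == offsetof(Packet, length),
              "sketch macro agrees with the standard macro");
```

The design point is that the portable guarantee lives in the `offsetof` name, not in whatever the macro expands to on a given compiler.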
Cheers & hth.,
No, this is NOT undefined behaviour. The expression is resolved at runtime.
Note that it is taking the address of the member `m` from a null pointer. It is NOT dereferencing the null pointer.