Suppose I have two files:
==File1==
extern char* foo;
==File2==
double foo;
These two files seem to compile and link just fine with both g++ and clang++ despite the type mismatch. As I understand it the recommended practice is to put the extern declaration in a header which both files include so File2 will throw a redefinition error.
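As a rough illustration of that practice, here is a minimal single-file sketch. The header name foo.h and the helper foo_is_null are hypothetical and only mark where each piece would live. Once the declaration is visible in File2, the mismatched definition no longer compiles.

```cpp
#include <cassert>

// --- contents of a hypothetical shared header, e.g. foo.h ---
extern char* foo;     // the one declaration that every file sees

// --- contents of File2, after including that header ---
// double foo;        // would now be rejected: conflicting declaration of 'foo'
char* foo = nullptr;  // a definition whose type matches the declaration

// Hypothetical helper so the sketch has something observable.
bool foo_is_null() { return foo == nullptr; }
```

Uncommenting the double foo; line should make the compiler report the conflict, which is the diagnostic the two separate files never get.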
My questions are:
- Does this result in undefined behavior according to the C++ standard? If not, what goes in foo in File1?
- Could linkers catch this kind of type mismatch?
Does this result in undefined behavior according to the C++ standard?
Well, the real question is whether this is undefined behavior or whether it is specified by the standard as being ill-formed (in standard parlance). Because, obviously, it is not correct. I have tried to find something from the standard about this, but to no avail. However, in a number of similar situations, e.g., mismatches of decl/def or throwing funky things at the linker (see section 3.5, 7.5, or search for "extern" or "linkage"), the standard generally ends up saying:
The program is ill-formed; no diagnostic required.
So, I would bet it's pretty safe to assume this is the case here too. This would mean that this is erroneous code, worse than "undefined behavior", since UB will often have some kind of reasonable behaviour on a specific implementation (although you shouldn't speculate on what that behaviour would be, and certainly not rely on that speculation). The term "ill-formed" is used very liberally in the standard, and you can more or less infer that it means the code is FUBAR. It also means that the standard does not require the linker to be implemented in a way that lets it catch this kind of error. That's why the program compiles and links without complaint, but hold on to your socks when you run it.
Could linkers catch this kind of type mismatch?
In theory, yes. An implementation could encode (with name-mangling) the type of the variable into its external symbol, and thus be able to either restrict linkage to things whose types match (as it does for overloaded functions), or issue a diagnostic (an error) when it encounters a mismatch in types. I think that the former would be too permissive compared to the standard.
However, none of the compilers that I know of mangle the names of variables, and thus you can assume that such a mismatch is "ill-formed; no diagnostic required".
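To make the idea concrete, here is a toy model rather than a real linker. The functions symbol_key and resolves, and the "name@type" encoding, are invented for illustration and do not correspond to any actual ABI. The sketch only shows how keying symbols by name alone lets a mismatched reference resolve, while keying by name plus type does not.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical symbol key: the plain name, or name plus type when `typed` is true.
std::string symbol_key(const std::string& name, const std::string& type, bool typed) {
    return typed ? name + "@" + type : name;
}

// Returns true when a reference (name, ref_type) resolves against
// a single definition (name, def_type) in this toy symbol table.
bool resolves(const std::string& name, const std::string& ref_type,
              const std::string& def_type, bool typed) {
    std::set<std::string> defined{symbol_key(name, def_type, typed)};
    return defined.count(symbol_key(name, ref_type, typed)) == 1;
}
```

With typed set to false, resolves("foo", "char*", "double", false) succeeds, which mirrors what happens with the two files in the question. With typed set to true, the same lookup fails and could be reported as an unresolved symbol.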
Does this result in undefined behavior according to the C++ standard?
Yes, definitely.
If not what goes in foo in File1?
Uhm, it will have the value zero, since double foo; is not set to anything. But since you can't rely on the value zero for a double actually having the same representation as NULL for a pointer, there is no telling what happens if you try to somehow compare foo as a pointer with foo as a double.
Naturally, if the pointer foo is assigned, e.g. foo = static_cast<char*>(malloc(100));, then the double foo will contain the bits of the pointer, which is most likely not a meaningful floating-point number. On a typical 64-bit system the top several bits of a pointer will most likely be zero. In an IEEE 754 double that makes the exponent field zero, so the value reads as a tiny subnormal number rather than anything resembling the pointer. It does, however, depend on the internal floating-point format and on the actual value of the pointer.
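A small sketch of that reinterpretation, assuming a 64-bit IEEE 754 double (the static_asserts check this). The helpers bits_as_double and reads_as_subnormal are made up for the example, and the pointer-like constant used in the note below is an arbitrary value, not a real address.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t), "sketch assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559, "sketch assumes IEEE 754 doubles");

// Copy the raw bits into a double; memcpy avoids the undefined
// behaviour of type-punning through a pointer cast.
double bits_as_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// True when the bit pattern reads as a subnormal (tiny, non-zero) double.
bool reads_as_subnormal(std::uint64_t bits) {
    return std::fpclassify(bits_as_double(bits)) == FP_SUBNORMAL;
}
```

Under these assumptions, a pattern such as 0x00007ffd12345678 has a zero exponent field and a non-zero mantissa, so reads_as_subnormal returns true for it, while an all-zero pattern reads as 0.0.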
Could linkers catch this kind of type mismatch?
No, not with g++ or clang++. Under the Itanium C++ ABI that they typically use, the names of variables are not "mangled" for type, even in C++. Only functions have their parameter types encoded into the symbol name.
Technically, the linker or some compiler component COULD track what type the symbol represents, and then give an error or warning. But there is no requirement from the standard to do so. You are required to do the right thing.