Am I right in saying that linkers make no function parameter checks? They do not check the number or types of arguments in function calls, nor do they check the types of global data references. Is this true for all linkers?
I'm using Clang targeting Linux on x86-64. Does the linker check that references are in the right segment? Or is an external reference in effect just a void * as far as the linker is concerned?
I'm coming from a high-level language background (C# and Scala), so this may seem obvious to those who have immersed themselves in the low-level world. I've written a couple of my functions (system calls) in assembler, and I noticed there were no parameter prototypes for external functions in the assembler.
Context: I'm actually writing a compiler. For the moment I'm targeting preprocessed C .i files with assembler functions for system calls, but the alternatives are C++, assembler or even machine code, so I'm trying to weigh the costs and benefits, particularly the type checking, of the assembler / compiler / linker I can use to check the correctness of my own programme and its function prototype generation.
As @Yakk explained, C++ functions can be overloaded based on their parameters, so the compiler generates mangled function names that encode the parameter types. The linker mostly just checks symbol names and sizes, but because mangling gives functions with different parameter lists different names, a call with mismatched parameters refers to a symbol that doesn't exist and won't link.
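To make the mangling concrete, here is a minimal sketch, assuming the Itanium C++ ABI that Clang and GCC follow on x86-64 Linux. The function names kind and kind_c are made up for illustration; the symbol names in the comments are what that ABI's mangling rules produce for these declarations.

```cpp
// Two overloads of the same name. The parameter type is encoded in the
// linker symbol, so the linker sees two unrelated names.
int kind(int)    { return 1; }   // symbol under the Itanium ABI: _Z4kindi
int kind(double) { return 2; }   // symbol under the Itanium ABI: _Z4kindd

// With C linkage there is no mangling: the symbol is just "kind_c", and
// nothing about the parameter list survives into the object file.
extern "C" int kind_c(int) { return 3; }
```

Running nm on the resulting object file should list the three symbols, and nm -C shows them demangled. Note that the int return type appears in none of them.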
Function return types aren't part of the mangling (because overloading on return type isn't legal), so if you define int test() in one translation unit and declare and call it as float test() in another, the linker won't catch it, and you'll get bad results.
Similarly, the types of global variables (and static members of classes and so on) aren't checked by the linker, so if you declare extern int test; in one translation unit and define float test; in another, you'll get bad results.
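The real mismatch needs two translation units, so the following is only a single-file simulation of what the reader of the mistyped variable ends up seeing. It assumes a 32-bit int and an IEEE-754 32-bit float, as on x86-64 Linux; test_definition and read_as_int are hypothetical names.

```cpp
#include <cstring>

// Stand-in for the definition in one translation unit: float test = 1.0f;
float test_definition = 1.0f;

// Stand-in for code in another translation unit that declared
// "extern int test;" and therefore interprets the same bytes as an int.
int read_as_int() {
    static_assert(sizeof(int) == sizeof(float),
                  "sketch assumes int and float have the same size");
    int value;
    std::memcpy(&value, &test_definition, sizeof value);  // reinterpret the bytes
    return value;
}
```

Under those assumptions read_as_int() returns 0x3F800000 (1065353216), the bit pattern of 1.0f, rather than 1. The linker only matched the name, so nothing flagged the difference.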
The linker can, in some circumstances, compare the size of a symbol in two different translation units and can catch a few problems in this way.
In practice, this is rarely an issue in normal C++ development, because whenever a function or variable or class is needed by more than one translation unit, you'll declare it in a header file that's included in both translation units, and the compiler will catch any errors before the linker even runs. (One instance where it can be an issue is if you're using an external, binary library and the header files you have for it don't match the library.)
In C++, all compilers implement some form of name mangling to distinguish overloaded functions; however, because return types are (usually) not included in the mangled name, the same issue can still arise there.
In C you are correct: the linker can't check, but this is really not as serious an issue as you might think. Remember that the compiler has already checked that the calls to a function match the headers provided, so the only way to cause a problem is to have two different versions of the header file compiled into two different C files that are later linked.
This is kind of hard to do accidentally (although if you manage it, you might end up with some really subtle errors).
Many linkers include features to provide some level of type checking, but the details vary. Some compilers will prefix the names of functions that use one calling convention with underscores, but omit the underscores from the names of functions that use a different calling convention; if one translation unit declares a function using one convention, but the actual function is defined using another, the program will be rejected at link time.
Some platforms (e.g. HiTech C for the PIC) allow a compiler or an assembly-language program to specify a 16-bit value when declaring or referencing a symbol, and will squawk if the point of reference supplies a value that doesn't match the definition. The C compiler generates for each function a hash value based on the combination of parameter types and return types, and the linker will squawk if an attempt is made to call a function whose signature-hash value differs at the definition and call site.
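The signature-hash idea can be sketched roughly as follows. This is not the HiTech algorithm, which isn't described here; the hash function, the signature strings such as "v(i,f)", and the names signature_hash and signatures_match are all assumptions made up for illustration.

```cpp
#include <cstdint>

// Hypothetical 16-bit hash of a textual signature, e.g. "v(i,f)" for a
// function returning void and taking (int, float). Arithmetic wraps
// modulo 65536 because the result is truncated to 16 bits each step.
std::uint16_t signature_hash(const char* signature) {
    std::uint16_t h = 0;
    for (const char* p = signature; *p != '\0'; ++p) {
        h = static_cast<std::uint16_t>(h * 31u + static_cast<unsigned char>(*p));
    }
    return h;
}

// What a checking linker would do at each reference: compare the hash
// recorded with the definition against the hash recorded at the call site.
bool signatures_match(const char* definition, const char* reference) {
    return signature_hash(definition) == signature_hash(reference);
}
```

For example, signatures_match("v(i,f)", "v(i,f)") is true, while signatures_match("v(i,f)", "v(i,i)") is false. With only 16 bits, two different signatures can occasionally collide, so a scheme like this catches most mismatches rather than all of them.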