Today, while working with one custom library, I found a strange behavior.
A static library code contained a debug main()
function. It wasn't inside a #define
flag. So it is present in library also. And it is used link to another program which contained the real main()
.
When both of them are linked together, the linker didn't throw a multiple declaration error for main()
. I was wondering how this could happen.
To make it simple, I have created a sample program which simulated the same behavior:
$ cat prog.c
#include <stdio.h>
int main()
{
printf("Main in prog.c\n");
}
$ cat static.c
#include <stdio.h>
int main()
{
printf("Main in static.c\n");
}
$ gcc -c static.c
$ ar rcs libstatic.a static.o
$ gcc prog.c -L. -lstatic -o 2main
$ gcc -L. -lstatic -o 1main
$ ./2main
Main in prog.c
$ ./1main
Main in static.c
How does the "2main" binary find which main
to execute?
But compiling both of them together gives a multiple declaration error:
$ gcc prog.c static.o
static.o: In function `main':
static.c:(.text+0x0): multiple definition of `main'
/tmp/ccrFqgkh.o:prog.c:(.text+0x0): first defined here
collect2: ld returned 1 exit status
Can anyone please explain this behavior?
Quoting ld(1):
When linking 2main, main symbol gets resolved before ld reaches -lstatic, because ld picks it up from prog.o.
When linking 1main, you do have undefined main by the time it gets to -lstatic, so it searches the archive for main.
This logic only applies to archives (static libraries), not regular objects. When you link prog.o and static.o, all symbols from both objects are included unconditionally, so you get a duplicate definition error.
After the linker loads any object files, it searches libraries for undefined symbols. If there are none, then no libraries need to be read. Since main has been defined, even if it finds a main in every library, there is no reason to load a second one.
Linkers have dramatically different behaviors, however. For example, if your library included an object file with both main () and foo () in it, and foo was undefined, you would very likely get an error for a multiply defined symbol main ().
Modern (tautological) linkers will omit global symbols from objects that are unreachable - e.g. AIX. Old style linkers like those found on Solaris, and Linux systems still behave like the unix linkers from the 1970s, loading all of the symbols from an object module, reachable or not. This can be a source of horrible bloat as well as excessive link times.
Also characteristic of *nix linkers is that they effectively search a library only once for each time it is listed. This puts a demand on the programmer to order the libraries on the command line to a linker or in a make file, in addition to writing a program. Not requiring an ordered listing of libraries is not modern. Older operating systems often had linkers that would search all libraries repeatedly until a pass failed to resolve a symbol.
When you link a static library (.a), the linker only searches the archive if there were any undefined symbols tracked so far. Otherwise, it doesn't look at the archive at all. So your
2main
case, it never looks at the archive as it doesn't have any undefined symbols for making the translation unit.If you include a simple function in
static.c
:and call it from
prog.c
, then linker will be forced to look at the archive to find the symbolfun
and you'll get the same multiple main definition error as linker would find the duplicate symbolmain
now.When you directly compile the object files(as in
gcc a.o b.o
), the linker doesn't have any role here and all the symbols are included to make a single binary and obviously duplicate symbols are there.The bottom line is that linker looks at the archive only if there are missing symbols. Otherwise, it's as good as not linking with any libraries.