How does a standard library like libc.a (a static library), which is included using #include <stdio.h> in our main.c, differ from a user-defined header file (cube.h) included in main.c together with its implementation file (cube.c) in C?
I mean, both are header files, but one's implementation is a static library (.a) and the other's is a source file (.c).
You would have the definition (implementation) in, say, cube.c
#include "cube.h"
int cube( int x ) {
    return x * x * x;
}
Then we'll put the function declaration in another file. By convention, this is done in a header file, cube.h in this case.
int cube( int x );
We can now call the function from somewhere else, main.c for instance, by using the #include directive (which is handled by the C preprocessor).
#include "cube.h"
#include <stdio.h>
int main() {
    int c = cube( 10 );
    printf("%d", c);
    ...
}
Also, if I add include guards to cube.h, what happens when I include cube.h in both main.c and cube.c? Where will it get included?
A programming language is not the same as its implementation.
A programming language is a specification (written on paper; you should read n1570, which practically is the C11 standard); it is not software. The C standard specifies a C standard library and defines the headers to be #include-d. (You could run your C program with a bunch of human slaves and without any computers; that would be very unethical. You could also use some interpreter like Ch and avoid any compiler, object files, or executable files.)
The claim in the question that libc.a "is included using #include <stdio.h>" is utterly wrong (and makes no sense). libc.a does not #include (and is not included by) the <stdio.h> header (i.e. the file /usr/include/stdio.h and other internal headers, e.g. /usr/include/bits/stdio2.h). That inclusion happens when you compile your main.c or cube.c.

In principle, <stdio.h> might not be any file on your computer (e.g. #include <stdio.h> could trigger some magic in your compiler). In practice, the compiler is parsing /usr/include/stdio.h (and other included files) when you #include <stdio.h>.

Some standard headers (notably <setjmp.h>, <stdnoreturn.h>, <stdarg.h>, ...) are specified by the standard but are implemented with the help of special builtins or attributes (that is, "magic" things) of the GCC compiler.

The C standard knows about translation units.
Your GCC compiler processes source files (grossly speaking, implementing translation units) and starts with a preprocessing phase (processing #include and other directives and expanding macros). And gcc runs not only the compiler proper (some cc1) but also the assembler as and the linker ld (read Levine's Linkers and Loaders book for more).

For good reasons, your header file cube.h should practically start with include guards. In your simplistic example they are probably useless (but you should get that habit).

You practically should almost always use gcc -Wall -Wextra -g (to get most warnings and debug info). Read the chapter about Invoking GCC.

You may also pass -v to gcc to understand what programs (e.g. cc1, ld, as) are actually run.

You may pass -H to gcc to understand what source files are included during the preprocessing phase. You can also get the preprocessed form of cube.c as the cube.i file obtained with gcc -C -E cube.c > cube.i and later look into that cube.i file with some editor or pager.

You (or gcc) would need, in your example, to compile cube.c (the translation unit given by that file and every header file it is #include-ing) into the cube.o object file (assuming a Linux system). You would also compile main.c into main.o. At last gcc would link cube.o, main.o, some startup files (read about crt0) and the libc.so shared library (implementing the POSIX C standard library specification and a bit more) to produce an executable. Relocatable object files, shared libraries (and static libraries, if you use some) and executables use the ELF file format on Linux.

If you code a C program with several source files (and translation units) you practically should use a build automation tool like GNU make.
Your main.c and cube.c should be two different translation units, and you would compile them in several steps. First you compile main.c into main.o using
gcc -Wall -Wextra -g -c main.c
and the above command produces a main.o object file (with the help of cc1 and as).

Then you compile (another translation unit) cube.c using
gcc -Wall -Wextra -g -c cube.c
hence obtaining cube.o (notice that adding include guards in your cube.h doesn't change the fact that it would be read twice, once when compiling cube.c and the other time when compiling main.c).

At last you link both object files into the yourprog executable using, for example,
gcc -g cube.o main.o -o yourprog
(I invite you to try all these commands, and also to try them with gcc -v instead of gcc above).

Notice that gcc -Wall -Wextra -g cube.c main.c -o yourprog is running all the steps above (check with gcc -v). You really should write a Makefile to avoid typing all these commands (and just compile using make, or even better make -j to run compilation in parallel).

Finally you can run your executable using ./yourprog (but read about PATH), but you should learn how to use gdb and try gdb ./yourprog.

As for where cube.h gets included: it will get included in both translation units, once when running gcc -Wall -Wextra -g -c main.c and another time when running gcc -Wall -Wextra -g -c cube.c. Notice that object files (cube.o and main.o) don't contain included headers. Their debug information (in DWARF format) retains that inclusion (e.g. the included path, not the content of the header file).

BTW, look into existing free software projects (and study some of their source code, at least for inspiration). You might look into GNU glibc or musl-libc to understand what a C standard library really contains on Linux (it is built above system calls, listed in syscalls(2), provided and implemented by the Linux kernel). For example printf would ultimately sometimes use write(2) but it is buffering (see fflush(3)).

PS. Perhaps you dream of programming languages (like OCaml, Go, ...) knowing about modules. C is not one.
TL;DR: the most crucial difference between the C standard library and your library function is that the compiler might intimately know what the standard library functions do without seeing their definition.
First of all, there are 2 kinds of libraries:
1. The C standard library (and possibly other libraries that are part of the C implementation, like libgcc)
2. Any other libraries, which includes all those other libraries in /usr/lib, /lib, etc., or those in your project.

The most crucial difference between a library in category 1 and a library in category 2 is that the compiler is allowed to assume that every identifier you use from a category 1 library behaves as specified in the standard, and it can use this fact to optimize things as it sees fit, even without actually linking against the relevant routine from the standard library or executing it at runtime. Look at this example:
We compile it without -lm and run it, and the correct result is printed out.
So what happens when we ask the user for the number instead, and then compile the program the same way?
Surprise, it doesn't link. That is because sqrt is in the math library (-lm) and you need to link against it to get the definition. But how did it work in the first place? Because the C compiler is free to assume that any function from the standard library behaves as written in the standard, so it can optimize all invocations of it out, even when we weren't using any -O switches.

Notice that it isn't even necessary to include the header. C11 7.1.4p2 allows this: a library function that can be declared without reference to any type defined in a header may be declared and used without including its associated header.

Therefore, in a program that supplies its own prototype for sqrt instead of including <math.h>, the compiler can still assume that sqrt is the one from the standard library, and the behaviour is still conforming.

If you drop the prototype for sqrt and compile the program, a conforming C99 or C11 compiler must diagnose a constraint violation for the implicit function declaration. The program is now an invalid program, but it still compiles with GCC versions that only warn here (the C standard allows that too; GCC 14 and later reject it by default). GCC still calculates sqrt(4) at compilation time. Notice that we use int here instead of double, so for an ordinary function it wouldn't even work at runtime without a proper declaration, because without a prototype the compiler doesn't know that the int argument must be converted to a double. But it still works.

This is because an implicit function declaration is one with external linkage, and the C standard says (C11 7.1.3) that all identifiers with external linkage in the library subclauses are always reserved for use as identifiers with external linkage,
and Appendix J.2 explicitly lists as undefined behaviour the case where a program declares or defines a reserved identifier other than as allowed by 7.1.4.
I.e. if the program did actually have its own sqrt, then the behaviour is simply undefined, because the compiler can assume that sqrt is the standard-conforming one.