Function prototype in header file doesn't match

Posted 2019-04-22 03:56

Question:

(I found this question, which is similar but not a duplicate: How to check validity of header file in C programming language)

I have a function implementation, and a non-matching prototype (same name, different types) which is in a header file. The header file is included by a C file that uses the function, but is not included in the file that defines the function.

Here is a minimal test case:

header.h:

void foo(int bar);

File1.c:

#include "header.h"
int main (int argc, char * argv[])
{
    int x = 1;
    foo(x);
    return 0;
}

File2.c:

#include <stdio.h>

typedef struct {
    int x;
    int y;
} t_struct;

void foo (t_struct *p_bar)
{
    printf("%x %x\n", p_bar->x, p_bar->y);
}

I can compile this with VS 2010 with no errors or warnings, but unsurprisingly it segfaults when I run it.

  • The compiler is fine with it (this I understand)
  • The linker did not catch it (this I was slightly surprised by)
  • The static analysis tool (Coverity) did not catch it (this I was very surprised by).

How can I catch these kinds of errors?

[Edit: I realise if I #include "header.h" in file2.c as well, the compiler will complain. But I have an enormous code base and it is not always possible or appropriate to guarantee that all headers where a function is prototyped are included in the implementation files.]

Answer 1:

Have the same header file included in both file1.c and file2.c. This will pretty much prevent a conflicting prototype.
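A minimal sketch of what this buys you, assuming File2.c from the question is changed to include header.h and the prototype is corrected to take a t_struct pointer. Both files are shown back to back as one translation unit; moving the typedef into header.h and returning int (so the result can be checked) are additions for illustration, not part of the original code.

```c
#include <stdio.h>

/* ---- header.h (assumed corrected; the typedef lives here so callers see it) ---- */
typedef struct {
    int x;
    int y;
} t_struct;

int foo(const t_struct *p_bar);   /* returns the number of characters printed */

/* ---- File2.c, which now starts with #include "header.h" ---- */
/* The definition is compiled with the prototype in scope. If the header still
   declared foo(int bar), the compiler would reject this definition with a
   conflicting-types diagnostic instead of letting the mismatch reach run time. */
int foo(const t_struct *p_bar)
{
    return printf("%x %x\n", (unsigned)p_bar->x, (unsigned)p_bar->y);
}
```

A caller would then write something like t_struct s = {1, 2}; foo(&s); and any attempt to pass a plain int is diagnosed when the calling file is compiled.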

Otherwise, such a mistake cannot be detected by the compiler because the source code of the function is not visible to the compiler when it compiles file1.c. Rather, it can only trust the signature that has been given.

At least theoretically, the linker could detect such a mismatch if additional metadata were stored in the object files, but I am not aware whether this is possible in practice.



Answer 2:

Use -Werror-implicit-function-declaration, -Wmissing-prototypes, or the equivalent on one of your supported compilers. The compiler will then either fail or warn if a declaration does not precede the definition of a global function.

Compiling the programs in some form of strict C99 mode should also generate these messages. GCC, ICC, and Clang all support this feature (not sure about MS's C compiler and its current status, as VS 2005 or 2008 was the latest I've used for C).
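As a rough illustration, assuming GCC or Clang and a hypothetical file named area.c: with -Wmissing-prototypes, a non-static function defined without a prototype already in scope draws a warning, which pushes every defining file toward including its own header. The command line in the comment is one plausible invocation (using the -Werror= spelling of the implicit-declaration option), not the only one, and all function names are made up for the example.

```c
/* area.c (hypothetical). One plausible invocation:
 *   gcc -std=c99 -Wall -Wmissing-prototypes \
 *       -Werror=implicit-function-declaration -c area.c
 */

/* Normally these prototypes would come from #include "area.h". */
int rect_area(int width, int height);
int square_area(int side);

/* A prototype precedes the definition, so -Wmissing-prototypes stays quiet
   and the compiler checks the definition against that prototype. */
int rect_area(int width, int height)
{
    return width * height;
}

/* Static functions are private to this file and are not flagged. */
static int square(int side)
{
    return rect_area(side, side);
}

/* If the square_area prototype above were removed, this definition would be
   reported as having no previous prototype under -Wmissing-prototypes. */
int square_area(int side)
{
    return square(side);
}
```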



Answer 3:

You may use the Frama-C static analysis platform available at http://frama-c.com.

On your examples you would get:

$ frama-c 1.c 2.c
[kernel] preprocessing with "gcc -C -E -I.  1.c"
[kernel] preprocessing with "gcc -C -E -I.  2.c"
[kernel] user error: Incompatible declaration for foo:
                     different type constructors: int vs. t_struct *
                     First declaration was at  header.h:1
                     Current declaration is at 2.c:8
[kernel] Frama-C aborted: invalid user input.

Hope this helps!



Answer 4:

It looks like a C compiler cannot do this, because of the way function names are mapped to object-file symbol names (directly, without considering the actual signature).

But it is possible with C++, because C++ uses name mangling that depends on the function signature. So in C++, void foo(int) and void foo(t_struct*) will have different names at link time, and the linker will raise an error about it.

Of course, it will not be easy to switch a huge C codebase to C++. But you can use a relatively simple workaround: add a single .cpp file to your project and include all the C files in it (in practice, generate it with a script).

Taking your example and VS2010, I added TestCpp.cpp to the project:

#include "stdafx.h"

namespace xxx
{
#include "File1.c"
#include "File2.c"
}

The result is linker error LNK2019:

TestCpp.obj : error LNK2019: unresolved external symbol "void __cdecl xxx::foo(int)" (?foo@xxx@@YAXH@Z) referenced in function "int __cdecl xxx::main(int,char * * const)" (?main@xxx@@YAHHQAPAD@Z)
W:\TestProjects\GenericTest\Debug\GenericTest.exe : fatal error LNK1120: 1 unresolved externals

Of course, this will not be so easy for a huge codebase; there may be other problems that lead to compilation errors and cannot be fixed without changing the codebase. You can partially mitigate this by guarding the contents of the .cpp file with a conditional #ifdef and using it only for periodic checks rather than for regular builds.



Answer 5:

Every (non-static) function defined in every foo.c file should have a prototype in the corresponding foo.h file, and foo.c should have #include "foo.h". (main is the only exception.) foo.h should not contain prototypes for any functions not defined in foo.c.

Every function should be prototyped exactly once.

You can have .h files with no corresponding .c files if they don't contain any prototypes. The only .c file without a corresponding .h file should be the one containing main.
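A sketch of that layout, using a hypothetical module named counter. The two files are shown back to back in one block; the module, its function names, and the include guard name COUNTER_H are all assumptions made for the example.

```c
/* ---- counter.h: prototypes for exactly the non-static functions in counter.c ---- */
#ifndef COUNTER_H
#define COUNTER_H

void counter_reset(void);
int  counter_next(void);

#endif /* COUNTER_H */

/* ---- counter.c: starts with #include "counter.h" so definitions are checked ---- */
static int current;      /* file-private state, not exposed in the header */

static int step(void)    /* static helper: deliberately has no prototype in counter.h */
{
    return 1;
}

void counter_reset(void)
{
    current = 0;
}

int counter_next(void)
{
    current += step();
    return current;
}
```

Any other file that calls counter_next includes counter.h and nothing else, so there is only one prototype that could ever disagree with the definition, and the compiler sees both together.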

You already know this, and your problem is that you have a huge code base where this rule has not been followed.

So how do you get from here to there? Here's how I'd probably do it.

Step 1 (requires a single pass over your code base):

  • For each file foo.c, create a file foo.h if it doesn't already exist. Add #include "foo.h" near the top of foo.c. If you have a convention for where .h and .c files should live (either in the same directory or in parallel include and src directories), follow it; if not, try to introduce such a convention.
  • For each function in foo.c, copy its prototype to foo.h if it's not already there. Use copy-and-paste to ensure that everything stays consistent. (Parameter names are optional in prototypes and mandatory in definitions; I suggest keeping the names in both places.)
  • Do a full build and fix any problems that show up.

This won't catch all your problems. You could still have multiple prototypes for some functions. But you'll have caught any cases where two headers have inconsistent prototypes for the same function and both headers are included in the same translation unit.

Once everything builds cleanly, you should have a system that's at least as correct as what you started with.

Step 2:

  • For each file foo.h, delete any prototypes for functions that aren't defined in foo.c.
  • Do a full build and fix any problems that show up. If bar.c calls a function that's defined in foo.c, then bar.c needs a #include "foo.h".

For both of these steps, the "fix any problems that show up" phase is likely to be long and tedious.

If you can't afford to do all this at once, you can probably do a lot of it incrementally. Start with one or a few .c files, clean up their .h files, and remove any extra prototypes declared elsewhere.

Any time you find a case where a call uses an incorrect prototype, try to figure out the circumstances in which that call is executed, and how it causes your application to misbehave. Create a bug report and add a test to your regression test suite (you have one, right?). You can demonstrate to management that the test now passes because of all the work you've done; you really weren't just messing around.

Automated tools that can parse C are likely to be useful. Ira Baxter has some suggestions. ctags may also be useful. Depending on how your code is formatted, you can probably throw together some tools that don't require a full C parser. For example, you might use grep, sed, or perl to extract a list of function definitions from a foo.c file, then manually edit the list to remove false positives.



Answer 6:

It's obvious ("I have a huge code base") you cannot do this by hand.

What you need is an automated tool that can read your source files as the compiler sees them, collect all function prototypes and definitions, and verify that all definitions/prototypes match. I doubt you'll find such a tool lying around.

Of course, this match must check the signature, and this requires something like the compiler's front end to compare the signatures.

Consider

     typedef int T;
     void foo(T x);

in one compilation unit, and

      typedef float T;
      void foo(T x);

in another. You can't just compare the signature "lines" for equality; you need something that can resolve the types when checking.
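The converse also holds: two declarations can differ textually and still be compatible. A small sketch, using a hypothetical function twice, that a C compiler accepts as a single translation unit because both declarations resolve to the same type:

```c
typedef int T;

int twice(T x);      /* spelled with the typedef */
int twice(int x);    /* spelled differently, but the same type once T is resolved */

int twice(int x)
{
    return 2 * x;
}
```

A line-by-line text comparison would flag the two prototypes as different, while the compiler treats them as the same declaration. A checking tool has to side with the compiler in both directions.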

GCCXML may be able to help, if you are using a GCC dialect of C; it extracts top-level declarations from source files as XML chunks. I don't know if it will resolve typedefs, though. You obviously have to build (considerable) support to collect the definitions in a central place (a database) and compare them. Comparing XML documents for equivalence is at least reasonably straightforward, and pretty easy if they are formatted in a regular way. This is likely your easiest bet.
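Once declarations have been extracted, the comparison step itself is simple. The sketch below assumes a hypothetical record format in which each declaration has already been reduced to a function name plus a canonical type string with typedefs resolved; producing those strings is the hard part and is not shown. The struct and function names are invented for this example and do not come from any of the tools mentioned.

```c
#include <stddef.h>
#include <string.h>

/* One extracted declaration (hypothetical format). */
struct decl {
    const char *unit;       /* translation unit it came from */
    const char *name;       /* function name */
    const char *signature;  /* canonical type, typedefs already resolved */
};

/* Count pairs of declarations that share a name but disagree on signature.
   Quadratic, which is fine for a sketch; a real tool would sort or hash by name. */
size_t count_mismatches(const struct decl *decls, size_t n)
{
    size_t mismatches = 0;

    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (strcmp(decls[i].name, decls[j].name) == 0 &&
                strcmp(decls[i].signature, decls[j].signature) != 0) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

With one record for foo from File1.c ("void (int)") and one from File2.c ("void (t_struct *)"), the function reports a single mismatch, which is exactly the bug in the question.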

If that doesn't work, you need something that has a full C front end that you can customize. GCC is famously available, and famously hard to customize. Clang is available, and might be pressed into service for this, but AFAIK only works with GCC dialects.

Our DMS Software Reengineering Toolkit has C front ends (with full preprocessing capability) for many dialects of C (GCC, MS, GreenHills, ...) and builds symbol tables with complete type information. Using DMS you might be able (depending on the real scale of your application) to simply process all the compilation units, and build just the symbol tables for each compilation unit. Checking that symbol table entries "match" (are compatible according to compiler rules including using equivalent typedefs) is built-into the C front ends; all one needs to do is orchestrate the reading, and calling the match logic for all symbol table entries at global scope across the various compilation units.

Whether you do this with GCC, Clang, or DMS, it is a fair amount of work to cobble together a custom tool. So you have to decide how critical your need for fewer surprises is, compared to the energy needed to build such a custom tool.