How to show 'preprocessed' code ignoring i

Posted 2020-04-16 08:31

Question:

I'd like to know if it's possible to output 'preprocessed' code with gcc while 'ignoring' (not expanding) includes.

For example, I have this main.c:

#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);

int main(){
char str[5] = "test"; 
prn(str);
return 0;
}

I run gcc -E main.c -o out.c

I got:

/*
all stdio stuff
*/

int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}

I'd like to output only:

#include <stdio.h>
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}

or, at least, just

int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);
return 0;
}

PS: it would be great if it were possible to expand "local" ("") includes but not "global" (<>) includes.

Answer 1:

I agree with Matteo Italia's comment that if you just prevent the #include directives from being expanded, then the resulting code won't represent what the compiler actually sees, and therefore it will be of limited use in troubleshooting.

Here's an idea to get around that. Add a variable declaration before and after your includes. Any variable that is reasonably unique will do.

int begin_includes_tag;
#include <stdio.h>
... other includes
int end_includes_tag;

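Then you can do:

> gcc -E main.c | sed '/begin_includes_tag/,/end_includes_tag/d' > out.c

The sed command deletes everything from the first tag declaration through the second one, including the two declarations themselves. Note that gcc must write to standard output here: with -o out.c nothing would reach the pipe.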
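
To make the range deletion concrete, here is a minimal Python sketch of what the sed address range does. The function name drop_tagged_region and the sample lines are made up for illustration; only the two tag names come from the answer above.

```python
def drop_tagged_region(lines, begin="begin_includes_tag", end="end_includes_tag"):
    """Yield lines, skipping everything from a line containing `begin`
    through the next line containing `end`, both inclusive.
    This mimics sed '/begin/,/end/d' for plain substring patterns."""
    skipping = False
    for line in lines:
        if not skipping and begin in line:
            skipping = True      # the opening tag line is dropped too
            continue
        if skipping:
            if end in line:
                skipping = False  # the closing tag line is dropped too
            continue
        yield line


# Hand-written stand-in for preprocessor output, not real gcc output.
sample = [
    "int begin_includes_tag;\n",
    "typedef int stands_in_for_stdio;\n",
    "int end_includes_tag;\n",
    "int main(){\n",
    "return 0;\n",
    "}\n",
]
print("".join(drop_tagged_region(sample)), end="")
```

As with sed, if the closing tag never appears, everything after the opening tag is dropped.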



Answer 2:

When cpp expands includes it adds # directives (linemarkers) to trace back errors to the original files.

You can add a post-processing step (trivial to write in any scripting language, or even in C if you feel like it) that parses just the linemarkers and filters out the lines coming from files outside your project directory. Even better, one of the flags (3) marks system header files (anything found through -isystem paths, whether added implicitly by the compiler driver or explicitly), so you can exploit that as well.

For example in Python 3:

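#!/usr/bin/env python3
import sys

# True while the current lines come from a system header.
skip = False
for line in sys.stdin:
    if not skip:
        sys.stdout.write(line)
    if line.startswith("# "):
        # Linemarker format: # linenum "filename" flags
        # (splitting on spaces assumes the file name contains none).
        toks = line.strip().split(" ")
        linenum, filename = toks[1:3]
        flags = toks[3:]
        # Flag 3 marks text that comes from a system header.
        skip = "3" in flags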

Using gcc -E foo.c | ./filter.py I get

# 1 "foo.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 31 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "foo.c"
# 1 "/usr/include/stdio.h" 1 3 4



# 4 "foo.c"
int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
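
The filter above splits linemarkers on spaces, so a file name containing a space would confuse it. A possible variant, sketched under the assumption that a linemarker always looks like # linenum "filename" flags, parses the marker with a regular expression instead. The names parse_linemarker and drop_system_headers are hypothetical, and unlike the script above this version drops the linemarkers themselves as well.

```python
import re

# linenum, quoted file name (with backslash escapes), optional numeric flags
LINEMARKER = re.compile(r'^# (\d+) "((?:[^"\\]|\\.)*)"((?: \d+)*)\s*$')


def parse_linemarker(line):
    """Return (linenum, filename, flags) or None if `line` is not a linemarker."""
    m = LINEMARKER.match(line)
    if m is None:
        return None
    return int(m.group(1)), m.group(2), set(m.group(3).split())


def drop_system_headers(lines):
    """Yield only the lines that do not come from a system header (flag 3)."""
    skip = False
    for line in lines:
        marker = parse_linemarker(line)
        if marker is not None:
            skip = "3" in marker[2]
            continue  # assumption: the markers are not wanted in the output
        if not skip:
            yield line


# Hand-written sample, not real gcc output.
sample = [
    '# 1 "my dir/foo.c"\n',
    '# 1 "/usr/include/stdio.h" 1 3 4\n',
    'typedef int hidden;\n',
    '# 2 "my dir/foo.c" 2\n',
    'int main(){\n',
]
print("".join(drop_system_headers(sample)), end="")
```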


Answer 3:

Protect the #includes from being expanded, run the preprocessor textually, remove the # 1 "<stdin>" etc. junk that the textual preprocessor generates, and re-expose the protected #includes.

This shell function does it:

expand_cpp(){
     sed 's|^\([ \t]*#[ \t]*include\)|magic_fjdsa9f8j932j9\1|' "$@" \
     | cpp | sed 's|^magic_fjdsa9f8j932j9||; /^# [0-9]/d'
}

This works as long as you keep the include word together instead of doing crazy stuff like

#i\
ncl\
u??/
de <iostream>

(above you can see 2 backslash continuation lines + 1 trigraph (??/ == \ ) backslash continuation line).

If you wish, you can protect #ifs #ifdefs #ifndefs #endifs and #elses the same way.

Applied to your example

example.c:

#include <stdio.h>
#define prn(s) printf("this is a macro for printing a string: %s\n", s);

int main(){
char str[5] = "test";
prn(str);
return 0;
}

Running it as expand_cpp < example.c or expand_cpp example.c generates:

#include <stdio.h>


int main(){
char str[5] = "test";
printf("this is a macro for printing a string: %s\n", str);;
return 0;
}
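
For readers who prefer not to rely on sed, the two text-rewriting stages of expand_cpp can be sketched in Python. This is only an illustration: protect and restore are hypothetical names, the directives parameter is an assumed way to cover the #if family mentioned above, and the call to cpp that belongs between the two stages is left out so the snippet runs without a compiler.

```python
import re

TAG = "magic_fjdsa9f8j932j9"  # same arbitrary marker as the shell function


def protect(source, directives=("include",)):
    """Prefix the chosen directives with TAG so cpp treats them as plain text."""
    names = "|".join(re.escape(d) for d in directives)
    pattern = re.compile(r"^([ \t]*#[ \t]*(?:%s)\b)" % names, re.MULTILINE)
    return pattern.sub(TAG + r"\1", source)


def restore(preprocessed):
    """Strip TAG again and delete linemarkers such as # 1 "<stdin>"."""
    out = []
    for line in preprocessed.splitlines(keepends=True):
        if line.startswith(TAG):
            line = line[len(TAG):]
        if re.match(r"# [0-9]", line):
            continue
        out.append(line)
    return "".join(out)


example = (
    '#include <stdio.h>\n'
    '  # include "local.h"\n'
    'int main(){\n'
    'return 0;\n'
    '}\n'
)
protected = protect(example)
print(protected, end="")           # what would be piped into cpp
print(restore(protected), end="")  # what comes back once the tags are removed
```

To protect conditionals as well, one could pass something like directives=("include", "if", "ifdef", "ifndef", "else", "endif"). The word boundary in the pattern keeps "if" from also matching "ifdef".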


Answer 4:

You can use -dI to show the #include directives and post-process the preprocessor output.

Assuming the name of your file is foo.c:

SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
    /^# [0-9]* "/ { if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
    { if(show) print; }'

or to suppress all # line_number "file" lines for $SOURCEFILE:

SOURCEFILE=foo.c
gcc -E -dI "$SOURCEFILE" | awk '
    /^# [0-9]* "/ { ignore = 1; if ($3 == "\"'"$SOURCEFILE"'\"") show=1; else show=0; }
    { if(ignore) ignore=0; else if(show) print; }'

Note: The AWK scripts do not work for file names that include whitespace. To handle file names with spaces you could modify the AWK script to compare $0 instead of $3.
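
As one possible way around the whitespace limitation, the same filter can be sketched in Python with a regular expression that reads the quoted file name as a whole. The function name keep_only and its drop_markers switch are assumptions made for this example. With drop_markers=False it behaves like the first AWK script, and with the default like the second.

```python
import re

# Captures the quoted file name of a linemarker, spaces included.
MARKER = re.compile(r'^# \d+ "((?:[^"\\]|\\.)*)"')


def keep_only(lines, source_file, drop_markers=True):
    """Yield the lines of `gcc -E -dI` output that belong to `source_file`."""
    show = False
    for line in lines:
        m = MARKER.match(line)
        if m:
            show = m.group(1) == source_file
            if drop_markers:
                continue
        if show:
            yield line


# Hand-written sample, not real gcc output.
sample = [
    '# 1 "my dir/foo.c"\n',
    '#include <stdio.h>\n',
    '# 1 "/usr/include/stdio.h" 1 3 4\n',
    'typedef int hidden_by_filter;\n',
    '# 2 "my dir/foo.c" 2\n',
    'int main(){\n',
    'return 0;\n',
    '}\n',
]
print("".join(keep_only(sample, "my dir/foo.c")), end="")
```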



Answer 5:

Supposing the file is named c.c:

gcc -E c.c | tail -n +`gcc -E c.c | grep -n -e "#*\"c.c\""  | tail -1 | awk -F: '{print $1}'`

It seems that # <number> "c.c" marks the lines after each #include.

Of course, you can also save the output of gcc -E c.c in a file to avoid running it twice.

The advantage is that you neither modify the source nor remove the #include lines before running gcc -E: the command just drops every line from the top of the output up to the last one produced by an #include, if I am right.
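
The same idea can be sketched in Python so that the preprocessor output only has to be produced once. The function name tail_from_last_marker is made up for this example. It is slightly stricter than the grep pattern above because it only looks at lines that start with "# ", and it returns the input unchanged when no marker is found.

```python
def tail_from_last_marker(lines, source_file):
    """Return the lines starting at the last linemarker that names `source_file`."""
    lines = list(lines)
    needle = '"%s"' % source_file
    last = None
    for i, line in enumerate(lines):
        if line.startswith("# ") and needle in line:
            last = i
    return lines if last is None else lines[last:]


# Hand-written sample, not real gcc output.
sample = [
    '# 1 "c.c"\n',
    '# 1 "/usr/include/stdio.h" 1 3 4\n',
    'typedef int from_header;\n',
    '# 2 "c.c" 2\n',
    'int main(){\n',
    'return 0;\n',
    '}\n',
]
print("".join(tail_from_last_marker(sample, "c.c")), end="")
```

Like the shell pipeline, this keeps the matching linemarker itself as the first output line, and it assumes that all includes sit at the top of the file: any code of your own that precedes the last #include would be dropped as well.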