Filtering directories when parsing cpp files in ge

Posted 2019-07-19 06:25

Question:

I need to write a parser with the Python bindings for Clang that returns all the inclusions in C++ files. I am using something like the following code:

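def _main():
    from clang.cindex import Index
    from optparse import OptionParser

    filter = ['/usr/lib', '/usr/include']  # directories to exclude (not applied yet)
    p = OptionParser()
    (o, a) = p.parse_args()
    index = Index.create()
    # The source file name is passed to clang as part of the argument list.
    t = index.parse(None, a)
    for i in t.get_includes():
        # i.include is a File object; .name is its path
        print(i.include.name)

if __name__ == '__main__':
    _main()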

Now I need to filter out some of the inclusions, namely those under specific directories:

filter=['/usr/lib','/usr/include']

Question 1: How can this filtering be done, and how should my code change?

Question 2: How can I put these filter directories in a config file instead of hardcoding them?

To run the test, you need a cpp file like:

  #include<iostream>
  #include"ex1.h"

  int main(){
      return 0;
  }

and a *.h file (ex1.h):

 #include<QMap>

Run:

./python-clang.py ex1.cpp

Sample of the results:

 /usr/include/pthread.h
 /usr/include/sched.h
 /usr/include/time.h
 /usr/include/bits/sched.h
 /usr/include/time.h
 /usr/include/bits/time.h
 /usr/include/signal.h
 /usr/include/bits/sigset.h
 /usr/include/bits/pthreadtypes.h
 /usr/include/bits/wordsize.h
 /usr/include/bits/setjmp.h
 /usr/include/bits/wordsize.h
 /usr/include/bits/wordsize.h
 /usr/include/unistd.h
 /usr/include/bits/posix_opt.h
 /usr/include/bits/environments.h
 /usr/include/bits/wordsize.h
 /usr/include/bits/confname.h
 /usr/include/getopt.h
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/i486-linux-gnu/bits/atomic_word.h
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/bits/locale_classes.h
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/string
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/bits/allocator.h
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/i486-linux-gnu/bits/c++allocator.h
 /usr/lib/gcc/i486-linux-gnu/4.4/../../../../include/c++/4.4/ext/new_allocator.h

Answer 1:

You can do this in your for loop (note that this only skips paths that exactly match an entry in filter):

...
for i in t.get_includes():
    if i.include.name not in filter:
        print(i.include.name)
...

As for the config file containing the exclusions, you could do something like this:

def _main():
    ...
    with open('/path/to/file/ignore.txt') as f:
        # strip the trailing newline from each line
        filter = [line.strip() for line in f]
    ...

Then in ignore.txt:

/usr/lib
/usr/include
...
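
If the ignore file grows, it may help to allow blank lines and comments in it. The following is only a sketch of one way to read such a file: the helper name load_ignore_prefixes and the convention that lines starting with '#' are comments are assumptions made for illustration, not something the original script defines. A temporary file stands in for ignore.txt so the snippet runs on its own.

```python
import os
import tempfile

def load_ignore_prefixes(path):
    """Return the non-empty, non-comment lines of `path` as a tuple of prefixes."""
    prefixes = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines (an empty prefix would match every path)
            # and lines starting with '#', treated here as comments.
            if line and not line.startswith('#'):
                prefixes.append(line)
    return tuple(prefixes)

# Small demonstration using a temporary file in place of ignore.txt.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('# system headers\n/usr/lib\n\n/usr/include\n')
ignore = load_ignore_prefixes(tmp.name)
os.remove(tmp.name)
print(ignore)
```

Returning a tuple rather than a list means the result can be passed directly to str.startswith.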

UPDATE

Based on your comments and the edits to your question:

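def _main():
    ...
    with open('/path/to/file/ignore.txt') as f:
        # startswith() accepts a str or a tuple of str, so build a tuple;
        # blank lines are skipped because an empty prefix matches everything
        ignore = tuple(l.strip() for l in f if l.strip())

    for i in t.get_includes():
        # i.include is a File object; compare against its path
        if not i.include.name.startswith(ignore):
            print(i.include.name)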
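
A plain prefix test has two pitfalls with paths like the ones in the sample output: '/usr/lib' also matches '/usr/lib64', and a path such as /usr/lib/gcc/.../../../../include/c++/4.4/string starts with '/usr/lib' although it actually points into /usr/include. The sketch below shows one possible way to handle both, assuming POSIX-style paths. The helper name is_ignored is hypothetical, and the normalization is purely textual (it does not resolve symlinks).

```python
import posixpath

def is_ignored(path, prefixes):
    """Return True if `path` lies inside one of the directories in `prefixes`."""
    # Collapse '.' and '..' segments so that
    # /usr/lib/gcc/x/../../../include/foo.h is seen as /usr/include/foo.h.
    path = posixpath.normpath(path)
    for prefix in prefixes:
        prefix = posixpath.normpath(prefix)
        # Match the directory itself or anything below it, but not a
        # sibling such as /usr/lib64 when the prefix is /usr/lib.
        if path == prefix or path.startswith(prefix.rstrip('/') + '/'):
            return True
    return False

ignore = ('/usr/lib', '/usr/include')
print(is_ignored('/usr/include/time.h', ignore))
print(is_ignored('ex1.h', ignore))
```

In the loop over t.get_includes(), the path passed in would be i.include.name.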

A couple of things to note here:

  1. I've changed the variable name filter to ignore, since filter is a Python built-in.
  2. Each line of ignore.txt has its trailing \n stripped, and the lines are collected into a tuple instead of a list, because startswith accepts a tuple of prefixes but not a list.
  3. You could also use a list comprehension to put the filtered results into a list to be used later:

results = [i.include.name for i in t.get_includes() if not i.include.name.startswith(ignore)]
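
The sample output in the question lists some headers more than once (time.h and wordsize.h, for example), so the collected results may contain duplicates. As a rough sketch, the generator below keeps each path once, in first-seen order, while skipping ignored prefixes. The name unique_kept and the sample paths (including /opt/qt/include/QMap) are made up for illustration; in the real script the strings would come from i.include.name, and ignore must be a tuple.

```python
def unique_kept(paths, ignore):
    """Yield each path once, in first-seen order, skipping ignored prefixes."""
    seen = set()
    for path in paths:
        # `ignore` is a tuple of prefixes, as str.startswith requires.
        if path in seen or path.startswith(ignore):
            continue
        seen.add(path)
        yield path

# Hypothetical include paths standing in for the parser output.
paths = [
    '/usr/include/time.h',
    './ex1.h',
    '/usr/include/time.h',
    '/opt/qt/include/QMap',
    './ex1.h',
]
ignore = ('/usr/lib', '/usr/include')
results = list(unique_kept(paths, ignore))
print(results)
```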