Python: code statistics

Posted 2020-06-03 05:47

Question:

Do you know if there's a Python library that generates statistics about code? I'm thinking of pointing it at a package and getting the number of classes, functions, methods, docblock lines, etc.

It could eventually include useless stuff like number of lambdas or other crazy statistics, just for fun.

Answer 1:

You can have a look at Pymetrics, or check the other tools enumerated there.



Answer 2:

People don't generally make packages out of things that can be done in a dozen or two lines of code. The following walks every .py file under a directory, parses it with the ast module, and returns a dictionary mapping each AST node type to the number of times it occurs in the source. The examples after it print the number of def and class statements (note that ast.FunctionDef does not include async def, which is ast.AsyncFunctionDef).

import collections
import os
import ast

def analyze(packagedir):
    stats = collections.defaultdict(int)
    for (dirpath, dirnames, filenames) in os.walk(packagedir):
        for filename in filenames:
            if not filename.endswith('.py'):
                continue

            filename = os.path.join(dirpath, filename)

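            # Read as bytes so ast.parse honours any encoding declaration,
            # and close the file promptly instead of leaking the handle.
            with open(filename, 'rb') as source_file:
                syntax_tree = ast.parse(source_file.read(), filename)
            for node in ast.walk(syntax_tree):
                stats[type(node)] += 1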

    return stats

# Walk the directory once and reuse the result for both lookups.
stats = analyze('.')
print("Number of def statements:", stats[ast.FunctionDef])
print("Number of class statements:", stats[ast.ClassDef])
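
For the other figures the question asks about (methods, docstring lines, lambdas), a minimal sketch along the same lines might look like the following. The name `summarize` and the rule that a "method" is any function defined directly in a class body are assumptions made for illustration, not part of the answer above.

```python
import ast

def summarize(source):
    """Count a few constructs in one module's source text."""
    tree = ast.parse(source)
    counts = {"classes": 0, "functions": 0, "methods": 0,
              "lambdas": 0, "docstring_lines": 0}
    func_types = (ast.FunctionDef, ast.AsyncFunctionDef)
    documentable = (ast.Module, ast.ClassDef) + func_types
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            counts["classes"] += 1
            # Assumption: functions directly in a class body are methods.
            counts["methods"] += sum(
                isinstance(child, func_types) for child in node.body)
        elif isinstance(node, func_types):
            counts["functions"] += 1
        elif isinstance(node, ast.Lambda):
            counts["lambdas"] += 1
        if isinstance(node, documentable):
            doc = ast.get_docstring(node)
            if doc:
                counts["docstring_lines"] += len(doc.splitlines())
    # The loop counted methods as functions too; leave only plain functions.
    counts["functions"] -= counts["methods"]
    return counts
```

Calling `summarize(open(path).read())` on each file found by os.walk and summing the dictionaries would give per-package totals.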


Answer 3:

Maybe Tahar can help: it displays statistics about how long each function, method, class and module is (in lines of code). However, since it uses the inspect module, which works on imported objects, it may behave unexpectedly if one of the modules it analyzes launches a GUI or has some other side effect on import.

I'll switch to using AST someday, although I don't know whether AST can provide something similar to inspect.getsourcelines().
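
On Python 3.8 and later, AST nodes carry both `lineno` and `end_lineno`, so the length of a definition can probably be computed without importing the code. A minimal sketch under that assumption (the function name `definition_lengths` is hypothetical, and definitions sharing a name overwrite each other in this simple version):

```python
import ast

def definition_lengths(source):
    """Map each function/class name to its length in source lines."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    lengths = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, kinds):
            # end_lineno is only populated on Python 3.8 and later.
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths
```

If the source text itself is needed rather than just the line count, `ast.get_source_segment(source, node)` (also Python 3.8+) returns the code for a node.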

(EDIT)

Mergou (the rewrite of Tahar using the tokenize module) is in alpha. Here's a video of it in action: http://www.youtube.com/watch?v=PI0iBZmInFU&feature=youtu.be
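
One thing the tokenize module offers that ast does not is access to comments, which the parser discards. The following is only a sketch of that idea, not Mergou's actual code; the function name `count_comments` is made up for illustration.

```python
import io
import tokenize

def count_comments(source):
    """Count comment tokens in a string of Python source."""
    readline = io.StringIO(source).readline
    return sum(tok.type == tokenize.COMMENT
               for tok in tokenize.generate_tokens(readline))
```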