I have a large source repository split across multiple projects. I would like to produce a report about the health of the source code, identifying problem areas that need to be addressed.
Specifically, I'd like to call out routines with a high cyclomatic complexity, identify repetition, and perhaps run some lint-like static analysis to spot suspicious (and thus likely erroneous) constructs.
How might I go about constructing such a report?
Pycana works like a charm when you need to understand a new project!
See how it works: http://pycana.sourceforge.net/
Example output: http://pycana.sourceforge.net/relations.png
For measuring cyclomatic complexity, there's a nice tool available at traceback.org. The page also gives a good overview of how to interpret the results.
+1 for pylint. It is great at verifying adherence to coding standards (be it PEP8 or your own organization's variant), which can in the end help to reduce cyclomatic complexity.
Use flake8, which provides pep8, pyflakes, and cyclomatic complexity analysis in one tool.
For static analysis there are pylint and pychecker. Personally I use pylint as it seems to be more comprehensive than pychecker.
For cyclomatic complexity, you can try this Perl program, or this article, which introduces a Python program to do the same.
There is a tool called CloneDigger that helps you find similar code snippets.
Thanks to PyDev, you can integrate pylint into the Eclipse IDE really easily and get a code report each time you save a modified file.