I have legacy C++ code that I'm supposed to remove unused code from. The problem is that the code base is large.
How can I find out which code is never called/never used?
I don't think it can be done automatically.
Even with code coverage tools, you need to provide sufficient input data to run.
Maybe a complex, high-priced static analysis tool such as Coverity's, or the LLVM compiler, could help.
But I'm not sure and I would prefer manual code review.
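To illustrate the point about input data, here is a minimal sketch. The function names and the hand-rolled hit counters are invented for this example; a real coverage tool would do the counting for you. The idea is only that a function can look unused in a coverage report simply because no test input reached it.

```cpp
#include <map>
#include <string>

// Hypothetical hit counters standing in for what a coverage tool records.
std::map<std::string, int> hits;

int handle_negative(int v) {
    ++hits["handle_negative"];
    return -v;
}

int normalize(int v) {
    ++hits["normalize"];
    // handle_negative() is only reached for negative input, so it shows up
    // as "never executed" if the test data contains no negative values.
    return v < 0 ? handle_negative(v) : v;
}
```

If the test inputs are all positive, `handle_negative` has zero hits even though it is live code, which is why the report still has to be read by someone who knows the program.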
UPDATED
Well, removing only unused variables and unused functions is not hard, though.
UPDATED
After reading the other answers and comments, I'm more strongly convinced that it can't be done.
You have to know the code to get a meaningful code coverage measure, and if you know it that well, manual editing will be faster than preparing, running, and reviewing coverage results.
For the case of unused whole functions (and unused global variables), GCC can actually do most of the work for you provided that you're using GCC and GNU ld.
When compiling the source, use -ffunction-sections and -fdata-sections, then when linking use -Wl,--gc-sections,--print-gc-sections. The linker will now list all the functions that could be removed because they were never called and all the globals that were never referenced.
(Of course, you can also skip the --print-gc-sections part and let the linker remove the functions silently, but keep them in the source.)
Note: this will only find unused complete functions, it won't do anything about dead code within functions. Functions called from dead code in live functions will also be kept around.
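As a rough sketch of what this looks like in practice, assume a translation unit like the one below. The file names dead.cpp and main.o, and all identifiers, are hypothetical; only the compiler and linker flags come from the text above.

```cpp
// dead.cpp: hypothetical file used only for this sketch.
//
// Build with GCC and GNU ld, assuming main() lives in a separate main.o
// and calls entry():
//   g++ -ffunction-sections -fdata-sections -c dead.cpp
//   g++ -Wl,--gc-sections,--print-gc-sections dead.o main.o -o app

int used_helper(int x) { return x * 2; }    // called from entry(), so it is kept

int unused_helper(int x) { return x + 1; }  // never referenced: candidate for removal

int unused_global = 42;                     // never referenced either

int entry() { return used_helper(21); }
```

With each function and global placed in its own section, the linker should report the sections for `unused_helper` and `unused_global` as removable, which gives you a list of names to review in the source.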
Some C++-specific features will also cause problems, in particular virtual functions and global variables with constructors.
In both cases, anything used by a virtual function or a global-variable constructor also has to be kept around.
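A small sketch of both cases, with made-up names (`Registrar`, `legacy_plugin`, `Shape`, `Square`), may make this more concrete. It is an illustration of why the linker has to be conservative, not a description of any particular code base.

```cpp
#include <string>
#include <vector>

// Hypothetical registry filled as a side effect of static initialization.
std::vector<std::string>& registry() {
    static std::vector<std::string> names;
    return names;
}

struct Registrar {
    explicit Registrar(const std::string& name) { registry().push_back(name); }
};

// Nothing else names this object, but its constructor has a visible side
// effect, so the object and everything the constructor uses must be kept.
Registrar legacy_plugin("legacy");

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    double side = 2.0;
    // If any Square is constructed, its vtable references area(), so area()
    // is kept even when nothing in the program ever calls it.
    double area() const override { return side * side; }
};
```

Here the linker cannot tell that `legacy_plugin` is dead weight, because removing it would change what `registry()` contains at startup.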
An additional caveat is that if you're building a shared library, the default settings in GCC will export every function in the shared library, causing it to be "used" as far as the linker is concerned. To fix that you need to set the default to hiding symbols instead of exporting (using e.g. -fvisibility=hidden), and then explicitly select the exported functions that you need to export.
The general problem of whether some function will be called is undecidable. You cannot know in advance, in a general way, whether some function will be called, just as you cannot know whether a Turing machine will ever halt. You can find out whether there is some (static) path from main() to the function you have written, but that doesn't guarantee it will ever be called.
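The static part of that check can be sketched as a plain graph search. The call graph below is a hand-written, hypothetical one (a real tool would have to extract it from the code, and function pointers or virtual calls make that harder); the sketch only shows the reachability step.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// caller -> list of functions it calls (hypothetical, hand-written input)
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function in the graph with no static call path from `root`.
std::set<std::string> unreachable_from(const CallGraph& graph,
                                       const std::string& root) {
    std::set<std::string> seen{root};
    std::queue<std::string> pending;
    pending.push(root);
    while (!pending.empty()) {  // breadth-first search over call edges
        std::string current = pending.front();
        pending.pop();
        auto it = graph.find(current);
        if (it == graph.end()) continue;  // leaf: calls nothing we know of
        for (const auto& callee : it->second) {
            if (seen.insert(callee).second) pending.push(callee);
        }
    }
    std::set<std::string> result;
    for (const auto& entry : graph) {
        if (!seen.count(entry.first)) result.insert(entry.first);
        for (const auto& callee : entry.second) {
            if (!seen.count(callee)) result.insert(callee);
        }
    }
    return result;
}
```

A function that this search reports has no static path from `main` in the given graph. The reverse does not hold: a function that is reachable may still never run, which is exactly the part that cannot be decided in general.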
CppDepend is a commercial tool which can detect unused types, methods and fields, and do much more. It is available for Windows and Linux (but currently has no 64-bit support), and comes with a 2-week trial.
Disclaimer: I don't work there, but I own a license for this tool (as well as NDepend, which is a more powerful alternative for .NET code).
For those who are curious, here is an example built-in (customizable) rule for detecting dead methods, written in CQLinq:
The real answer here is: You can never really know for sure.
At least, for nontrivial cases, you can't be sure you've gotten all of it. Consider the following from Wikipedia's article on unreachable code:
As Wikipedia correctly notes, a clever compiler may be able to catch something like this. But consider a modification:
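The snippets themselves are not shown here. As a sketch only, assuming code in the spirit of the description that follows (a sqrt call on a constant, then on an integer converted with (double)y), the two cases might look something like this. The function names and the exact comparisons are my own illustration, not the Wikipedia code.

```cpp
#include <cmath>

// Hypothetical stand-in for the first case: sqrt of a constant.
bool constant_case_branch_taken() {
    double x = std::sqrt(2.0);
    return x > 5.0;  // never true; a compiler can evaluate sqrt(2.0) itself
}

// Hypothetical stand-in for the modification: sqrt of an integer that is
// only known at run time.
bool runtime_case_branch_taken(int y) {
    double x = std::sqrt((double)y);
    // For every int y, x is 0, at least 1, or NaN (when y < 0),
    // so this condition is never true either.
    return x != 0.0 && x < 1.0;
}
```

Any code guarded by the second condition is just as dead as code guarded by the first, but proving it requires reasoning about the range of sqrt over the integers rather than folding a constant.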
Will the compiler catch this? Maybe. But to do that, it will need to do more than run sqrt against a constant scalar value. It will have to figure out that (double)y will always be an integer (easy), and then understand the mathematical range of sqrt for the set of integers (hard). A very sophisticated compiler might be able to do this for the sqrt function, or for every function in math.h, or for any fixed-input function whose domain it can figure out. This gets very, very complex, and the complexity is basically limitless. You can keep adding layers of sophistication to your compiler, but there will always be a way to sneak in some code that will be unreachable for any given set of inputs.
And then there are the input sets that simply never get entered. Input that would make no sense in real life, or get blocked by validation logic elsewhere. There's no way for the compiler to know about those.
The end result of this is that while the software tools others have mentioned are extremely useful, you're never going to know for sure that you caught everything unless you go through the code manually afterward. Even then, you'll never be certain that you didn't miss anything.
The only real solution, IMHO, is to be as vigilant as possible, use the automation at your disposal, refactor where you can, and constantly look for ways to improve your code. Of course, it's a good idea to do that anyway.
I haven't used it myself, but cppcheck claims to find unused functions. It probably won't solve the complete problem but it might be a start.