How can I know which parts in the code are never u

Posted 2019-01-05 07:06

I have legacy C++ code that I'm supposed to remove unused code from. The problem is that the code base is large.

How can I find out which code is never called/never used?

18 answers
SAY GOODBYE
#2 · 2019-01-05 07:19

I don't think it can be done automatically.

Even with code coverage tools, you need to provide sufficient input data to run.

Maybe a very complex and expensive static analysis tool, such as Coverity's or the one in the LLVM compiler, could help.

But I'm not sure and I would prefer manual code review.

UPDATED

Well, removing only unused variables and unused functions is not hard, though.

UPDATED

After reading the other answers and comments, I'm more strongly convinced that it can't be done.

You have to know the code to get a meaningful code coverage measurement, and if you know it that well, manual editing will be faster than preparing, running, and reviewing coverage results.

Juvenile、少年°
#3 · 2019-01-05 07:20

For the case of unused whole functions (and unused global variables), GCC can actually do most of the work for you provided that you're using GCC and GNU ld.

When compiling the source, use -ffunction-sections and -fdata-sections, then when linking use -Wl,--gc-sections,--print-gc-sections. The linker will now list all the functions that could be removed because they were never called and all the globals that were never referenced.

(Of course, you can also skip the --print-gc-sections part and let the linker remove the functions silently, but keep them in the source.)

Note: this will only find unused complete functions; it won't do anything about dead code within functions. Functions called from dead code in live functions will also be kept around.
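As a rough illustration of the flags above, here is a minimal sketch. The file name dead_demo.cpp, the function names, and the separate main.cpp mentioned in the comments are all made up for this example; GCC with GNU ld is assumed, and the exact wording of the linker's report may differ between versions.

```cpp
// dead_demo.cpp (hypothetical file name)
//
// Assumed build steps, with a separate main.cpp that only calls run_demo():
//   g++ -ffunction-sections -fdata-sections -c dead_demo.cpp main.cpp
//   g++ -Wl,--gc-sections,--print-gc-sections dead_demo.o main.o -o demo
//
// Each function and global below lands in its own section, so the linker
// can report the ones that nothing references.

int used_helper(int x) { return x + 1; }

// Never called from anywhere: a candidate for removal.
int unused_helper(int x) { return x * 2; }

// Never read or written from anywhere: also a candidate.
int unused_global = 42;

// Reachable from main(), so this function and used_helper() are kept.
int run_demo() { return used_helper(41); }
```

With this layout, the linker's list would be expected to include the sections for unused_helper and unused_global, but not those for used_helper or run_demo.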

Some C++-specific features will also cause problems, in particular:

  • Virtual functions. Without knowing which subclasses exist and which are actually instantiated at run time, you can't know which virtual functions you need to exist in the final program. The linker doesn't have enough information about that so it will have to keep all of them around.
  • Globals with constructors, and their constructors. In general, the linker can't know that the constructor for a global doesn't have side effects, so it must run it. Obviously this means the global itself also needs to be kept.

In both cases, anything used by a virtual function or a global-variable constructor also has to be kept around.
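Both cases can be sketched in a few lines. All names here (Shape, Square, Registrar, registry, area_of) are invented for illustration; the point is only that nothing mentions square_registrar by name and Square::area is reached only through the vtable, yet a linker has to keep both.

```cpp
#include <string>
#include <vector>

// Function-local static, so it is ready before any global constructor uses it.
std::vector<std::string>& registry() {
    static std::vector<std::string> names;
    return names;
}

struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

struct Square : Shape {
    explicit Square(double s) : side(s) {}
    // Called only through a Shape reference or pointer, never by name.
    double area() const override { return side * side; }
    double side;
};

// A global whose constructor has a side effect: it records a name.
struct Registrar {
    explicit Registrar(const char* name) { registry().push_back(name); }
};

// Never referenced again, but its constructor must still run at startup.
static Registrar square_registrar("square");

// The call target is decided at run time, not visible to the linker.
double area_of(const Shape& s) { return s.area(); }
```

Everything that Registrar's constructor or Square::area touches (here registry() and the vector it owns) is kept alive along with them.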

An additional caveat is that if you're building a shared library, the default settings in GCC will export every function in the shared library, causing it to be "used" as far as the linker is concerned. To fix that, you need to change the default from exporting symbols to hiding them (using e.g. -fvisibility=hidden), and then explicitly mark the functions that you need to export.
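A minimal sketch of that setup follows. The file name, the MYLIB_API macro, and the function names are assumptions made for the example; the visibility attribute itself is GCC's.

```cpp
// mylib.cpp (hypothetical library source)
//
// Assumed build step:
//   g++ -shared -fPIC -fvisibility=hidden -ffunction-sections -Wl,--gc-sections mylib.cpp -o libmylib.so

// MYLIB_API is a made-up macro name wrapping GCC's visibility attribute.
#if defined(__GNUC__)
#define MYLIB_API __attribute__((visibility("default")))
#else
#define MYLIB_API
#endif

// Hidden by -fvisibility=hidden; kept only because mylib_answer() uses it.
int internal_used() { return 40; }

// Hidden and unreferenced, so the linker is free to discard it.
int internal_unused() { return -1; }

// Explicitly exported: part of the library's public interface.
MYLIB_API int mylib_answer() { return internal_used() + 2; }
```

Without -fvisibility=hidden, internal_unused would be exported too and would count as used.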

叛逆
#4 · 2019-01-05 07:20

The general problem of whether some function will be called is undecidable. You cannot know in advance, in a general way, whether some function will be called, just as you cannot know whether a Turing machine will ever halt. You can determine statically whether there is some path from main() to the function you have written, but that doesn't guarantee that it will ever be called.
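A small sketch of the gap between "a static path exists" and "it is actually called". The names (process, fallback, reaches_one) and the step limit of 1000 are invented for this example. A call graph shows an edge from process() to fallback(), but whether that edge is ever taken depends on the values passed in at run time.

```cpp
#include <cstdint>

int fallback_calls = 0;

// Statically reachable: process() contains a call to it.
void fallback() { ++fallback_calls; }

// True if the Collatz sequence starting at n hits 1 within max_steps.
bool reaches_one(std::uint64_t n, unsigned max_steps) {
    for (unsigned i = 0; i < max_steps && n != 1; ++i) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    }
    return n == 1;
}

// Calls fallback() only for inputs whose sequence does not reach 1 in time.
void process(std::uint64_t n) {
    if (!reaches_one(n, 1000)) {
        fallback();
    }
}
```

Here process(27) never reaches fallback(), while process(0) does, because 0 maps to itself forever. If the program never passes 0, the function is live on paper and dead in practice.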

成全新的幸福
#5 · 2019-01-05 07:21

CppDepend is a commercial tool which can detect unused types, methods and fields, and do much more. It is available for Windows and Linux (but currently has no 64-bit support), and comes with a 2-week trial.

Disclaimer: I don't work there, but I own a license for this tool (as well as NDepend, which is a more powerful alternative for .NET code).

For those who are curious, here is an example built-in (customizable) rule for detecting dead methods, written in CQLinq:

// <Name>Potentially dead Methods</Name>
warnif count > 0
// Filter procedure for methods that shouldn't be considered as dead
let canMethodBeConsideredAsDeadProc = new Func<IMethod, bool>(
    m => !m.IsPublic &&       // Public methods might be used by client applications of your Projects.
         !m.IsEntryPoint &&            // Main() method is not used by-design.
         !m.IsClassConstructor &&      
         !m.IsVirtual &&               // Only check for non virtual method that are not seen as used in IL.
         !(m.IsConstructor &&          // Don't take account of protected ctors that might be called by derived ctors.
           m.IsProtected) &&
         !m.IsGeneratedByCompiler
)

// Get methods unused
let methodsUnused = 
   from m in JustMyCode.Methods where 
   m.NbMethodsCallingMe == 0 && 
   canMethodBeConsideredAsDeadProc(m)
   select m

// Dead methods = methods used only by unused methods (recursive)
let deadMethodsMetric = methodsUnused.FillIterative(
   methods => // Unique loop, just to let a chance to build the hashset.
              from o in new[] { new object() }
              // Use a hashset to make Intersect calls much faster!
              let hashset = methods.ToHashSet()
              from m in codeBase.Application.Methods.UsedByAny(methods).Except(methods)
              where canMethodBeConsideredAsDeadProc(m) &&
                    // Select methods called only by methods already considered as dead
                    hashset.Intersect(m.MethodsCallingMe).Count() == m.NbMethodsCallingMe
              select m)

from m in JustMyCode.Methods.Intersect(deadMethodsMetric.DefinitionDomain)
select new { m, m.MethodsCallingMe, depth = deadMethodsMetric[m] }
forever°为你锁心
#6 · 2019-01-05 07:22

The real answer here is: You can never really know for sure.

At least, for nontrivial cases, you can't be sure you've gotten all of it. Consider the following from Wikipedia's article on unreachable code:

double x = sqrt(2);
if (x > 5)
{
  doStuff();
}

As Wikipedia correctly notes, a clever compiler may be able to catch something like this. But consider a modification:

int y;
cin >> y;
double x = sqrt((double)y);

if (x != 0 && x < 1)
{
  doStuff();
}

Will the compiler catch this? Maybe. But to do that, it will need to do more than run sqrt against a constant scalar value. It will have to figure out that (double)y will always be an integer (easy), and then understand the mathematical range of sqrt for the set of integers (hard). A very sophisticated compiler might be able to do this for the sqrt function, or for every function in math.h, or for any fixed-input function whose domain it can figure out. This gets very, very complex, and the complexity is basically limitless. You can keep adding layers of sophistication to your compiler, but there will always be a way to sneak in some code that will be unreachable for any given set of inputs.

And then there are the input sets that simply never get entered. Input that would make no sense in real life, or get blocked by validation logic elsewhere. There's no way for the compiler to know about those.
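To make the validation case concrete, here is a hypothetical sketch (parse_age, category, and classify are invented names, and the 0 to 150 range is an assumption). If the two functions live in different translation units, or the rule is enforced only by convention, a tool looking at category() alone has no way to tell that its first branch can never run.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical validation step that runs before anything else sees the value.
int parse_age(int raw) {
    if (raw < 0 || raw > 150) {
        throw std::out_of_range("age out of range");
    }
    return raw;
}

// Expects an already-validated age.
std::string category(int age) {
    if (age < 0) {
        return "invalid";  // dead in practice if every caller validates first
    }
    return age < 18 ? "minor" : "adult";
}

// The only caller in this sketch, and it always validates.
std::string classify(int raw) { return category(parse_age(raw)); }
```

The "invalid" branch is perfectly reachable as far as category() is concerned; only knowledge of every caller shows that it never executes.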

The end result of this is that while the software tools others have mentioned are extremely useful, you're never going to know for sure that you caught everything unless you go through the code manually afterward. Even then, you'll never be certain that you didn't miss anything.

The only real solution, IMHO, is to be as vigilant as possible, use the automation at your disposal, refactor where you can, and constantly look for ways to improve your code. Of course, it's a good idea to do that anyway.

Evening l夕情丶
#7 · 2019-01-05 07:25

I haven't used it myself, but cppcheck claims to find unused functions. It probably won't solve the complete problem, but it might be a start.
