Our management has recently been talking to some people selling C++ static analysis tools. Of course the sales people say they will find tons of bugs, but I'm skeptical.
How do such tools work in the real world? Do they find real bugs? Do they help more junior programmers learn?
Are they worth the trouble?
I've used them - PC-Lint, for example, and they did find some things. Typically they are configurable and you can tell them 'stop bothering me about xyz', if you determine that xyz really isn't an issue.
I don't know that they help junior programmers learn a lot, but they can be used as a mechanism to help tighten up the code.
I've found that a second set of (skeptical, probing for bugs) eyes and unit testing is typically where I've seen more bug catching take place.
I think static code analysis is well worth, if you are using the right tool. Recently, we tried the Coverity Tool ( bit expensive). Its awesome, it brought out many critical defects,which were not detected by lint or purify.
Also we found that, we could have avoided 35% of the customer Field defects, if we had used coverity earlier.
Now, Coverity is rolled out in my company and when ever we get a customer TR in old software version, we are running coverity against it to bring out the possible canditates for the fault before we start the analysis in a susbsytem.
As a couple people remarked, if you run a static analysis tool full bore on most applications, you will get a lot of warnings, some of them may be false positives or may not lead to an exploitable defect. It is that experience that leads to a perception that these types of tools are noisy and perhaps a waste of time. However, there are warnings that will highlight a real and potentially dangerous defects that can lead to security, reliability, or correctness issues and for many teams, those issues are important to fix and may be nearly impossible to discover via testing.
That said, static analysis tools can be profoundly helpful, but applying them to an existing codebase requires a little strategy. Here are a couple of tips that might help you..
1) Don't turn everything on at once, decide on an initial set of defects, turn those analyses on and fix them across your code base.
2) When you are addressing a class of defects, help your entire development team to understand what the defect is, why it's important and how to code to defend against that defect.
3) Work to clear the codebase completely of that class of defects.
4) Once this class of issues have been fixed, introduce a mechanism to stay in that zero issue state. Luckily, it is much easier make sure you are not re-introducing an error if you are at a baseline has no errors.
Static code analysis is almost always worth it. The issue with an existing code base is that it will probably report far too many errors to make it useful out of the box.
I once worked on a project that had 100,000+ warnings from the compiler... no point in running Lint tools on that code base.
Using Lint tools "right" means buying into a better process (which is a good thing). One of the best jobs I had was working at a research lab where we were not allowed to check in code with warnings.
So, yes the tools are worth it... in the long term. In the short term turn your compiler warnings up to the max and see what it reports. If the code is "clean" then the time to look at lint tools is now. If the code has many warnings... prioritize and fix them. Once the code has none (or at least very few) warnings then look at Lint tools.
So, Lint tools are not going to help a poor code base, but once you have a good codebase it can help you keep it good.
Edit:
In the case of the 100,000+ warning product, it was broken down into about 60 Visual Studio projects. As each project had all of the warnings removed it was changed so that the warnings were errors, that prevented new warnings from being added to projects that had been cleaned up (or rather it let my co-worker righteously yell at any developer that checked in code without compiling it first :-)
You are probably going to have to deal with a good amount of false positives, particularly if your code base is large.
Most static analysis tools work using "intra-procedural analysis", which means that they consider each procedure in isolation, as opposed to "whole-program analysis" which considers the entire program.
They typically use "intra-procedural" analysis because "whole-program analysis" has to consider many paths through a program that won't actually ever happen in practice, and thus can often generate false positive results.
Intra-procedural analysis eliminates those problems by just focusing on a single procedure. In order to work, however, they usually need to introduce an "annotation language" that you use to describe meta-data for procedure arguments, return types, and object fields. For C++ those things are usually implemented via macros that you decorate things with. The annotations then describe things like "this field is never null", "this string buffer is guarded by this integer value", "this field can only be accessed by the thread labeled 'background'", etc.
The analysis tool will then take the annotations you supply and verify that the code you wrote actually conforms to the annotations. For example, if you could potentially pass a null off to something that is marked as not null, it will flag an error.
In the absence of annotations, the tool needs to assume the worst, and so will report a lot of errors that aren't really errors.
Since it appears you are not using such a tool already, you should assume you are going to have to spend a considerably amount of time annotating your code to get rid of all the false positives that will initially be reported. I would run the tool initially, and count the number of errors. That should give you an estimate of how much time you will need to adopt it in your code base.
Wether or not the tool is worth it depends on your organization. What are the kinds of bugs you are bit by the most? Are they buffer overrun bugs? Are they null-dereference or memory-leak bugs? Are they threading issues? Are they "oops we didn't consider that scenario", or "we didn't test a Chineese version of our product running on a Lithuanian version of Windows 98?".
Once you figure out what the issues are, then you should know if it's worth the effort.
The tool will probably help with buffer overflow, null dereference, and memory leak bugs. There's a chance that it may help with threading bugs if it has support for "thread coloring", "effects", or "permissions" analysis. However, those types of analysis are pretty cutting-edge, and have HUGE notational burdens, so they do come with some expense. The tool probably won't help with any other type of bugs.
So, it really depends on what kind of software you write, and what kind of bugs you run into most frequently.
This rather amazing result was accomplished using Elsa and Oink.
http://www.cs.berkeley.edu/~daw/papers/fmtstr-plas07.pdf
"Large-Scale Analysis of Format String Vulnerabilities in Debian Linux" by Karl Chen, David Wagner, UC Berkeley, {quarl, daw}@cs.berkeley.edu
Abstract:
Format-string bugs are a relatively common security vulnerability, and can lead to arbitrary code execution. In collaboration with others, we designed and implemented a system to eliminate format string vulnerabilities from an entire Linux distribution, using typequalifier inference, a static analysis technique that can find taint violations. We successfully analyze 66% of C/C++ source packages in the Debian 3.1 Linux distribution. Our system finds 1,533 format string taint warnings. We estimate that 85% of these are true positives, i.e., real bugs; ignoring duplicates from libraries, about 75% are real bugs. We suggest that the technology exists to render format string vulnerabilities extinct in the near future.
Categories and Subject Descriptors D.4.6 [Operating Systems]: Security and Protection—Invasive Software; General Terms: Security, Languages; Keywords: Format string vulnerability, Large-scale analysis, Typequalifier inference