While converting my current project to TDD, I've noticed something.
class Foo {
    public event EventHandler Test;

    public void SomeFunction() {
        // snip...
        Test(this, new EventArgs());
    }
}
There are two dangers I can see when testing this code and relying on a code coverage tool to determine if you have enough tests.
- You should be testing whether the Test event gets fired. Code coverage tools alone won't tell you if you forget this.
- I'll get to the other in a second.
To this end, I added an event handler to my startup function so that it looked like this:
Foo test;
int eventCount;

[Startup]
public void Init() {
    test = new Foo();
    // snip...
    eventCount = 0;
    test.Test += MyHandler;
}

void MyHandler(object sender, EventArgs e) { eventCount++; }
Now I can simply check eventCount to see how many times my event was called, if it was called at all. Pretty neat. Only now we've let through an insidious little bug that will never be caught by any test: SomeFunction() doesn't check whether the event has any handlers before raising it. Raising an event with no subscribers dereferences a null delegate and throws a NullReferenceException, which none of our tests will ever catch because they all attach a handler by default. But again, a code coverage tool will still report full coverage.
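A minimal sketch of both halves of the fix, assuming an NUnit-style [Test] attribute (use whatever attribute your framework provides):

// A test with NO handler attached exposes the bug: with the code above,
// raising the event dereferences a null delegate.
[Test]
public void SomeFunction_WithNoSubscribers_DoesNotThrow() {
    var foo = new Foo(); // deliberately no handler attached
    foo.SomeFunction();  // throws NullReferenceException until Foo is fixed
}

// The fix inside Foo: only raise the event when someone is listening.
public void SomeFunction() {
    // snip...
    EventHandler handler = Test;   // copy to a local to avoid a race
    if (handler != null)
        handler(this, EventArgs.Empty);
    // or, in C# 6 and later: Test?.Invoke(this, EventArgs.Empty);
}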
This is just my "real world example" at hand, but it occurs to me that plenty more of these sorts of errors can slip through. Even with 100% 'coverage' of your code, that still doesn't translate to 100% tested. Should we take the coverage reported by such a tool with a grain of salt when writing tests? Are there other sorts of tools that would catch these holes?
I wouldn't say "take it with a grain of salt" (there is a lot of utility to code coverage), but to quote myself:
TDD and code coverage are not a panacea:

- Even with 100% block coverage, there will still be errors in the conditions that choose which blocks to execute.
- Even with 100% block coverage + 100% arc coverage, there will still be errors in straight-line code.
- Even with 100% block coverage + 100% arc coverage + 100% error-free-for-at-least-one-path straight-line code, there will still be input data that executes paths/loops in ways that exhibit more bugs.
(from here)
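To make the first bullet concrete, here's a small hypothetical example: suppose the spec says orders of 10 items or more get a discount. The tests execute both blocks, yet a boundary bug in the condition survives:

public decimal GetDiscount(int itemCount) {
    if (itemCount > 10)   // bug: the spec says ">= 10"
        return 0.10m;
    return 0m;
}

// Tests with itemCount = 5 and itemCount = 15 execute both blocks,
// so block coverage is 100%, but neither test probes the boundary at 10.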
While there may be some tools that can offer improvement, I think the higher-order bit is that code coverage is only part of an overall testing strategy to ensure product quality.
Less than 100% code coverage is bad, but it doesn't follow that 100% code coverage is good. It's a necessary but not sufficient condition, and should be treated as such.
Also note that there's a difference between code coverage and path coverage:
void bar(Foo f) {
    if (f.isGreen()) accountForGreenness();
    if (f.isBig()) accountForBigness();
    finishBar(f);
}
If you pass a big, green Foo into that code as a test case, you get 100% code coverage. But for all you know a big, red Foo would crash the system because accountForBigness incorrectly assumes that some pointer is non-null, that is only made non-null by accountForGreenness. You didn't have 100% path coverage, because you didn't cover the path which skips the call to accountForGreenness but not the call to accountForBigness.
It's also possible to get 100% branch coverage without 100% path coverage. In the above code, one call with a big, green Foo and one with a small, red Foo gives the former but still doesn't catch the big, red bug.
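Here's a hedged NUnit-style sketch of what exercising every path would look like (the Foo constructor taking big/green flags is an assumption, not part of the original code):

[TestCase(true,  true)]   // big, green:  both branches taken
[TestCase(true,  false)]  // big, red:    the path hiding the bug above
[TestCase(false, true)]   // small, green
[TestCase(false, false)]  // small, red
public void Bar_HandlesAllCombinations(bool isBig, bool isGreen) {
    bar(new Foo(isBig, isGreen)); // should not throw for any combination
}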
Not that the bar example above is the best OO design ever, but it's rare to see code where code coverage implies path coverage. And even if it does imply that in your code, it doesn't imply that all the code or all the paths in the libraries or system code your program could possibly use are covered. You would in principle need 100% coverage of all the possible states of your program to do that (and hence ensure that, for example, you never pass invalid parameters that trigger error-handling code in the library or system that is not otherwise reached), which is generally infeasible.
Should we take the coverage reported by such a tool with a grain of salt when writing tests?
Absolutely. The coverage tool only tells you what proportion of lines in your code were actually run during tests. It doesn't say anything about how thoroughly those lines were tested. Some lines of code need to be tested only once or twice, but some need to be tested over a wide range of inputs. Coverage tools can't tell the difference.
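As a tiny illustration, a single test case gives this method 100% line coverage while saying nothing about the inputs that actually matter:

public int Divide(int a, int b) {
    return a / b; // one call with (6, 3) covers this line completely,
                  // yet b == 0 and int.MinValue / -1 both still throw
}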
Also, 100% test coverage as such does not mean much if the test driver merely exercised the code without making meaningful assertions about the correctness of the results.
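For instance, this deliberately useless test gives the Foo class from the question full coverage without verifying anything:

[Test]
public void SomeFunction_Runs() {
    var foo = new Foo();
    foo.Test += (s, e) => { };  // subscribe so the raise doesn't blow up
    foo.SomeFunction();         // every line runs; the coverage tool is happy
    // ...but with no assertions, a completely wrong implementation passes too
}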
Coverage is only really useful for identifying code that hasn't been tested at all. It doesn't tell you much about code that has been covered.
Yes, and this is the primary difference between "line coverage" and "path coverage". In practice, you can't really measure path coverage. Like compile-time checks, unit tests, and static analysis, line coverage is just one more tool to use in your quest for quality code.
Testing is absolutely necessary. But the implementation also has to stay consistent with what the tests cover.
If you implement something in a way your tests don't exercise, that is where problems may happen.
Problems may also happen when the data you test against doesn't resemble the data that will actually flow through your application.
So yes, code coverage is necessary. But it is not as valuable as real testing performed by real people.