Unit tests - The benefit from unit tests with cont

Recently I had an interesting discussion with a colleague about unit tests. We were discussing the point at which maintaining unit tests becomes less productive: when your contracts change.

Perhaps someone can enlighten me on how to approach this problem. Let me elaborate:

So let's say there is a class which does some nifty calculations. The contract says that it should calculate a number, or return -1 when it fails for some reason.

I have contract tests that verify this. And in all my other tests I stub this nifty calculator thingy.

Now I change the contract: whenever it cannot calculate, it throws a CannotCalculateException.

My contract tests will fail, and I will fix them accordingly. But all my mocked/stubbed objects will still use the old contract rules. These tests will succeed, while they should not!
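
A minimal sketch of the situation, assuming hand-written stubs rather than a mocking library. All names here (Calculator, RealCalculator, StaleCalculatorStub, Report) are hypothetical and only illustrate the scenario:

```java
// Hypothetical names throughout; a sketch of the scenario, not real project code.
class StaleStubDemo {
    public static void main(String[] args) {
        // The consumer test that uses the stale stub still passes.
        System.out.println(new Report(new StaleCalculatorStub()).render(1, 0));

        // The same call against the real collaborator now throws.
        try {
            new Report(new RealCalculator()).render(1, 0);
        } catch (CannotCalculateException e) {
            System.out.println("real collaborator threw: " + e.getMessage());
        }
    }
}

class CannotCalculateException extends RuntimeException {
    CannotCalculateException(String message) {
        super(message);
    }
}

interface Calculator {
    /** New contract: returns the result, or throws CannotCalculateException. */
    int calculate(int x, int y);
}

/** Real implementation, already updated to the new contract. */
class RealCalculator implements Calculator {
    @Override
    public int calculate(int x, int y) {
        if (y == 0) {
            throw new CannotCalculateException("cannot divide by zero");
        }
        return x / y;
    }
}

/** Stub that was never updated and still follows the OLD contract (-1 on failure). */
class StaleCalculatorStub implements Calculator {
    @Override
    public int calculate(int x, int y) {
        return -1;
    }
}

/** Consumer that was written against the old contract. */
class Report {
    private final Calculator calculator;

    Report(Calculator calculator) {
        this.calculator = calculator;
    }

    String render(int x, int y) {
        int result = calculator.calculate(x, y);
        return result == -1 ? "n/a" : String.valueOf(result);
    }
}
```

The stubbed path keeps returning "n/a", so nothing in the consumer's unit tests signals that Report no longer handles the real failure mode.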

The question that arises is: how much faith can be placed in unit tests after such a change? The unit tests succeed, but bugs will surface when testing the application. The tests that use this calculator need to be fixed, which costs time, especially since the calculator may be stubbed/mocked in many places.

What do you think about this case? I never thought about it thoroughly. In my opinion, these changes to unit tests would be acceptable. If I did not use unit tests, I would also see such bugs arise during the test phase (found by testers). Yet I am not confident enough to say which will cost more time (or less).

Any thoughts?

Tags: unit-testing design-by-contract

9 answers

神经病院院长

Post #2 · 2019-03-07 19:07

The first issue you raise is the so-called "fragile test" problem. You make a change to your application, and hundreds of tests break because of that change. When this happens, you have a design problem. Your tests have been designed to be fragile. They have not been sufficiently decoupled from the production code. The solution is (as it is in all software problems like this) to find an abstraction that decouples the tests from the production code in such a way that the volatility of the production code is hidden from the tests.

Some simple things that cause this kind of fragility are:

Testing for strings that are displayed. Such strings are volatile because their grammar or spelling may change at the whim of an analyst.
Testing for discrete values (e.g. 3) that should be encoded behind an abstraction (e.g. FULL_TIME).
Calling the same API from many tests. You should wrap the API call in a test function so that when the API changes you can make the change in one place.
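
As a rough sketch of the last point, applied to the calculator from the question: the tests below obtain their stubs from one helper, so the way failure is signalled lives in a single place. The names (CalculatorStubs, Report, Calculator) are hypothetical, and hand-written lambdas stand in for whatever mocking library is actually in use:

```java
// Hypothetical names; a sketch of "wrap the API call in a test function".
class StubHelperDemo {
    public static void main(String[] args) {
        // Tests ask the helper for a failing calculator instead of hard-coding -1.
        System.out.println(new Report(CalculatorStubs.failing()).render(1, 0));
        System.out.println(new Report(CalculatorStubs.returning(42)).render(1, 2));
    }
}

class CannotCalculateException extends RuntimeException {
    CannotCalculateException(String message) {
        super(message);
    }
}

interface Calculator {
    int calculate(int x, int y);
}

/** The single place where tests encode how the calculator behaves. */
final class CalculatorStubs {
    private CalculatorStubs() {
    }

    static Calculator returning(int value) {
        return (x, y) -> value;
    }

    static Calculator failing() {
        // Under the old contract this line would have been: return (x, y) -> -1;
        return (x, y) -> {
            throw new CannotCalculateException("stubbed failure");
        };
    }
}

/** Consumer under test, written against the new contract. */
class Report {
    private final Calculator calculator;

    Report(Calculator calculator) {
        this.calculator = calculator;
    }

    String render(int x, int y) {
        try {
            return String.valueOf(calculator.calculate(x, y));
        } catch (CannotCalculateException e) {
            return "n/a";
        }
    }
}
```

When the contract changes, only CalculatorStubs.failing() has to be edited, and every test that depends on the failure behavior picks up the new rule at once.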

Test design is an important issue that is often neglected by TDD beginners. This often results in fragile tests, which then leads the novices to reject TDD as "unproductive".

The second issue you raised was false positives. You have used so many mocks that none of your tests actually test the integrated system. While testing independent units is a good thing, it is also important to test partial and whole integrations of the system. TDD is not just about unit tests.

Tests should be arranged as follows:

Unit tests provide close to 100% code coverage. They test independent units. They are written by programmers using the programming language of the system.
Component tests cover ~50% of the system. They are written by business analysts and QA. They are written in a language like FitNesse, Selenium, Cucumber, etc. They test whole components, not individual units. They test primarily happy path cases and some highly visible unhappy path cases.
Integration tests cover ~20% of the system. They test small assemblies of components as opposed to the whole system. Also written in FitNesse/Selenium/Cucumber etc. Written by architects.
System tests cover ~10% of the system. They test the whole system integrated together. Again they are written in FitNesse/Selenium/Cucumber etc. Written by architects.
Exploratory manual tests. (See James Bach) These tests are manual but not scripted. They employ human ingenuity and creativity.

Evening l夕情丶

Post #3 · 2019-03-07 19:12

I second Uncle Bob's opinion that the problem is in the design. I would additionally go back one step and check the design of your contracts.

In short

Instead of saying "return -1 for x==0" or "throw CannotCalculateException for x==y", underspecify niftyCalculatorThingy(x,y) with the precondition x!=y && x!=0 in appropriate situations (see below). Your stubs may then behave arbitrarily for these cases, your unit tests must reflect that, and you have maximal modularity, i.e. the liberty to arbitrarily change the behavior of your system under test for all underspecified cases, without the need to change contracts or tests.

Underspecification where appropriate

You can differentiate your statement "-1 when it fails for some reason" according to the following criteria: Is the scenario

1) an exceptional behavior that the implementation can check?
2) within the method's domain/responsibility?
3) an exception that the caller (or someone earlier in the call stack) can recover from/handle in some other way?

If and only if 1) to 3) hold, specify the scenario in the contract (e.g. that EmptyStackException is thrown when calling pop() on an empty stack).

Without 1), the implementation cannot guarantee a specific behavior in the exceptional case. For instance, Object.equals() does not specify any behavior when the condition of reflexivity, symmetry, transitivity & consistency is not met.

Without 2), the Single Responsibility Principle is not met, modularity is broken and users/readers of the code get confused. For instance, Graph transform(Graph original) should not specify that MissingResourceException might be thrown just because, deep down, some cloning via serialization is done.

Without 3), the caller cannot make use of the specified behavior (certain return value/exception). For instance, if the JVM throws an UnknownError.

Pros and Cons

If you do specify cases where 1), 2) or 3) does not hold, you get some difficulties:

a main purpose of a (design by) contract is modularity. This is best achievable if you really separate the responsibilities: When the precondition (the responsibility of the caller) is not met, not specifying the behavior of the implementation leads to maximal modularity - as your example shows.
you don't have any liberty to change the behavior in the future, not even to a more general version of the method which throws exceptions in fewer cases
exceptional behaviors can become quite complex, so the contracts covering them become complex, error prone and hard to understand. For instance: is every situation covered? Which behavior is correct if multiple exceptional preconditions hold?

The downside of underspecification is that (testing) robustness, i.e. the implementation's ability to react appropriately to abnormal conditions, is harder.

As a compromise, I like to use the following contract schema where possible:

<(Semi-)formal PRE- and POST-condition, including exceptional behavior where 1) to 3) hold>

If PRE is not met, the current implementation throws the RuntimeException A, B or C.
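
A sketch of how this schema might look as a Javadoc contract for the calculator from the question. The interface name, the formula and the choice of IllegalArgumentException are assumptions made only so that the precondition x != y && x != 0 has something to protect:

```java
// Hypothetical names and formula; a sketch of the contract schema only.
class ContractDemo {
    public static void main(String[] args) {
        NiftyCalculator real = new DefaultNiftyCalculator();
        System.out.println(real.calculate(5, 3));

        // Outside PRE a stub may do anything, because the contract promises nothing.
        NiftyCalculator stub = (x, y) -> 7;
        System.out.println(stub.calculate(3, 3));
    }
}

interface NiftyCalculator {
    /**
     * PRE:  x != y && x != 0
     * POST: returns 100 / (x * (x - y)), using integer division.
     *
     * If PRE is not met, the behavior is unspecified. The current
     * implementation throws IllegalArgumentException, but callers,
     * stubs and tests must not rely on that.
     */
    int calculate(int x, int y);
}

class DefaultNiftyCalculator implements NiftyCalculator {
    @Override
    public int calculate(int x, int y) {
        if (x == 0 || x == y) {
            // Current behavior only; deliberately not part of the contract.
            throw new IllegalArgumentException("PRE violated: x=" + x + ", y=" + y);
        }
        return 100 / (x * (x - y));
    }
}
```

Because the exceptional case is documented as current behavior rather than promised, the implementation can later switch to a different exception, or handle more inputs, without invalidating any stub or consumer test.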

虎瘦雄心在

Post #4 · 2019-03-07 19:16

Unit tests surely cannot catch all bugs, even in the ideal case of 100% code / functionality coverage. I think that is not to be expected.

If the tested contract changes, I (the developer) should use my brains to update all code (including test code!) accordingly. If I fail to update some mocks, which therefore still produce the old behaviour, that is my fault, not that of the unit tests.

It is similar to the case when I fix a bug and write a unit test for it, but fail to think through (and test) all similar cases, some of which later turn out to be buggy as well.

So yes, unit tests need maintenance just as well as the production code itself. Without maintenance, they decay and rot.

家丑人穷心不美

Post #5 · 2019-03-07 19:16

I look at it this way: when your contract changes, you should treat it like a new contract. Therefore, you should create a whole new set of unit tests for this "new" contract. The fact that you have an existing set of test cases is beside the point.

看我几分像从前

Post #6 · 2019-03-07 19:18

It's better to have to fix unit tests that fail due to intentional code changes than to have no tests to catch the bugs that are eventually introduced by these changes.

When your codebase has good unit test coverage, you may run into many unit test failures that are not due to bugs in the code but to intentional changes to the contracts or to code refactoring.

However, that unit test coverage will also give you the confidence to refactor the code and implement any contract changes. Some tests will fail and will need to be fixed, but other tests will eventually fail due to bugs that you introduced with these changes.

放荡不羁爱自由

Post #7 · 2019-03-07 19:23

Someone asked the same question in the Google Group for the book "Growing Object Oriented Software - Guided by Tests". The thread is titled "Unit-test mock/stub assumptions rots".

Here is J.B. Rainsberger's answer (he is the author of Manning's "JUnit Recipes").
