Dummy data and unit testing strategies in a modular application stack

Posted 2019-03-13 05:02

Question:

How do you manage dummy data used for tests? Keep them with their respective entities? In a separate test project? Load them with a Serializer from external resources? Or just recreate them wherever needed?

We have an application stack with several modules depending on one another, each containing its own entities. Each module has its own tests and needs dummy data to run them.

Now a module that has a lot of dependencies will need a lot of dummy data from the other modules. Those, however, do not publish their dummy objects because they are part of the test resources, so every module has to set up all the dummy objects it needs again and again.

Also: most fields in our entities are not nullable, so even running transactions against the object layer requires them to contain some value, usually with further constraints such as uniqueness or length.
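For illustration only, a hypothetical entity with the kind of constraints described above might look like this (the field names are made up, not our actual model):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class Person {

    @Id
    private Long id;

    // not nullable and unique, so every piece of dummy data needs a distinct value
    @Column(nullable = false, unique = true, length = 64)
    private String email;

    // not nullable with a length limit
    @Column(nullable = false, length = 128)
    private String name;

    // getters and setters ...
}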

Is there a best practice way out of this or are all solutions compromises?


More Detail

Our stack looks something like this:

One Module:

src/main/java      --> packaged into the JAR (.../entities/*.java contains the entities)
src/main/resources --> packaged into the JAR
src/test/java      --> contains the dummy object setup, will NOT be packaged
src/test/resources --> not packaged

We use Maven to handle dependencies.

module example:

  • Module A has some dummy objects
  • Module B needs its own objects AND the same as Module A

Option a)

A test module T could hold all dummy objects and provide them in test scope (so the loaded dependencies are not packaged into the JAR) to the tests in all modules. Will that work? Meaning: if I declare T as a test-scoped dependency of A and run install on A, will the resulting artifact NOT contain references introduced by T, and especially none to B? Then, however, A would know about B's data model.

Option b)

Module A provides the dummy objects somewhere in src/main/java../entities/dummy, allowing B to use them while A does not know about B's dummy data.

Option c)

Every module contains external resources which are serialized dummy objects. They can be deserialized by the test environment that needs them, because it already has a dependency on the module they belong to. This requires every module to create and serialize its dummy objects, though, and how would one do that? If it is done in another unit test, it introduces dependencies between unit tests, which should never happen; if it is done with a script, it will be hard to debug and not flexible.

Option d)

Use a mock framework and assign the required fields manually in each test as needed. The problem here is that most fields in our entities are not nullable, so setters or constructors would have to be called anyway, which would put us right back where we started.
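For context, option d would look roughly like the following sketch, assuming Mockito and hypothetical getName/getEmail accessors on the entity. The mock avoids constructors and setters entirely, but such a stub can of course not be persisted through Hibernate:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class MockFrameworkSketch {

    @Test
    public void stubbedPerson() {
        // no constructor or setter calls; the not-null constraints only
        // bite once the object has to go through the persistence layer
        Person person = mock(Person.class);
        when(person.getName()).thenReturn("Test Person");
        when(person.getEmail()).thenReturn("test@example.org");
    }
}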

What we don't want

We don't want to set up a static database with static data, because the structure of the required objects will change constantly: a lot right now, a little later. So we want Hibernate to set up all tables and columns and fill them with data at unit-testing time. A static database would also introduce a lot of potential errors and test interdependencies.
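A minimal sketch of the "Hibernate sets everything up at test time" part, assuming a plain programmatic Hibernate configuration (the entity classes are placeholders and the connection settings are omitted):

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class TestSessionFactory {

    public static SessionFactory create() {
        return new Configuration()
                // drop and recreate all tables for each test run
                .setProperty("hibernate.hbm2ddl.auto", "create-drop")
                .addAnnotatedClass(Person.class)
                .addAnnotatedClass(Product.class)
                .addAnnotatedClass(Order.class)
                .buildSessionFactory();
    }
}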


Are my thoughts going in the right direction? What's the best practice to deal with tests that require a lot of data? We'll have several interdependent modules that will require objects filled with some kind of data from several other modules.


EDIT

Some more info on how we're doing it right now in response to the second answer:

So for simplicity, we have three modules: Person, Product, Order. Person will test some manager methods using a MockPerson object:

(in person/src/test/java:)

public class MockPerson {

    public Person mockPerson(/* parameters... */) {
        Person mockedPerson = new Person();
        // populate all required (non-nullable) fields here...
        return mockedPerson;
    }
}

public class TestPerson {

    @Inject
    private MockPerson mockPerson;

    @Test
    public void testCreate() {
        Person person = mockPerson.mockPerson(/* ... */);
        // Asserts...
    }
}

The MockPerson class will not be packaged.

The same applies for the Product Tests:

(in product/src/test/java:)

public class MockProduct { ... }

public class TestProduct {

    @Inject
    private MockProduct mockProduct;
    // ...
}

MockProduct is needed but will not be packaged.

Now the Order tests require MockPerson and MockProduct, so we currently need to create both, as well as MockOrder, to test Order.

(in order/src/test/java:)

These are duplicates and need to be changed every time Person or Product changes:

public class MockProduct { ... }
public class MockPerson { ... }

This is the only class that should be here:

public class MockOrder { ... }

public class TestOrder {

    @Inject
    private order.MockPerson mockPerson;
    @Inject
    private order.MockProduct mockProduct;
    @Inject
    private order.MockOrder mockOrder;

    @Test
    public void testCreate() {
        Order order = mockOrder.mockOrder(mockPerson.mockPerson(), mockProduct.mockProduct());
        // Asserts...
    }
}

The problem is that we now have to update both person.MockPerson and order.MockPerson whenever Person changes.

Isn't it better to just publish the mocks with the JAR, so that every other test that has the dependency anyway can simply call Mock.mock and get a nicely set-up object? Or is this the dark side - the easy way?

Answer 1:

This may or may not apply - I'm curious to see an example of your dummy objects and the related setup code, to get a better idea of whether it fits your situation. But what I've done in the past is to not introduce this kind of code into the tests at all. As you describe, it's hard to produce, debug, and especially package and maintain.

What I've usually done (and AFAICT this is the best practice in Java) is use the Test Data Builder pattern, as described by Nat Pryce in his Test Data Builders post.

If you think this is somewhat relevant, check these out:

  • Does a framework like Factory Girl exist for Java?
  • make-it-easy, Nat's framework that implements this pattern.
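To illustrate the idea with the entities from the question, a minimal test data builder might look like the sketch below (the name/email fields, their defaults, and the entity's setters are assumptions, not your actual model):

public class PersonBuilder {

    // sensible defaults so a test only overrides what it actually cares about
    private String name = "Jane Doe";
    private String email = "jane@example.org";

    public PersonBuilder withName(String name) {
        this.name = name;
        return this;
    }

    public PersonBuilder withEmail(String email) {
        this.email = email;
        return this;
    }

    public Person build() {
        Person person = new Person();   // assumes standard setters on the entity
        person.setName(name);
        person.setEmail(email);
        return person;
    }
}

A test then just calls new PersonBuilder().withName("Alice").build(), and an OrderBuilder can delegate to PersonBuilder and ProductBuilder for its defaults, which is exactly what removes the duplicated Mock* classes.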


Answer 2:

Well, I read everything written so far carefully, and it is a very good question. I see the following approaches to the problem:

  1. Set up a (static) test database;
  2. Each test has its own setup code that creates (dynamic) test data prior to running the unit tests;
  3. Use dummy or mock objects. All modules know all dummy objects, so there are no duplicates;
  4. Reduce the scope of the unit test.

The first option is pretty straightforward but has many drawbacks: somebody has to rebuild the database once in a while when unit tests "mess it up", and if there are changes in the data module, somebody has to make the corresponding changes to the test data, so there is a lot of maintenance overhead. Not to mention that generating this data in the first place may be tricky. See also the second option.

With the second option, you write test code that, prior to testing, invokes some of your "core" business methods to create the entities. Ideally your test code should be independent from the production code, but in this case you will end up with duplicate code that you have to maintain twice. Sometimes it is good to split your production business methods in order to have an entry point for your unit tests (I make such methods private and use reflection to invoke them; a remark on the method is needed, and refactoring becomes a bit tricky). The main drawback is that if you must change your "core" business methods, it suddenly affects all of your unit tests and you can't test. So developers should be aware of this and not make partial commits to the "core" business methods unless they work. With any change in this area, you should also keep in mind how it will affect your unit tests.

Sometimes it is also impossible to reproduce all the required data dynamically, usually because of third-party APIs: for example, you call another application with its own DB, from which you are required to use some keys. These keys (with their associated data) are created manually through the third-party application. In such a case, this data, and only this data, should be created statically. For example, you create 10000 keys starting from 300000.
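A rough sketch of that second option, assuming JUnit 4 and a hypothetical PersonDao from the production code; the data is created through the normal object layer before each test and removed afterwards:

import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class TestPersonManager {

    private PersonDao personDao;   // hypothetical DAO from the production code
    private Person testPerson;

    @Before
    public void createTestData() {
        personDao = new PersonDao();
        testPerson = new Person();
        testPerson.setName("Test Person");       // satisfy the not-null constraints
        testPerson.setEmail("test@example.org");
        personDao.save(testPerson);
    }

    @After
    public void removeTestData() {
        personDao.delete(testPerson);
    }

    @Test
    public void testSomething() {
        // exercise the unit under test against the freshly created data
    }
}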

The third option should be good, and options a) and d) sound pretty good to me. For your dummy objects you can use a mock framework or not; the mock framework is only there to help you. I don't see a problem with all of your units knowing all of your entities.

The fourth option means that you redefine what a "unit" is in your unit test. When you have a couple of modules with interdependencies, it can be difficult to test each module in isolation. This approach says that what we originally tested was an integration test, not a unit test. So we split our methods and extract small "units of work" that receive all their dependencies on other modules as parameters. These parameters can (hopefully) easily be mocked. The main drawback of this approach is that you don't test all of your code, only, so to speak, the "focal points". You need to do integration testing separately (usually done by a QA team).



Answer 3:

I'm wondering if you couldn't solve your problem by changing your testing approach.

Unit testing a module which depends on other modules, and because of that on the test data of other modules, is not a real unit test!

What if you injected a mock for each of the dependencies of your module under test, so you could test it in complete isolation? Then you wouldn't need to set up a complete environment where each depending module has the data it needs; you only set up the data for the module you're actually testing.
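A sketch of what that could look like for the Order module from the question, assuming Mockito and hypothetical PersonService/ProductService collaborators injected into an OrderManager:

import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class TestOrderManagerInIsolation {

    @Test
    public void testCreateOrder() {
        PersonService personService = mock(PersonService.class);
        ProductService productService = mock(ProductService.class);

        when(personService.findById(1L)).thenReturn(new Person());
        when(productService.findById(2L)).thenReturn(new Product());

        OrderManager manager = new OrderManager(personService, productService);
        Order order = manager.createOrder(1L, 2L);

        // assert only the Order module's behaviour; Person and Product are stubs
    }
}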

If you imagine a pyramid, the base would be your unit tests, above that you have functional tests, and at the top you have some scenario tests (or, as Google calls them, small, medium, and big tests).

You will have a huge number of unit tests that can cover every code path because the mocked dependencies are completely configurable. Then you can trust your individual parts, and the only thing your functional and scenario tests need to do is check that each module is wired correctly to the others.

This means that your module test data is not shared by all your tests but only by a few that are grouped together.

The Builder pattern, as mentioned by cwash, will definitely help in your functional tests. We are using a .NET builder that is configured to build a complete object tree and generate default values for each property, so when we save this to the database all required data is present.