Database data needed in integration tests; created

2019-01-21 00:31发布

This question is more or less programming language agnostic. However as I'm mostly into Java these days that's where I'll draw my examples from. I'm also thinking about the OOP case, so if you want to test a method you need an instance of that methods class.

A core rule for unit tests is that they should be autonomous, and that can be achieved by isolating a class from its dependencies. There are several ways to do it and it depends on if you inject your dependencies using IoC (in the Java world we have Spring, EJB3 and other frameworks/platforms which provide injection capabilities) and/or if you mock objects (for Java you have JMock and EasyMock) to separate a class being tested from its dependencies.

If we need to test groups of methods in different classes* and see that they are well integration, we write integration tests. And here is my question!

  • At least in web applications, state is often persisted to a database. We could use the same tools as for unit tests to achieve independence from the database. But in my humble opinion I think that there are cases when not using a database for integration tests is mocking too much (but feel free to disagree; not using a database at all, ever, is also a valid answer as it makes the question irrelevant).
  • When you use a database for integration tests, how do you fill that database with data? I can see two approaches:
    • Store the database contents for the integration test and load it before starting the test. If it's stored as an SQL dump, a database file, XML or something else would be interesting to know.
    • Create the necessary database structures by API calls. These calls are probably split up into several methods in your test code and each of these methods may fail. It could be seen as your integration test having dependencies on other tests.

How are you making certain that database data needed for tests is there when you need it? And why did you choose the method you choose?

Please provide an answer with a motivation, as it's in the motivation the interesting part lies. Remember that just saying "It's best practice!" isn't a real motivation, it's just re-iterating something you've read or heard from someone. For that case please explain why it's best practice.

*I'm including one method calling other methods in (the same or other) instances of the same class in my definition of unit test, even though it might technically not be correct. Feel free to correct me, but let's keep it as a side issue.

8条回答
smile是对你的礼貌
2楼-- · 2019-01-21 01:16

I've used DBUnit to take snapshots of records in a database and store them in XML format. Then my unit tests (we called them integration tests when they required a database), can wipe and restore from the XML file at the start of each test.

I'm undecided whether this is worth the effort. One problem is dependencies on other tables. We left static reference tables alone, and built some tools to detect and extract all child tables along with the requested records. I read someone's recommendation to disable all foreign keys in your integration test database. That would make it way easier to prepare the data, but you're no longer checking for any referential integrity problems in your tests.

Another problem is database schema changes. We wrote some tools that would add default values for columns that had been added since the snapshots were taken.

Obviously these tests were way slower than pure unit tests.

When you're trying to test some legacy code where it's very difficult to write unit tests for individual classes, this approach may be worth the effort.

查看更多
够拽才男人
3楼-- · 2019-01-21 01:18

This will probably not answer all your questions, if any, but I made the decision in one project to do unit testing against the DB. I felt in my case that the DB structure needed testing too, i.e. did my DB design deliver what is needed for the application. Later in the project when I feel the DB structure is stable, I will probably move away from this.

To generate data I decided to create an external application that filled the DB with "random" data, I created a person-name and company-name generators etc.

The reason for doing this in an external program was: 1. I could rerun the tests on by test modified data, i.e. making sure my tests were able to run several times and the data modification made by the tests were valid modifications. 2. I could if needed, clean the DB and get a fresh start.

I agree that there are points of failure in this approach, but in my case since e.g. person generation was part of the business logic generating data for tests was actually testing that part too.

查看更多
登录 后发表回答