Changing Database schemas & unit tests

Published 2019-06-19 02:37

Question:

     Before we start: I know a fair few people consider tests that hit the database not to be "unit tests". Maybe "integration tests" would be a better name. Either way, these are developer tests that hit the database.

     To enable unit testing I have a developer-local database which I clear and then populate with a known set of data at the start of each test using dbUnit. This all works well enough until a table used by a test changes in some way and I have to manually update all the XML datasets, which is a pain. I figure other people must have hit the same problem and hopefully found a nice, neat solution to it. So for tests that require populating a database, what do you use, and how do you handle table definitions changing? (While I use Java, I am open to solutions using different technologies.)

EDIT: To clarify a little. I have a contrived test like:

void testLoadRevision() {
    database.clear(); // Clears every table dbUnit knows about.
    database.load("load/trevision.xml", "load/tissue.xml");
    SomeDatabaseThingie subject = new SomeDatabaseThingie(databaseProvider);
    Revision actual = subject.load();
    assertEquals(expected, actual); // JUnit-style comparison
}

In that I have two tables - tRevision and tIssue. A loaded revision uses a small amount of data from tIssue. Later on, tIssue acquires a new field that revisions do not care about. As the new field is "not null" and has no sensible default, this test will fail because tIssue.xml will be invalid.
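
In DbUnit's flat XML format, the breakage looks roughly like this (table name from the question; the `severity` column and its values are a made-up example of the new not-null field):

```xml
<!-- load/tissue.xml before the schema change -->
<dataset>
    <tIssue id="1" title="Sample issue"/>
</dataset>

<!-- after tIssue gains a not-null severity column, every row needs the
     new attribute or the insert fails -->
<dataset>
    <tIssue id="1" title="Sample issue" severity="0"/>
</dataset>
```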

Small changes like this are not too hard to fix by editing tIssue.xml. But when the number of XML files starts to balloon with each flow, it becomes a large amount of work.
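
One way to take some of the sting out of that is a small script that rewrites every flat XML dataset whenever a column is added. A minimal sketch in plain Java (DOM only, no dbUnit dependency; the class name and the table/column names in `main` are illustrative):

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DatasetPatcher {

    /** Adds column=defaultValue to every row of the given table that lacks it. */
    public static String addColumn(String datasetXml, String table,
                                   String column, String defaultValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(datasetXml)));
            NodeList rows = doc.getElementsByTagName(table);
            for (int i = 0; i < rows.getLength(); i++) {
                Element row = (Element) rows.item(i);
                if (!row.hasAttribute(column)) { // keep rows that already set a value
                    row.setAttribute(column, defaultValue);
                }
            }
            StringWriter out = new StringWriter();
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String before = "<dataset><tIssue id=\"1\" title=\"NPE on save\"/></dataset>";
        // Give every tIssue row a default for the new not-null column.
        System.out.println(addColumn(before, "tIssue", "severity", "0"));
    }
}
```

Run over each dataset file, this keeps existing attribute values intact and only fills in the missing column, so hand-edited rows are not overwritten.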

Cheers,
    mlk

Answer 1:

Well, as I see it, it is a matter of combining what is already there.

The scenario described above:

  1. Write a database migration
  2. Apply the database migration (manually or automatically at the start of the test run)
  3. Watch your tests break due to a constraint violation (not null)

You could extend it so that you have a small program that does the following:

  1. Populate the database with the DbUnit XML
  2. Apply the database migration
  3. Extract the contents of your database in place into your DbUnit XML (and optionally also the DTD); see DbUnit Home Page -> DbUnit FAQ -> "How to extract a flat XML dataset from my database?"
  4. Check your updated DbUnit XML (and DTD) into source control.

For applying the migration, I heartily recommend Flyway. It supports both Sql (with placeholder replacement) and Java-based migrations. You can then apply the migrations using the Maven Plugin or programmatically using the API. The latter fits this case perfectly.
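
A minimal sketch of the programmatic route, assuming Flyway's fluent configuration API (Flyway 5.x+) and the `flyway-core` dependency; the JDBC URL, credentials, and migration location are placeholders, and this needs a reachable database to actually run:

```java
import org.flywaydb.core.Flyway;

public class MigrateBeforeTests {
    public static void main(String[] args) {
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:h2:mem:testdb", "sa", "") // placeholder connection
                .locations("classpath:db/migration")        // where the Sql/Java migrations live
                .load();
        flyway.migrate(); // applies any pending migrations
    }
}
```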

The complete workflow then becomes:

  1. Write your database migration
  2. Execute your DbUnitXmlDtdUpdater program
  3. Watch your unit tests pass
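
Steps 1 and 3 of the updater can be sketched with DbUnit's own API, roughly following the FAQ entry mentioned above. This assumes the dbUnit dependency and an open JDBC connection; `openJdbcConnection()` and the file names are hypothetical:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;

public class DbUnitXmlDtdUpdater {
    public static void main(String[] args) throws Exception {
        IDatabaseConnection connection =
                new DatabaseConnection(openJdbcConnection()); // hypothetical helper

        // 1. Populate the database from the existing dataset.
        IDataSet oldData = new FlatXmlDataSetBuilder()
                .build(new FileInputStream("load/tissue.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(connection, oldData);

        // 2. Apply the migration here (e.g. via Flyway's API).

        // 3. Extract the migrated contents back into flat XML.
        FlatXmlDataSet.write(connection.createDataSet(),
                new FileOutputStream("load/tissue.xml"));
    }
}
```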

Happy days,

Axel

Disclaimer: I am one of Flyway's developers.



Answer 2:

I think the answer to this question comes in two phases:

There is only one authoritative definition of the schema

There should be only one definition of what the database looks like. In normal cases, I prefer to have a SQL DDL script that specifies the schema for the database.

The unit tests should use the same authoritative definition of the database schema as the application uses, and it should create the database based on that definition before the test run and remove it completely again after the test run.

That said, tooling may get out of sync with the schema, and you will need to manually update the tool-generated artifacts. For example, I use Entity Framework for .NET, which auto-generates classes based on the database schema. When I change the schema, I need to manually tell my tool to update these classes. It's a pain, but I'm not aware of any way around it unless the tooling supports automation.

Each test should start with empty data

Each test should start with the database without any data. Every test should populate only the data it needs to execute the test, and when it is done, it should clean out the database again.
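
With the question's helper, that pattern is roughly the following (a JUnit 4 sketch; `database`, the file names, and the test body come from the question or are hypothetical):

```java
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class RevisionLoadTest {

    @Before
    public void setUp() {
        database.clear();                   // start from an empty schema
        database.load("load/trevision.xml", // only the rows this test needs
                      "load/tissue.xml");
    }

    @After
    public void tearDown() {
        database.clear();                   // leave nothing behind for the next test
    }

    @Test
    public void loadsRevision() {
        // ... exercise SomeDatabaseThingie as in the question ...
    }
}
```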

What you are currently doing sounds like an anti-pattern called General Fixture, where you try to pre-load a set of data that represents as broad a set of scenarios as possible. However, it makes it very hard to test mutually exclusive conditions, and may also lead to Test Interdependence if you modify this pre-loaded data in some tests.

This is really well explained in the excellent book xUnit Test Patterns.



Answer 3:

I have the same issue of dbUnit flat XML files getting out of sync when database schema evolutions require data changes (even for things as simple as the addition of mandatory columns).

While transforming all the XML files with some hand-written scripts is an option, I still think the issue should be solved at a different abstraction level, more like the way one handles live data: evolutionary database design.

Database migration tools already know about delta scripts, so having a sort of dbunit adapter would be great.

I found the following blog entry covering the issue: http://blog.liquibase.org/2007/06/unit-testing-the-database-access-layer.html

To solve the problem of keeping test data definitions from getting out of sync with the schema, you need to have your test data built up along with your database, so it will be modified by database refactorings that were made after it was initially created. [..] By including the test data with your database changes, the data is automatically kept up to date in the same way the production data would be. Using this technique over a dataset per method also has the advantage of performing better, because the data is only inserted once,...

but he adds:

but it has the disadvantage that you need to handle all the test data that any method would want in one place.

...which in turn is not possible for more complex scenarios, I guess. He goes on to say:

To facilitate this technique, I built the idea of execution contexts into LiquiBase so you can mark the test data changes and only apply them in environments where you run unit tests. So far I have been happy with the results. The tests fail when there is a differences between the database schema and what the code is expecting or when there is a bug in my SQL and I haven’t lost any tests due to database refactorings.

Here's the link: www.liquibase.org/manual/contexts - but it's not quite what I want: although I would be fine with exposing my test data to a db migration tool, I would still like to keep it very close to the database test.
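
For reference, marking test data with a Liquibase execution context looks roughly like this (a changelog fragment; the id, author, and column values are made up, and the table name is taken from the question):

```xml
<changeSet id="add-test-data" author="mlk" context="test">
    <insert tableName="tIssue">
        <column name="id" valueNumeric="1"/>
        <column name="title" value="Sample issue"/>
        <column name="severity" valueNumeric="0"/>
    </insert>
</changeSet>
```

Running an update with `--contexts=test` includes such changeSets, while runs invoked with a different context (e.g. `--contexts=production`) skip them.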

Thoughts anyone?