Workflow for Using Django South with Multiple Code Branches

Posted 2019-03-07 19:21

Question:

I'm curious as to how other Django devs manage their database migrations with South when developing with multiple code branches. Let me give a sample scenario.

Say, for example, you start your development on the main trunk. You create Branch A from the trunk. At this point, the last migration version for app_1 is 0010.

Then you create a migration for app_1 in the trunk that creates a migration file 0011_add_name_column. Meanwhile, in branch A, another developer creates a different migration file for the same app_1 in branch A: 0011_increase_value_column_size.

Branch A eventually gets merged back into trunk. When this happens, say the last migration version for app_1 in Branch A is 0020 and in the trunk it is 0018, and they're all different migrations. As you can see, the state of the migration files has been messed up since version 0011, when the branch was forked from trunk, and they all conflict upon merging.

According to South's tutorial, the only way to handle this case is to manually resolve all the conflicts. However, this is not really a desirable solution if the number of conflicts is significant. How do you typically handle this situation, or avoid it in the first place?

Answer 1:

Well, the answer to this is not very straightforward.

TL;DR update: In most cases, if we are talking about a Trunk <-> Branch workflow, I would probably:

  1. "Compress" new migrations from Branch A into a single migration (or as few as possible).
  2. Merge all trunk changes/migrations to Branch A.
  3. Rename Branch A migrations to 0019 and so on.
  4. Now merge to trunk.
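
Step 3 can be planned mechanically. Here is a minimal sketch, assuming migration files follow South's usual NNNN_label.py naming; the function name plan_renames and the sample filenames are hypothetical, and it only computes the renames without touching any files:

```python
import re

# South-style migration filenames: four digits, underscore, label, ".py".
MIGRATION_RE = re.compile(r"^(\d{4})_(.+)\.py$")


def plan_renames(trunk_files, branch_files):
    """Return (old_name, new_name) pairs that renumber branch-only
    migrations so they follow the last trunk migration."""
    trunk = [f for f in trunk_files if MIGRATION_RE.match(f)]
    branch = [f for f in branch_files if MIGRATION_RE.match(f)]
    last_trunk = max(int(MIGRATION_RE.match(f).group(1)) for f in trunk)

    # Only migrations that exist in the branch but not in trunk need to move.
    branch_only = sorted(f for f in branch if f not in set(trunk))

    renames = []
    for offset, old in enumerate(branch_only, start=1):
        label = MIGRATION_RE.match(old).group(2)
        renames.append((old, "%04d_%s.py" % (last_trunk + offset, label)))
    return renames


if __name__ == "__main__":
    trunk = ["__init__.py", "0010_initial_state.py", "0011_add_name_column.py"]
    branch = ["__init__.py", "0010_initial_state.py",
              "0011_increase_value_column_size.py"]
    for old, new in plan_renames(trunk, branch):
        print("%s -> %s" % (old, new))
```

Keep in mind that South records applied migrations by name in its history table, so any database that already ran the branch migrations under their old names would need that history adjusted as well.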

More detail

First of all, it doesn't matter if you have multiple migrations with the same prefix (e.g. 0011) from merging different branches, as long as they don't modify the same models. Then you can simply run migrate with the --merge option to apply out-of-order migrations.
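
To see which prefixes collide after a merge, something like the following could be used. It is only a sketch: find_number_collisions is a hypothetical helper, and it assumes you pass it the filenames from an app's migrations directory. It cannot tell whether the colliding migrations touch the same models, so that still has to be checked by hand.

```python
from collections import defaultdict


def find_number_collisions(filenames):
    """Group migration filenames by numeric prefix and return only the
    prefixes that are used by more than one file."""
    by_number = defaultdict(list)
    for name in filenames:
        prefix = name.split("_", 1)[0]
        # Skips __init__.py and anything else without a numeric prefix.
        if prefix.isdigit():
            by_number[prefix].append(name)
    return {number: sorted(names)
            for number, names in by_number.items()
            if len(names) > 1}


if __name__ == "__main__":
    merged = [
        "__init__.py",
        "0010_initial_state.py",
        "0011_add_name_column.py",
        "0011_increase_value_column_size.py",
    ]
    for number, names in sorted(find_number_collisions(merged).items()):
        print(number, names)
```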

But if you have two different "migration paths" from 0011 -> 0018 and 0011 -> 0020 for the same app, even if they don't touch the same models, that's not very pretty. I think it's likely that either:

  1. Those branches have been separated for a very long time and there's a large disparity in the 2 schemas. There are 2 possible situations here:

    • They don't touch the same models (i.e. they don't "intersect"): You can just apply them out of order with --merge. However, it's very likely that the affected models would be better off in 2 separate apps.

    • They do touch the same models (which I assume is probably your case): I have to concur with @chrisdpratt here that it's best to avoid this situation altogether by coordinating/splitting up work better. But even in this case, especially if you have only schema migrations and the two branches don't contain obviously conflicting ones (a silly example would be adding a field with the same name to the same model in 2 different migrations), it's pretty likely that you can apply the migrations (or at least most of them) out of order with --merge without problems. In many cases the order of the schema migrations won't matter even if they affect the same model, but you do need to check this manually. When the order is an issue, you'll just have to change their numbering; there's no automatic way around it.

  2. You generate a new schema migration for every little schema change: There's nothing wrong with this during development, but once the feature is complete (ready for merging) the migrations should be "compressed" into a single migration (or at least fewer migrations, if there are a lot of changes with some logical grouping, or if you also have data migrations). On a dev environment that is already on the latest migration, it's simple to just do:

    • ./manage.py migrate myapp 0010 --fake
    • delete migrations 0011-0018
    • ./manage.py schemamigration myapp schema_changes_for_new_feature_x --auto
    • ./manage.py migrate myapp 0011 --fake --delete-ghost-migrations
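
The same sequence can be generated for any app and fork point. This is a rough sketch that only builds the command strings from the steps above; squash_commands and its parameters are hypothetical names, and nothing is executed or deleted:

```python
def squash_commands(app, base, label):
    """Build the command sequence for compressing every migration after
    number `base` into one new migration named `label`."""
    base_id = "%04d" % base
    new_id = "%04d" % (base + 1)
    return [
        # Mark the database as being at the fork point without running SQL.
        "./manage.py migrate %s %s --fake" % (app, base_id),
        # Manual step: remove the migration files being replaced.
        "# delete the migration files numbered above %s by hand" % base_id,
        # Let South diff the frozen models against the current ones.
        "./manage.py schemamigration %s %s --auto" % (app, label),
        # Record the new migration as applied; the schema already matches.
        "./manage.py migrate %s %s --fake --delete-ghost-migrations" % (app, new_id),
    ]


if __name__ == "__main__":
    for line in squash_commands("myapp", 10, "schema_changes_for_new_feature_x"):
        print(line)
```

Printing the commands instead of running them leaves room to review each step first, which matters here because --fake changes the recorded migration state without changing the schema.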

Another thing: after merging two branches with different migrations, you'll often need to create a "mergefix" schema migration (with empty forwards/backwards methods) to record the combined state in the "frozen" models (otherwise South will think there are outstanding schema changes).



Answer 2:

My answer has been to not commit migrations when possible. Migrations can always be regenerated if lost, so assuming no one but me needs to run my branch, I just don't commit my migrations until the very end.

Short of that, the best method I've found is to simply treat them as merge conflicts. When you merge back into trunk, check your migration folder(s), and independently resolve each numbering conflict by renaming your migrations to higher numbers.

Granted, neither method is ideal, but there aren't a whole lot of options on this front. South's own advice on the matter is to not develop in a vacuum. Merge often and communicate with the other developers you're working with.

South is no substitute for team coordination [...] Make sure your team know who is working on what, so they don’t write migrations that affect the same parts of the DB at the same time.

While that advice may sound frustrating on the surface, in reality, the principle is right. More than just concerning migrations, it's never really a good idea to have multiple developers working on the same bit of the system at the same time. Assign similar tasks to the same developer who is already working on that piece of the system, and you won't have any problems.