Sometimes, data migrations are required. As time passes, code changes and migrations using your domain model are no longer valid and migrations fail. What are the best practices for migrating data?
I tried to make an example to clarify the problem:
Consider this. You have a migration:
class ChangeFromPartnerAppliedToAppliedAt < ActiveRecord::Migration
  def up
    User.all.each do |user|
      user.applied_at = user.partner_application_at
      user.save
    end
  end
end
This runs perfectly fine, of course. Later, you need a schema change, and you add a callback to the model that uses the new column:
class AddAcceptanceConfirmedAt < ActiveRecord::Migration
  def change
    add_column :users, :acceptance_confirmed_at, :datetime
  end
end
class User < ActiveRecord::Base
  before_save :do_something_with_acceptance_confirmed_at
end
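For you, no problem. It runs perfectly. But if your coworker pulls both of these today, not having run the first migration yet, he'll get this error when running the first migration, because it now loads the current User model, whose callback expects a column that does not exist yet: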
rake aborted!
An error has occurred, this and all later migrations canceled:
undefined method `acceptance_confirmed_at=' for #<User:0x007f85902346d8>
That's not being a team player: he'll be fixing the bug you introduced. What should you have done?
Best practice is: don't use models in migrations. Migrations change the way AR maps a model to its table, so do not use models in them at all. Do it all with SQL. That way the migration does not depend on the current state of your model code.
This:
User.all.each do |user|
  user.applied_at = user.partner_application_at
  user.save
end
I would write like this:
update "UPDATE users SET applied_at = partner_application_at"
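This is a perfect example for the "Using Models in Your Migrations" section of the Rails Guides: define a minimal local model inside the migration, so the migration does not depend on the application's User class.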
class ChangeFromPartnerAppliedToAppliedAt < ActiveRecord::Migration
  class User < ActiveRecord::Base
  end
  def up
    User.all.each do |user|
      user.applied_at = user.partner_application_at
      user.save
    end
  end
end
Edited after Mischa's comment, replacing the per-record loop with a single update_all:
class ChangeFromPartnerAppliedToAppliedAt < ActiveRecord::Migration
  class User < ActiveRecord::Base
  end
  def up
    User.update_all('applied_at = partner_application_at')
  end
end
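Why the nested class helps: Ruby resolves constants lexically, so inside the migration class body the name User refers to the nested stub rather than the application's model, and none of the app's callbacks get loaded. Here is a minimal plain-Ruby sketch of that lookup. It does not use ActiveRecord at all; both classes are hypothetical stand-ins, and the raise only imitates the failing before_save callback from the question.

```ruby
# Stand-in for app/models/user.rb as it looks today (assumption: plain Ruby,
# not ActiveRecord). Its save path imitates the before_save callback that
# touches a column which a later migration adds.
class User
  def save
    raise NoMethodError, "undefined method `acceptance_confirmed_at='"
  end
end

# Hypothetical stand-in for the migration class.
class ChangeFromPartnerAppliedToAppliedAt
  # Local stub model. Inside this class body, the constant `User` resolves
  # to ChangeFromPartnerAppliedToAppliedAt::User, not to the top-level ::User.
  class User
    def save
      true # no callbacks or validations from the application model
    end
  end

  def up
    User.new.save
  end

  # Exposes which class the bare constant `User` resolves to here.
  def model_used
    User
  end
end

migration = ChangeFromPartnerAppliedToAppliedAt.new
puts migration.model_used # prints ChangeFromPartnerAppliedToAppliedAt::User
puts migration.up         # prints true
```

In a real migration the stub inherits from ActiveRecord::Base as shown above, so it still reads and writes the users table; it just carries none of the behaviour defined in app/models/user.rb.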
Sometimes 'migrating data' cannot be performed as part of a schema migration, as discussed above. Sometimes 'migrating data' means 'fix historical data inconsistencies' or 'update your Solr/Elasticsearch index', so it's a complex task. For these kinds of tasks, check out this gem: https://github.com/OffgridElectric/rails-data-migrations
This gem was designed to decouple Rails schema migrations from data migrations, so that data migrations won't cause downtime at deploy time and are easier to manage overall.