ActiveRecord migrations best practices

02 NOV 2014

There are certain guidelines I follow when writing migrations using the ActiveRecord migrations library. These guidelines have formed over the years I’ve been working with Ruby on Rails.

Here is an example migration class that follows all the guidelines reviewed in this post:

class AddAreaToPlots < ActiveRecord::Migration
  class Plot < ActiveRecord::Base; end

  def up
    add_column :plots, :area, :float

    Plot.find_each do |plot|
      plot.area = plot.width * plot.height
      plot.save!
    end
  end

  def down
    remove_column :plots, :area
  end
end

Redefine ActiveRecord classes

If possible, ActiveRecord models should not be used in migrations. Migrations should be simple and failproof transformations of database schema and data. Using complex model objects that complicate the migration should be avoided.

If you have strong SQL skills, you can use SQL for the data transformations. Well-written SQL is often faster than using ActiveRecord. For schema changes I would use the ActiveRecord methods, which are readable and give you the extra features provided by the framework (most notably the migration/rollback support discussed later in this post).

If you want to use the familiar ActiveRecord API to do the data transformations, the ActiveRecord classes should be redefined. As shown in the example, the Plot class has been redefined inside the migration class. Within the migration, the redefined class is used instead of the model class from the application code, and it is a bare ActiveRecord class. This means it has no validations, associations or callbacks. If some of the callbacks should be called in the migration, those can be reimplemented in the redefined class. The same goes for methods in the model class.
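To see why the redefinition does not touch the application's model, here is a minimal plain-Ruby sketch of the constant lookup involved. It runs without Rails; the classes are stand-ins and the description method is a hypothetical helper added only for illustration.

```ruby
# Stand-in for the application's model (no ActiveRecord needed for this sketch).
class Plot
  def self.description
    "application model with validations and callbacks"
  end
end

# Stand-in for the migration class.
class AddAreaToPlots
  # This defines a new constant, AddAreaToPlots::Plot. It does not reopen
  # the top-level Plot; it only shadows it inside this class.
  class Plot
    def self.description
      "bare class local to the migration"
    end
  end

  # Inside the migration, the name Plot resolves to the nested class.
  def model_used
    Plot.description
  end
end

puts AddAreaToPlots.new.model_used
puts Plot.description
```

The first line printed comes from the nested class and the second from the top-level one, so code outside the migration keeps seeing the original model.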

Why go through this trouble? When the application changes, the models change with it. It’s possible that the models change in a way that breaks the migrations. The models can also be renamed or removed, which will break the migrations as well.

It might feel weird but it just works, makes your migrations more robust, and protects you from trouble in the future.

Use find_each

When modifying records, Model.find_each should be used to fetch the records in batches. If Model.all is used to iterate over a big database table, ActiveRecord will instantiate objects for all rows in the table at once. That can consume all the available memory, after which the process starts swapping heavily and the migration slows to a crawl.

By default find_each loads the records in batches of 1000, but that can be configured with the batch_size option. The only drawback of find_each is that you can’t use order, offset or limit. This is because find_each orders the records by primary key and uses its own primary key condition and limit to control the loading of the batches. Other than these caveats, there are no drawbacks to using find_each. Filtering the records using where works just fine.
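The batching idea can be sketched in plain Ruby. This is not Rails' actual implementation, only an illustration of loading rows in primary-key order, one batch at a time; each_in_batches and fetch_batch are hypothetical helpers, and an in-memory array stands in for the table.

```ruby
# In-memory stand-in for the plots table, deliberately in random order.
ROWS = (1..10).map { |id| { id: id, width: id, height: 2 } }.shuffle

# Plays the role of a query like:
#   SELECT * FROM plots WHERE id > after_id ORDER BY id LIMIT batch_size
def fetch_batch(rows, after_id, batch_size)
  rows.select { |row| row[:id] > after_id }
      .sort_by { |row| row[:id] }
      .first(batch_size)
end

# Yields every row, but only holds one batch in memory at a time.
def each_in_batches(rows, batch_size: 1000)
  last_id = 0
  loop do
    batch = fetch_batch(rows, last_id, batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    last_id = batch.last[:id] # the next batch starts after this id
  end
end

seen = []
each_in_batches(ROWS, batch_size: 3) { |row| seen << row[:id] }
puts seen.inspect
```

Because the helper needs to control the ordering and the batch limit itself, a caller-supplied order or limit would conflict with it, which is the same kind of restriction described above.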

Don’t change the data using the change method

The change method for migration classes was introduced in Rails 3.1 to make writing migrations simpler. Using change with methods that know how to reverse themselves, you can write migrations like this:

class AddAreaToPlots < ActiveRecord::Migration
  def change
    add_column :plots, :area, :float
  end
end

When the migration is run by ActiveRecord, it executes the “up” version of the command, adding the area column. On rollback it runs the same code but executes the “down” version of each command that has one. In the case of adding a column, rollback would remove it.
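One way to picture the reversal is a recorder that notes which schema commands were called and then produces their inverses in reverse order. The sketch below is a toy in plain Ruby under that assumption; ToyRecorder is a hypothetical class, and Rails' real implementation is more involved.

```ruby
# Toy recorder: remembers schema commands instead of executing them.
class ToyRecorder
  attr_reader :commands

  def initialize
    @commands = []
  end

  def create_table(*args)
    @commands << [:create_table, args]
  end

  def add_column(*args)
    @commands << [:add_column, args]
  end

  # Returns the "down" commands: each recorded command is inverted,
  # and the whole list is reversed so the last change is undone first.
  def inverse
    @commands.reverse.map do |name, args|
      case name
      when :add_column   then [:remove_column, args.first(2)] # table, column
      when :create_table then [:drop_table, args.first(1)]    # table
      end
    end
  end
end

recorder = ToyRecorder.new
recorder.create_table :plots
recorder.add_column :plots, :area, :float

p recorder.inverse
# prints [[:remove_column, [:plots, :area]], [:drop_table, [:plots]]]
```

Only commands with a known inverse can be handled this way, which is why arbitrary Ruby code inside change has no automatic “down” version.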

If you modify your data within the change method, ActiveRecord will run that code on both migration and rollback, which is probably not what you want. Here is an example:

class AddAreaToPlots < ActiveRecord::Migration
  class Plot < ActiveRecord::Base; end

  def change
    add_column :plots, :area, :float

    Plot.find_each do |plot|
      plot.area = plot.width * plot.height
      plot.save!
    end
  end
end

On rollback the migration would first remove the column and then try to populate it, which would of course fail because the column is missing, breaking the rollback.

In addition to following these guidelines, it doesn’t hurt to understand how ActiveRecord migrations work. Read the official Active Record Migrations guide to get more information.

Do you have any best practices you follow to make your migrations better? I’d be interested to hear about those in the comments.

Addition: Migrations should not contain seed data

Seed data refers to the initial data the database must have for the application to function correctly at all. Rails provides a mechanism for populating the database with seed data, which should be used. Migrations are meant only to provide the structure of the database, not the contents.

If migrations are used for seed data, there is a chance that a database is created without the seed data. When the incorrect migrations get older and a new database is bootstrapped using rake db:schema:load, the db/schema.rb file is used to get the right structure, but the data from the migrations won’t be created in the database.

Thanks to Christian Hellsten for the addition!

Vesa Vänskä
