ActiveRecord migrations best practices

There are certain guidelines I follow when writing migrations using the ActiveRecord migrations library. The guidelines have formed during the years I've been working with Ruby on Rails.

Here is an example migration class that follows all the guidelines reviewed in this post:

class AddAreaToPlots < ActiveRecord::Migration
  class Plot < ActiveRecord::Base; end

  def up
    add_column :plots, :area, :float

    Plot.find_each do |plot|
      plot.area = plot.width * plot.height
      plot.save!
    end
  end

  def down
    remove_column :plots, :area
  end
end

Redefine ActiveRecord classes

If possible, ActiveRecord models should not be used in migrations. Migrations should be simple and failproof transformations of database schema and data. Using complex model objects that complicate the migration should be avoided.

If you have strong SQL skills, you can use SQL for data transformations. Well-written SQL is often faster than using ActiveRecord. For schema changes I would use the ActiveRecord methods, which are readable and give you extra features provided by the framework (most notably the migration/rollback support discussed later in this post).

If you want to use the familiar ActiveRecord API to do the data transformations, the ActiveRecord classes should be redefined. As shown in the example, the Plot class has been redefined inside the migration class. Inside the migration, the redefinition shadows the model class from the application code and gives you a bare ActiveRecord class. This means it has no validations, associations or callbacks. If some of the callbacks should be called in the migration, those can be reimplemented in the redefined class. The same goes for methods in the model class.

Why go through this trouble? When the application changes, the models change with it. It's possible that the models change in a way that breaks the migrations. Or the models can be renamed or removed, which will also break the migrations.

It might feel weird but it just works, makes your migrations more robust, and protects you from trouble in the future.

Use find_each

When modifying records, Model.find_each should be used to fetch the records in batches. If Model.all is used on a big database table, ActiveRecord instantiates objects for all rows in the table, which can consume all the available memory. The process then starts swapping like crazy, slowing the migration down.

By default find_each loads the records in batches of 1000, but the batch size can be configured. The only drawback of find_each is that you can't use your own order, offset or limit. This is because find_each orders the records by primary key and applies its own limit, together with a condition on the last id it has seen, to control the loading of the batches. Filtering the records using where works just fine.
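
To make the batching idea concrete, here is a small database-free sketch in plain Ruby. It is not how ActiveRecord implements find_each; Row and each_in_batches are hypothetical names I made up for the illustration, and the array stands in for a table.

```ruby
# A simplified, in-memory illustration of loading records in id-ordered batches.
# Row and each_in_batches are hypothetical, not part of ActiveRecord.
Row = Struct.new(:id, :width, :height)

# Yields every row to the block and returns the number of batches loaded.
def each_in_batches(rows, batch_size: 1000)
  sorted = rows.sort_by(&:id) # batches are always walked in id order
  last_id = nil
  batches = 0

  loop do
    # Roughly comparable to: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = sorted.select { |row| last_id.nil? || row.id > last_id }.first(batch_size)
    break if batch.empty?

    batches += 1
    batch.each { |row| yield row }
    last_id = batch.last.id
  end

  batches
end

rows = (1..2500).map { |id| Row.new(id, 2.0, 3.0) }
areas = []
batch_count = each_in_batches(rows, batch_size: 1000) do |row|
  areas << row.width * row.height
end

puts batch_count # 3
puts areas.size  # 2500
```

Only one batch of rows is handled at a time, which is why the iteration has to control the ordering and the limit itself.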

Don't change the data using the change method

The change method for migration classes was introduced in Rails 3.1 to make writing migrations simpler. Using change and methods that know how to reverse themselves, you can write migrations like this:

class AddAreaToPlots < ActiveRecord::Migration
  def change
    add_column :plots, :area, :float
  end
end

When the migration is run by ActiveRecord, it executes the "up" version of the command, adding the area column. On rollback it runs the same code and executes the "down" version of the command for all the methods that have one. In the case of adding a column, rollback would remove it.

If you modify your data within the change method, ActiveRecord will run that code on both migration and rollback which is probably not what is wanted. Here is an example:

class AddAreaToPlots < ActiveRecord::Migration
  class Plot < ActiveRecord::Base; end

  def change
    add_column :plots, :area, :float

    Plot.find_each do |plot|
      plot.area = plot.width * plot.height
      plot.save!
    end
  end
end

On rollback the migration would first remove the column and then try to populate it, which would of course fail because the column is missing, breaking the rollback.


In addition to following these guidelines, it doesn't hurt to understand how ActiveRecord migrations work. Read the official Active Record Migrations guide for more information.

Do you have any best practices you follow to make your migrations better? I'd be interested to hear about them in the comments.

Keep track of test suite run duration with Travis CI

I wanted to see how the test suite run duration has been changing over time on a few projects. The tests are run using the pro version of Travis CI. Travis CI is a hosted continuous integration platform that provides an easy-to-configure continuous integration server for your projects. Luckily Travis also has an API that exposes the test build information.

So why should you be interested in test duration?

Test duration is important because if running your tests takes too much time, it builds friction and you run the tests less often. When you don't run your tests, regressions get committed to version control, which slows down your other team members. Also, if you do test-driven development (TDD), the feedback loop provided by the tests gets too slow to be useful.

If you compare test suite durations often enough, you can pinpoint tests that slow the suite down disproportionately to the benefit they provide. When you know a certain test is especially slow, you can think of ways to improve its speed or find another, faster way to test the functionality.

Pulling test suite durations from Travis

Fetching Travis data is easy using Ruby and the official Travis Ruby library. These instructions are for the pro version of Travis which is meant for private repositories. You can also use the same instructions for the open source version of Travis but you need to change the commands and library usage to remove the references to pro.

First install the Travis RubyGem:

gem install travis

Then log in to Travis using it:

travis login --pro --auto

Here is the Ruby file for fetching the data:

require 'travis/pro/auto_login'

repo_name = "your-organization/your-project"

builds = Travis::Pro.repos.find(repo_name).first.builds
finished_builds = builds.select(&:finished?)
durations = finished_builds.map {|build|
  [build.started_at, build.duration, build.commit.sha]
}
File.open("stats.txt", "w") {|f|
  durations.each do |duration|
    f.puts duration.join("\t")
  end
}

The program fetches all finished builds for a repository, collects the start time, duration and commit SHA of each one, and writes them to the file stats.txt as tab-separated lines.

To make the data a bit more readable for large projects, let's skip some data points to get to at most 100:

require 'travis/pro/auto_login'

repo_name = "your-organization/your-project"

builds = Travis::Pro.repos.find(repo_name).first.builds

finished_builds = builds.select(&:finished?)
# Keep the step at 1 or more: each_slice(0) raises ArgumentError when there are no builds
every_n_build = [(finished_builds.size / 100.to_f).ceil, 1].max

smaller_dataset = finished_builds.each_slice(every_n_build).map(&:first)
durations = smaller_dataset.map {|build|
  [build.started_at, build.duration, build.commit.sha]
}

File.open("stats.txt", "w") {|f|
  durations.each do |duration|
    f.puts duration.join("\t")
  end
}
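
To see what the sampling step does to the number of data points, here is the same arithmetic on a plain array. This is only a sketch: sample_evenly is a hypothetical helper name, and the numbers 1 to 250 stand in for builds.

```ruby
# Mirrors the sampling step above on any array.
# sample_evenly is a hypothetical helper, not part of the Travis library.
def sample_evenly(items, max_points = 100)
  return [] if items.empty? # each_slice(0) would raise ArgumentError

  step = (items.size / max_points.to_f).ceil
  # Take the first item of every slice of `step` items
  items.each_slice(step).map(&:first)
end

sample = sample_evenly((1..250).to_a)
puts sample.size             # 84
puts sample.first(3).inspect # [1, 4, 7]
```

With 250 items the step is 3, so every third item is kept and you end up with 84 points. Lists of 100 items or fewer are returned unchanged.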

Now you can use your favorite charting tool to get something visual.
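
If you just want a few quick numbers before charting, something like the following could summarize the file. It is a rough sketch: summarize_durations is a hypothetical helper, the sample lines are made up, and I'm assuming each line has exactly the three tab-separated fields written by the script above.

```ruby
# Parses lines of the form: started_at<TAB>duration<TAB>sha
# summarize_durations is a hypothetical helper; the field layout is assumed
# to match what the script above writes to stats.txt.
def summarize_durations(lines)
  builds = lines.map { |line| line.chomp.split("\t") }
                .select { |fields| fields.size == 3 } # skip malformed lines
                .map { |started_at, duration, sha|
                  { started_at: started_at, duration: duration.to_f, sha: sha }
                }
  return nil if builds.empty?

  durations = builds.map { |build| build[:duration] }
  {
    count: builds.size,
    average: durations.sum / durations.size,
    slowest: builds.max_by { |build| build[:duration] }
  }
end

# With the real file: summarize_durations(File.readlines("stats.txt"))
sample_lines = [
  "2014-01-05 10:00:00 UTC\t300\tabc123\n",
  "2014-01-06 10:00:00 UTC\t420\tdef456\n",
  "2014-01-07 10:00:00 UTC\t360\t0a1b2c\n"
]

summary = summarize_durations(sample_lines)
puts summary[:average]       # 360.0
puts summary[:slowest][:sha] # def456
```

The commit SHA of the slowest build gives you a starting point for finding out which change slowed the suite down.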

Here is a screenshot of data from one of my apps, created using Numbers.app. I made it by opening the file in Numbers.app, selecting the data and choosing a bar chart from the Chart menu.

Travis CI build statistics

Guest at the Frontend Friday podcast

If you understand Finnish, you can enjoy an episode of the Frontend Friday podcast with the theme of Hypermedia APIs. I was happy to be invited to discuss what Hypermedia APIs are, why I find them interesting, how you would start implementing them, and other Hypermedia API related topics.

You can download the episode and view the show notes at the Frontend Friday website.

Exercise is a canary in a coal mine for the body

I've noticed that for me exercise works as a canary in a coal mine for the body. When the body is getting overstressed, it is first noticeable during exercise.

Things like flu, small aches and stiffness in the body are normal in human life. Urban life and stationary jobs strain the body. Usually these smaller things pass on their own and you don't pay too much attention to them. But sometimes they are a signal of something bigger: a disease that needs treatment, or overstress that you should react to.

For me exercise works great for picking out the signals that are meaningful and need to be reacted to. Exercise stresses the body in a different way than home and work life. I do CrossFit twice a week and have been doing so for three years. If I don't feel normal for many workouts in succession, it is usually a sign that something other than the normal everyday things is going on. I notice it from getting out of breath easily and from muscle pain after workouts being greater and taking longer to pass.

I also follow my workout results (using our own workout journal app WODconnect). My memory for results is horrible, so having a place for the results that makes them comparable over time is very valuable. Sometimes not being able to get the results I have previously been capable of is also a sign of something more severe (sometimes it is just being tired from previous workouts or something else).

Listening to your body is never stupid but it is especially important when exercising.

Reading list 2013

2013 was a good reading year for me and I thought I'd share the books that influenced me the most during the year.

Overall I read 55 books (of which 10 were comic books). I use GoodReads to track my reading so you can view my full reading list there and even follow my "reviews" if you are really interested.

The books

So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love by Cal Newport

The title sounds rather self-helpish, but reading the author's Study Hacks blog made me interested. Newport has been studying remarkable careers, and the book formulates rules based on that.

Perhaps the most controversial bit is the idea that you should not follow your passion but start learning valuable skills. Newport says that passion is dangerous because having passion is so rare that treating it as a precursor for a great career is depressing and confusing. After you have advanced skills and knowledge in an area, the passion will come automatically. Waiting for the passion to appear is not an effective strategy. Working right trumps finding the right work.

The book offers models for improving your skills, steering your work choices to maximize control, and formulating an underlying mission for your career. I had a bunch of light bulb moments with this book and would recommend it to anyone pondering where she is going next with her work life.

Buddhism Plain and Simple by Steve Hagen

I've been reading some Buddhist material during the past few years, and I found this book to be a great summary of many Buddhist ideas. It is stripped of any extra layers and presents the ideas in a clear way. I would recommend it to anyone interested in Buddhism, whatever their level of knowledge of the subject.

Surely You're Joking, Mr. Feynman! by Richard P. Feynman

Great stories from one of my favorite scientists. I had watched videos and read stories about Feynman earlier, but this was the first book I read about him. I also read the second book, What Do You Care What Other People Think?, but didn't enjoy it as much because it was a more fragmented collection of random events from his life.

Design Is a Job by Mike Monteiro

Mike Monteiro has a unique speaking and writing voice, which I enjoy. The book is about what he has learned from running his design consultancy, Mule Design. It has good points that I can apply to my own work, especially on client relationships. I would recommend it to anyone running a business that serves clients with professional services.

Thinking, Fast and Slow by Daniel Kahneman

The book is a summary of research done by the author in the area of cognitive biases, and of his work on prospect theory and on happiness. The underlying idea of most of Kahneman's research is the separation of two modes of thinking: a faster, intuitive system and a slower, logical system. He goes through how the systems work in unison and what kind of behavioral patterns emerge from that.

The book is a bit long for my taste, but it is easy to follow, and I think knowing about the biases of our own thinking is beneficial to everyone.

Comics

Fok_It by Joonas Rinta-Kanto (books 1-4)

Only available in Finnish. I enjoyed the whole available catalogue of Fok_It albums. It is just my kind of humour.

Habibi by Craig Thompson

A love story set in the Islamic world. The visual design is really great, and with over 600 pages you get a lot to see. Anyone who enjoys great illustration and graphical storytelling should grab this one.

Extra pick

DJ-kirja by Matti Nives & Iina Esko

I actually finished reading this in 2014, but I'll include it anyway. This one is also only available in Finnish. The book includes essays about DJ culture, interviews with DJs and a section on club flyers from a Finnish point of view. As a music nerd I really enjoyed all the viewpoints and stories from familiar faces.


Next year I'm going to aim for 50 books read and try to start a habit of writing notes or mini-reviews of the books I read. It takes some work, but in the long run you get quite a nice library of good notes you can read to remind you of the important points. For an example, check out Derek Sivers' book notes with short summaries and longer notes.
