Add a #populate method to migrations #31082

pedantic-git · 2017-11-07T13:05:51Z

This is a feature suggestion, and my first time contributing to the core, so I'm very open to comments about whether this is a good idea, etc.

I often find myself writing migrations that look something like this:

class AddPublishedToPosts < ActiveRecord::Migration[5.2]
  def change
    add_column :posts, :published, :boolean, default: false
    reversible do |dir|
      dir.up { Post.update_all(published: true) }
    end
  end
end

That is, I'm using one half of a #reversible block to prepopulate the existing records with appropriate values for the new column. (In the example above, we assume all existing posts are already published, but new posts are unpublished by default.)

It doesn't seem very Railsy because it's not really a reversible operation - it's an operation that only happens on the way up and is irrelevant on the way down because the column ceases to exist.

So this PR adds a new #populate method for this use case which simplifies the above migration slightly to:

class AddPublishedToPosts < ActiveRecord::Migration[5.2]
  def change
    add_column :posts, :published, :boolean, default: false
    populate { Post.update_all(published: true) }
  end
end

Thoughts?

rails-bot · 2017-11-07T13:05:55Z

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @sgrif (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

pedantic-git · 2017-11-07T14:03:03Z

Looks like the MySQL iterations of the test are failing for some reason I can't quite discern. I suspect it's an error in how I've coded the test, rather than an error in the functionality.

rafaelfranca · 2017-11-07T17:05:23Z

Thank you for the pull request. Although the code is good, I'm not very inclined to advertise doing data migrations in the same migration as a structure change. Data migrations are problematic, and I wouldn't do them at the same time as a structure change. If anything goes wrong, for example when using PostgreSQL, your entire migration will be reverted. Another problem is that if only part of the records are updated, the migration will not finish, and you will have to edit an existing migration to make it work again.

rafaelfranca · 2017-11-07T17:05:47Z

@tenderlove @matthewd @jeremy thoughts?

pedantic-git · 2017-11-07T17:24:45Z

@rafaelfranca That's an interesting point, although of course in my common use case it's fine because this is primarily used for populating a column that has been created in the same migration (so a rollback will just erase the whole column).

Is there a more conventional way to do data migrations? As far as I knew, there's only one kind of migration available in Active Record so I use it for both schema and data.

rafaelfranca · 2017-11-07T18:37:17Z

Yeah, right now Active Record doesn't have any other way to do it. My problem is not with doing it in the same migration as the structure change is being made. At Shopify, for example, we wrote a new framework to do data migrations to avoid the kind of problem I mentioned, but before that framework existed we avoided doing data migrations in the same migration as a structure migration.

jeremy · 2017-11-07T18:44:13Z

IMO the reversible block is an appropriate level of ceremony to capture this data migration. It's clear what's happening and why. Introducing a populate method encourages data migrations without a complete story to substantiate it. Alternatively, we could introduce a shortcut "on-up-only" method to simplify the reversible block, e.g. up { … } and down { … } that do def up(&b) reversible { |dir| dir.up(&b) } end.
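A minimal sketch of the shortcut described above, written against a toy stand-in class rather than ActiveRecord::Migration. The names ToyMigration, Direction, log, and migrate are hypothetical and exist only for illustration; the real reversible also deals with command inversion and ordering, which this toy ignores.

```ruby
# Toy stand-in for a migration; NOT ActiveRecord::Migration.
class ToyMigration
  # Object yielded by #reversible. Each method runs its block only when
  # the migration is travelling in the matching direction.
  Direction = Struct.new(:reverting) do
    def up
      yield unless reverting
    end

    def down
      yield if reverting
    end
  end

  attr_reader :log

  def initialize
    @log = []
    @reverting = false
  end

  # Stand-in for a schema command: just records what would happen.
  def add_column(*)
    log << (@reverting ? "remove_column" : "add_column")
  end

  def reversible
    yield Direction.new(@reverting)
  end

  # The shortcut from the discussion: only the "up" half of #reversible.
  def up_only(&block)
    reversible { |dir| dir.up(&block) }
  end

  def change
    add_column :posts, :published, :boolean, default: false
    up_only { log << "backfill" }
  end

  # Runs #change in the given direction and returns the recorded steps.
  def migrate(direction)
    @reverting = (direction == :down)
    @log = []
    change
    log
  end
end
```

In this toy, migrating up records the schema step followed by the backfill, while migrating down records only the inverse schema step, because the up-only block is skipped.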

pedantic-git · 2017-11-07T20:54:49Z

I'm certainly happy to call it up. I just felt like an up-only reversible is way too much boilerplate when you're just doing a one-line data migration.

matthewd · 2017-11-07T23:16:31Z

I fully support doing data migrations in migrations -- if you're rearranging tables & columns that currently have data in them, it is The Right Thing for you to carry their existing data with them as needed, and to correspondingly back-fill new fields, such that they reflect the most reasonable approximation of the state the DB would end up in if the user performed those same operations against the post-migration version of the app. (Or, semi-equivalently, such that they behave in the way most consistent with their pre-migration form -- as in the published: true example.)

To my view, the fact that your entire migration will be reverted if something goes wrong is a feature not a problem: unless you have an inordinate amount of data [in which case you probably have a more complex migration story anyway], for the average app, it's ideal that the schema and data are never out of sync.

For me the danger is in any encouragement for people to use migrations to seed, which is very different from back-filling a column, but also sounds a lot like "populate". (The distinction is in whether they need to be performed on an empty database, as you'll get from db:schema:load.)

I do agree that the full reversible block is quite a mouthful for a seemingly simple concept: it's strictly true that it's a reversible operation and the down just happens to be a no-op, but that's not how any human would describe it.

I like @jeremy's suggestion of an up (or even up_only?)... though I do wonder 1) if we introduce both up and down, will people use those even when they have both?, and 2) would that be a bad thing?

pedantic-git · 2017-11-08T09:57:57Z

@matthewd Thanks for saying this! I was going to say the same thing - I want the whole thing to roll back if it fails, so putting the schema and data in the same migration is essential for me.

It seems like people would prefer this was called up rather than populate? I'm not sure there's a use case for a separate down method.

I don't think we can use the name up without some metaprogramming because it's already the name of the equivalent of #change that only runs on the way up. Am I wrong? Is there another similar name we can use?
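The naming clash can be illustrated in plain Ruby, which keeps a single method per name on an object. The sketch below uses toy classes (ToyBase and OldStyleMigration are hypothetical names, not Rails classes) and assumes a block helper called up had been added to the base class.

```ruby
# Toy classes, not ActiveRecord::Migration.
class ToyBase
  # Hypothetical block helper that a `change` method might call.
  def up(&block)
    block ? block.call : :no_block
  end
end

class OldStyleMigration < ToyBase
  # Migrations written with explicit up/down define `up` as the entry
  # point. This override hides the inherited helper on this object, and
  # any block passed to it is silently ignored.
  def up
    :ran_entry_point
  end
end
```

Calling up with a block on ToyBase runs the block, but the same call on OldStyleMigration runs the entry point instead, which is why a distinct name avoids the ambiguity.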

yskkin · 2017-11-10T02:42:29Z

activerecord/lib/active_record/migration.rb

+    #      def change
+    #        add_column :posts, :published, :boolean, default: false
+    #        populate do
+    #          Post.update_all(published: true)


I think using the model directly in a migration is fragile, since future changes to validations or callbacks may break it.
If a later migration does drop_table :posts and app/models/post.rb is deleted, Post won't even exist.

How about defining model in place?

class AddPublishedToPosts < ActiveRecord::Migration[5.3]
  class Post < ActiveRecord::Base; end

  def change
    ....

That's a good point. Maybe the example and test should use execute instead? That could be easily done with:

execute "update posts set published = 'true'"

pedantic-git · 2017-11-13T11:09:58Z

Thanks for all your comments so far! Following the feedback above, I've renamed the method to #up_only, and changed the example and test to use #execute.

rafaelfranca

Can you add a CHANGELOG entry?

pedantic-git · 2017-11-14T09:51:58Z

@rafaelfranca Sure! Done.

aruprakshit · 2019-11-15T20:14:30Z

@rafaelfranca Why don't we have up_only in the https://api.rubyonrails.org/ doc page? :)

bogdanvlviv · 2019-11-15T22:26:21Z

@aruprakshit We have https://api.rubyonrails.org/classes/ActiveRecord/Migration.html#method-i-up_only

aruprakshit · 2019-11-16T06:02:51Z

@bogdanvlviv I see it now, but yesterday my lookup was not showing it there. Thanks.

bf4 · 2020-01-20T04:10:25Z

activerecord/CHANGELOG.md

@@ -1,3 +1,8 @@
+*   Add `#only_up` to database migrations for code that is only relevant when


docs are wrong. function is named up_only not only_up cc @rafaelfranca

via https://guides.rubyonrails.org/5_2_release_notes.html

pedantic-git · 2020-01-21

The method was briefly called #only_up. This line has already been corrected in the latest https://github.com/rails/rails/blob/5-2-stable/activerecord/CHANGELOG.md

bf4 · 2020-01-21

@pedantic-git ah, thanks. I guess the release notes are out of date is all

pedantic-git · 2020-01-21

@bf4 The release notes you link to also call it up_only, as far as I can tell?

bf4 · 2020-01-21

¯\_(ツ)_/¯ 👿 thanks

Add a #populate method to migrations

bcca8cd

rails-bot assigned sgrif Nov 7, 2017

Address rubocop issues

38851bc

yskkin reviewed Nov 10, 2017


Rename to #up_only and use #execute in the examples intead of the model

bd4eaf8

rafaelfranca reviewed Nov 13, 2017


pedantic-git and others added 2 commits November 14, 2017 09:52

Update CHANGELOG

df7924b

Merge branch 'master' into populate_migrations

7b57d6c

rafaelfranca merged commit df82237 into rails:master Nov 14, 2017

bogdanvlviv mentioned this pull request Nov 14, 2017

Fix migration version in doc of #up_only #31154

Merged

aruprakshit unassigned sgrif Nov 15, 2019

bf4 reviewed Jan 20, 2020
