In a world where we have a Car model that historically could be owned by a User, we want to change this relationship to allow a Company to own a Car too.
The assumed shape of the database would be like this:
erDiagram
    Car {
        int id PK
        int user_id FK "users.id"
    }
    User {
        int id PK
    }
    Company {
        int id PK
    }
    Car }o--|| User : "Car belongs_to User"
and the Rails models would look nice and simple like this:
class Car < ApplicationRecord
  belongs_to :user
end

class User < ApplicationRecord
  has_many :cars
end

class Company < ApplicationRecord
end
Where we want to get to
We want the owner of the Car to be either a User or a Company. For this we have identified that a polymorphic relationship should be introduced.
Our application is widely used and is deployed with “rolling” deployments.
If our application were small scale, we could rename the user_id column on the Car to owner_id, add the owner_type string column and backfill it with User. We could then start changing all the usage to set the owner rather than the user. We could probably do this as a big bang PR.
In a scaled application with rolling deployments, it’s not quite as easy as this. We require the Application to be functional at all times, while backfilling and switching implementation at the correct times. This will involve several steps, split across PRs and deployments. This post aims to set out what should change and in what order they should be deployed.
The end goal is to get the database to look like this:
erDiagram
    Car {
        int id PK
        int owner_id "users.id or companies.id"
        string owner_type "User or Company"
    }
    User {
        int id PK
    }
    Company {
        int id PK
    }
    Car }o--|| User : "Car owned by a User"
    Car }o--|| Company : "Car owned by a Company"
and the Rails models to look like this:
class Car < ApplicationRecord
  belongs_to :owner, polymorphic: true
end

class User < ApplicationRecord
  has_many :cars, as: :owner
end

class Company < ApplicationRecord
  has_many :cars, as: :owner
end
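To make the target shape a little more concrete, here is a minimal plain-Ruby sketch of how a polymorphic reference is resolved: owner_type names the model and owner_id names the row within it. It deliberately avoids Active Record; the Struct definitions, the STORE hash and the resolve_owner helper are all hypothetical and only illustrate the lookup that Rails performs for us.

```ruby
# Plain-Ruby illustration only; in Rails, `belongs_to :owner, polymorphic: true`
# performs the equivalent lookup for you.
User    = Struct.new(:id, :name)
Company = Struct.new(:id, :name)
Car     = Struct.new(:id, :owner_type, :owner_id)

# Hypothetical in-memory "tables", keyed by model name and then by primary key.
STORE = {
  "User"    => { 1 => User.new(1, "Ada") },
  "Company" => { 1 => Company.new(1, "Acme") }
}.freeze

# Resolve a car's owner from its (owner_type, owner_id) pair.
def resolve_owner(car)
  STORE.fetch(car.owner_type).fetch(car.owner_id)
end

personal_car = Car.new(10, "User", 1)
company_car  = Car.new(11, "Company", 1)

puts resolve_owner(personal_car).name
puts resolve_owner(company_car).name
```

Both cars carry the same owner_id here, and only owner_type tells them apart. That is why the two columns always have to be written together in the steps that follow.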
Steps to take
Adding the reference columns
First of all we’re going to add our new reference (owner) columns to our cars table.
We’ll follow the best practices set out by the Strong Migrations gem. We have to make the reference null: true as we’ll be backfilling data in a later step.
Generate a migration from the command line
rails g migration AddOwnerToCars owner:references{polymorphic}
and modify the migration, adding disable_ddl_transaction!, null: true and index: {algorithm: :concurrently}.
class AddOwnerToCars < ActiveRecord::Migration[7.1]
  disable_ddl_transaction!

  def change
    add_reference :cars, :owner, polymorphic: true, null: true, index: {algorithm: :concurrently}
  end
end
With no other changes we can ==create a pull request==, ==merge and deploy== this change. These columns will be added to our database with no values and will not require any values to be added. The application should continue working as it was previously.
The database schema will look like this:
erDiagram
    Car {
        int id PK
        int user_id FK "users.id"
        int owner_id "users.id or companies.id"
        string owner_type "User or Company"
    }
Double write
To start with we will want to “double write” the Car’s user and owner. When we set the user value of the Car we also want it to set the owner.
This will mean that any Car that we create or update from now on will have an owner as well as a user.
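To achieve this we can add a before_save hook on the Car model. The hook has to run before the save: an assignment made in an after_save hook happens after the row has already been written, so the owner would not be persisted by that save.
It’s worth checking that there is no code which directly creates or updates cars in the database, otherwise this hook will not be fired. If you have SQL-based changes, they will need modifying to write to the owner_id and owner_type when writing to the user_id.
It’s also worth thoroughly checking that both the user and owner fields are being persisted to the database; your test suite should be able to help.
class Car < ApplicationRecord
  belongs_to :user
  belongs_to :owner, polymorphic: true, optional: true

  # Double write: mirror the user into the polymorphic owner columns
  # before the row is written, so both are persisted by the same save.
  before_save do
    self.owner = user
  end
end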
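For code paths that bypass model callbacks, one option is to build the column values in a single place so the three columns never drift apart. The sketch below is plain Ruby; double_write_attributes is a hypothetical helper, not part of Rails, and it assumes every owner is still a User at this stage.

```ruby
# Hypothetical helper: returns the column values a raw write should set so
# that user_id and the (owner_id, owner_type) pair stay in step.
# Assumes the owner is always a User at this point in the rollout.
def double_write_attributes(user_id)
  raise ArgumentError, "user_id is required" if user_id.nil?

  { user_id: user_id, owner_id: user_id, owner_type: "User" }
end

attrs = double_write_attributes(42)
p attrs
```

A hash like this could then be handed to a bulk write that skips callbacks, for example an update_all call on a Car relation.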
For this change we can ==create a pull request==, ==merge and deploy==.
Backfill untouched data
Now that newly created and updated records have the owner set on them, we can backfill any records that do not have data in either the owner_id or owner_type columns.
We recommend using a gem like Data Migrate to perform the backfill. This ensures the data is backfilled in all environments, which is safer than running a script in each environment.
Generate a new data migration:
rails g data_migration backfill_owner_on_cars
Ensure your backfill is as performant as possible. For this example we choose to combine an update_all with the in_batches method. This makes sure we are effectively updating records at the database level without instantiating Active Record models.
class BackfillOwnerOnCars < ActiveRecord::Migration[7.1]
  def up
    # Only touch rows that the double write has not populated yet.
    Car.where(owner_id: nil, owner_type: nil).in_batches do |batch_cars|
      batch_cars.update_all("owner_id = user_id, owner_type = 'User'")
    end
  end
end
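The effect of that data migration can be illustrated without a database. The following plain-Ruby sketch mimics it on in-memory rows: it selects the rows whose owner columns are both empty, walks them in batches and copies user_id across. The row hashes, BATCH_SIZE and backfill_owner! are hypothetical stand-ins for the table, in_batches and update_all.

```ruby
BATCH_SIZE = 2 # tiny on purpose; hypothetical stand-in for the real batch size

# Mimics: Car.where(owner_id: nil, owner_type: nil).in_batches { update_all(...) }
# Returns the number of rows that were backfilled.
def backfill_owner!(rows, batch_size: BATCH_SIZE)
  pending = rows.select { |row| row[:owner_id].nil? && row[:owner_type].nil? }

  pending.each_slice(batch_size) do |batch|
    batch.each do |row|
      row[:owner_id]   = row[:user_id]
      row[:owner_type] = "User"
    end
  end

  pending.size
end

cars = [
  { id: 1, user_id: 5, owner_id: nil, owner_type: nil },    # untouched legacy row
  { id: 2, user_id: 6, owner_id: 6,   owner_type: "User" }, # already double-written
  { id: 3, user_id: 7, owner_id: nil, owner_type: nil }     # untouched legacy row
]

updated = backfill_owner!(cars)
puts "backfilled #{updated} rows"
```

Running the sketch a second time finds nothing left to update, which is the property we want from the real backfill too: it should be safe to run again if a deployment is retried.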
If you’re not using a gem like Data Migrate you may want to ==create a pull request==, ==merge and deploy== at this point. If you are using a gem that preserves the migration order between data and schema changes you can skip this deployment.
Make new columns not null
Now that all of our older data has been filled and new data is being populated with an owner, we can update our database to ensure records cannot be inserted without the owner_id and owner_type.
We’ll follow the Strong Migrations guidance for setting NOT NULL on a column. This is a two-part migration: both parts can be committed and run in a single deployment, but they have to be separate migration files.
Generate the first migration to add a check_constraint to the columns
rails g migration SetCarsOwnerIdAndOwnerTypeNotNull
We’ll update the migration to add a check constraint to both columns
class SetCarsOwnerIdAndOwnerTypeNotNull < ActiveRecord::Migration[7.1]
  def change
    add_check_constraint :cars, "owner_id IS NOT NULL", name: "cars_owner_id_null", validate: false
    add_check_constraint :cars, "owner_type IS NOT NULL", name: "cars_owner_type_null", validate: false
  end
end
The second part of this step will validate the check constraint, update the columns to NOT NULL and then remove the check constraint.
rails g migration ValidateCarsOwnerIdAndOwnerTypeNotNull
The updated migration should look like
class ValidateCarsOwnerIdAndOwnerTypeNotNull < ActiveRecord::Migration[7.1]
  def change
    validate_check_constraint :cars, name: "cars_owner_id_null" # name from previous migration
    change_column_null :cars, :owner_id, false
    remove_check_constraint :cars, name: "cars_owner_id_null" # name from previous migration

    validate_check_constraint :cars, name: "cars_owner_type_null" # name from previous migration
    change_column_null :cars, :owner_type, false
    remove_check_constraint :cars, name: "cars_owner_type_null" # name from previous migration
  end
end
We’ve made several changes to the database and if we have any null values in the database at this point, the second of the two migrations will fail. This is a good point to ==create a pull request==, ==merge and deploy==.
Use the new relationship
Hopefully at this stage we’ve not hit any issues or snags, and our application is still up and running with the appropriate data being populated in our cars table.
We can now update our Car model, although our test suite will probably require the most changes.
We’ll change the relationships on the Car incrementally. First of all we’ll remove the optional: true portion of the owner relationship
class Car < ApplicationRecord
  belongs_to :user
  belongs_to :owner, polymorphic: true

  # before_save so the owner is written by the same save
  before_save do
    self.owner = user
  end
end
This change hopefully doesn’t cause any issues as the database changes would have highlighted any previously.
Our next step is to update all the places where the user is set and change that to use owner. We could do this by overriding the #user= method to look like this.
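class Car < ApplicationRecord
  belongs_to :user
  belongs_to :owner, polymorphic: true

  # before_save so the owner is written by the same save
  before_save do
    self.owner = user
  end

  # Setting the user also sets the owner_id, owner_type pair.
  def user=(value)
    super(value)
    self.owner = value
  end
end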
This will solve most of the cases where the code set the car.user = User.find(1) as it’ll set both the user_id and the owner_id, owner_type pair.
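The reason super is available inside that override is that Rails defines association accessors in a module that the model includes, rather than directly on the model class. The plain-Ruby sketch below shows the same mechanism in isolation; the module and class here are simplified stand-ins, not the real Rails internals.

```ruby
# Simplified stand-in for the module Rails generates for association accessors.
module GeneratedAssociationMethods
  attr_accessor :user, :owner
end

class Car
  include GeneratedAssociationMethods

  # Defined on the class itself, so it sits in front of the module's writer
  # in the method lookup chain and `super` reaches the original.
  def user=(value)
    super(value)
    self.owner = value
  end
end

car = Car.new
car.user = "user-1"
puts car.owner
```

If the original writer lived directly on the class instead of in an included module, redefining user= would replace it and there would be nothing for super to call.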
Unfortunately a lot of other uses will no doubt exist, mainly situations where we create Cars using user.cars.create. We can update the User record in this instance to use this new relationship
class User < ApplicationRecord
  has_many :cars, as: :owner
end
At this point we have probably covered most cases. Your test suite will come to the rescue here and every failure will take you to another area of the code. Test factories or fixtures will need to be updated to create the correct looking records.
There is an easy, albeit nuclear, way to test what will break, and that is to add user_id to the ignored columns on the Car model.
This is useful for finding areas that may break, but it wouldn’t be advisable to commit that change yet - by all means fix the issues that it raises incrementally.
This process may take several PRs over a period of time, depending on your priorities.
Remove the old relationship
We should now be at the stage where there is no code reliance on the user relationship on the Car model.
We can now remove the relationships and any additional code that we’ve added to keep everything aligned.
class Car < ApplicationRecord
  belongs_to :owner, polymorphic: true

  # Keep the legacy column populated until it is made nullable.
  before_save do
    self.user_id = owner_id
  end
end
We have also switched the callback to set the user_id based off of the owner_id - this will keep the user_id column being filled until the next step. Note that this is only correct while every owner is still a User. This is a good point to ==create a pull request==, ==merge and deploy== to ensure nothing has been missed.
Removing the user_id column
This is another multistep process where we’ll have to update the user_id column to be nullable, ignore the column, drop the column and then finally remove the ignored column. Each step will require us to ==create a pull request==, ==merge and deploy==.
Allowing null on user_id
This is a fairly straightforward migration
rails g migration AllowNullOnCarsUserId
class AllowNullOnCarsUserId < ActiveRecord::Migration[7.1]
  def change
    # true here means the column may now hold NULL
    change_column_null :cars, :user_id, true
  end
end
We will want to ==create a pull request==, ==merge and deploy== at this point.
Ignoring the user_id column
Now that the database can accept null values we can remove the callback from the Car model.
class Car < ApplicationRecord
  belongs_to :owner, polymorphic: true
end
While we’re in this file we can also add user_id to the ignored columns
class Car < ApplicationRecord
  self.ignored_columns += [:user_id]

  belongs_to :owner, polymorphic: true
end
After this change we will ==create a pull request==, ==merge and deploy==. This allows the Rails app to start ignoring the column before it is actually removed.
Drop the user_id column
It has been a long journey and now we’re at the point where we can drop the older user_id column since none of our application uses it anymore.
rails g migration RemoveUserIdFromCars
with the migration looking like.
class RemoveUserIdFromCars < ActiveRecord::Migration[7.1]
  def change
    # Safety assured by adding the `user_id` to the
    # `ignored_columns` on the Car model
    safety_assured { remove_column :cars, :user_id }
  end
end
It’s good practice to add a note when using the safety_assured helper from Strong Migrations to help others in your team understand how safety has been assured around dangerous operations.
We need to ==create a pull request==, ==merge and deploy== before moving on to the last step.
Remove ignored columns
At last we have reached the final step in this process and it’s a simple clean up step.
We’ll remove the self.ignored_columns += [:user_id] from the Car model.
The final Car model will look like this
class Car < ApplicationRecord
  belongs_to :owner, polymorphic: true
end
Our User model will not change from our current version but should look like this
class User < ApplicationRecord
  has_many :cars, as: :owner
end
and finally we can update our Company model to use the polymorphic relationship like so
class Company < ApplicationRecord
  has_many :cars, as: :owner
end
Ship it
And with one last ==pull request==, ==merge and deploy== we’ve shipped our new feature.
From what seems like a small change in the code, we can see that this is a very involved process involving many moving parts.
To reiterate, this process probably isn’t necessary when dealing with a small application, but at scale, these steps are crucial to keep your application functional and performant.