Skip to content

How to migrate the database

Stephen Chudleigh edited this page Jul 29, 2024 · 18 revisions

How to Migrate the Database

Why does it matter?

Database migration, or changing the schema, always carries a certain amount of risk in a production system. This document is intended to highlight the specific risks of changing a database used by two different applications.

Before you begin

Determine if the schema changes have impacts on both applications. If the changes only impact data for a single application, you do not need to coordinate code changes and deployments on both applications. You should still follow the practices for zero-downtime deployments below, but deployment and release management will be easier.

Architectural Evolution

At the time of writing, the Challenge portal is built on the Elixir/Phoenix stack. The end state of the application will be to retire the Phoenix app and run everything on Rails. Until that is achieved, there will be an interim period where two applications work in tandem to deliver the full set of features.

Two Applications

  1. Challenge Portal (Manage Challenges, Receive Submissions)
  2. Challenge Platform (Manage Evaluations, Evaluate Submissions)

Zero downtime deployment

challenge.gov aims to achieve zero downtime for regular operations such as data migration. This imposes certain restrictions on what is possible and how to accomplish the desired schema changes.

Both Applications

Specifically, any changes that are destructive of existing data are not allowed until all applications have been updated to remove their dependence on that data.

List of Destructive Operations

  1. Dropping a table
  2. Removing a column
  3. Renaming a table
  4. Renaming a column
  5. Changing the data type of a column

List of Non-destructive Operations

  1. Creating a new table
  2. Adding a new column

Destructive Migrations Operations

Before you destroy (data)

For all destructive operations, consider the following:

  1. API stability - consider whether older clients, whether browser- or mobile-based, might not be updated at the same time as the deployed backend server, and handle any related errors appropriately.
  2. Backup processes - some scripts or backup systems might depend on specific tables or columns being present
  3. External systems - are there any external systems that might export or import the data?
  4. Historical data - If the table could be useful for understanding the system in the future, maybe it should be retained as long as it does not add significant storage costs.
    • Provide documentation about the decision to retain historical data and the context of its usage when it was retired.

Dropping a table

In order to drop a table, first ensure that the table is not being used or referenced anywhere in the application(s). If the data is being used, first deploy changes that remove those dependencies. If the data is required but no longer needs to be updated or used in any SQL queries, it might be converted into a static file format such as JSON or stored in some other document storage solution. In the case of APIs that must be maintained for backwards compatibility on older clients, this static data can be used in lieu of the table reference.

Once that is complete, you can drop the table. Create a new migration on the Rails app and run it on the next deployment. No updates are needed on the Phoenix side.

Removing a column

In order to remove a column, first ensure that the column is not being used or referenced anywhere in the application(s). Do not rely on tests; perform a full text search of the codebases for the column name, schema module, etc. If the data is being used, first deploy changes that remove those dependencies. A default value representing empty data can be added to legacy APIs to maintain backwards compatibility.

Once that is complete, you can remove the column. Create a new migration on the Rails app and run it on the next deployment. No updates are needed on the Phoenix side.

Renaming a table

Although the SQL to accomplish this is simple, the impact on the codebase is not. Logically the effect is dropping and adding tables. First decide if the data must be kept in sync in real-time or if it can be backfilled later. If it's necessary to keep in sync:

  1. Create a copy of the existing table and perform dual-writes in both applications to keep them in sync
    1. Create a new table migration in the Rails app and run it on the next deployment
      1. Copy the previous table data to the new one (or create the table as a copy of the original)
    2. Deploy updates to both apps that write to both tables for any data changes
    3. Backfill any data that changed between the time you copied the original table into the new table and the time that all changes were dual-written to both tables in both applications.
  2. Update both applications to use the new table as the primary table for models and schemas. Deploy both applications to production.
  3. Deploy changes to stop dual-writing in both applications. The older table will be frozen for writes at this point (no further updates).
  4. Create a migration in the Rails app that drops the unused table.

Renaming a column

The strategy is similar to renaming a table. Create a copy of the column with the new name. write to both columns; update the applications to only use the new column; backfill missing data; stop dual-writing; lastly, drop the older column. Alias the old column to the new one to maintain backwards compatibility for API clients and external systems support if needed.

Change the data type of a column

Changing data types requires additional planning depending on the old and new types and whether the conversion results in potential data loss, e.g. changing float to int or reducing the size of the field. Consider the tradeoffs and plan accordingly. Consult with the technical team lead to create a plan before attempting this type of migration. UI inputs should be updated to reflect the new types. Users may need to be notified if there is potential data loss from the migration.

Non-destructive Migration Operations

New data can be added more easily than changes that destroy or modify existing data. For new data, create the migration in Rails. In most cases new data will be tied to new features (Rails app only) and therefore no changes required for Phoenix. Migrations are run as part of the normal deployment process before code is updated, so that you can be assured that the schema will exist when the app code is updated. To avoid any risk of bad migration causing production issues, you can separate the deployment of migration changes from the deployment of code changes that reference that migration.

Phoenix-only Changes (production updates)

During the course of Rails development, it might be necessary to update something in the Phoenix app that requires a schema change in production before the Rails app is deployed. In that case you can create the migration on the phoenix app with mix ecto.gen.migration and deploy it through to production. After it is in production, update the db/structure.sql in Rails to reflect the new schema so that it is in sync with Phoenix production environment. The development branch of Phoenix should be kept up to date with production in the normal development flow so schema changes will be applied downstream to the feature branch.