Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flav/int to bigint migration #9065

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

Conversation

flvndvd
Copy link
Contributor

@flvndvd flvndvd commented Dec 2, 2024

Integer to BigInt Migration Tool

See migration details here.

This PR introduces a PostgreSQL migration tool designed to safely convert integer primary/foreign keys to BigInt while maintaining zero-downtime capability and full rollback support.

Key Features

  • Zero-downtime migration: Uses concurrent index creation and maintains dual columns during transition
  • Phased approach: Splits migration into distinct steps (setup, backfill, pre-cutover, cutover, rollback)
  • Data integrity: Maintains referential integrity throughout the process using triggers
  • Rollback support: Full rollback capability at any stage
  • Safety measures:
    • Dry-run mode for validation
    • Progress tracking
    • Transaction safety
    • Concurrent index creation
    • NOT VALID + VALIDATE pattern for constraints

Key Implementation Details

Constraint Handling

  1. Standardized Constraint Behavior

    • All foreign key constraints are enforced to use ON UPDATE RESTRICT ON DELETE RESTRICT
    • This applies to both new constraints and rollback scenarios
    • Even when rolling back, we don't restore the original CASCADE behaviors
    • This is an intentional design decision to maintain consistent referential integrity rules
  2. Primary Key Dependencies

    • PostgreSQL requires dropping foreign key constraints before modifying primary key constraints
    • This is because foreign keys create dependencies that prevent primary key modifications
    • The original foreign keys will get restored when using the rollback step

Rollback Considerations

  1. Constraint Behavior Changes

    • Original: Mixed ON UPDATE/DELETE behaviors (CASCADE/RESTRICT)
    • After rollback: Standardized to RESTRICT for both
    • This is the only permanent change after rollback
    • Assuming we properly delete each resources individually and don't rely on the CASCADE effect.
  2. Data Integrity During Rollback

    • Maintains both integer and BigInt columns
    • Preserves data synchronization via triggers
    • Ensures zero data loss during rollback

Migration Steps

1. Setup

  • Adds new BigInt columns (_new suffix)
  • Creates sync triggers to maintain data consistency (id -> id_new)
  • Preserves existing constraints

2. Backfill

  • Copies data from integer to BigInt columns
  • Batched processing to minimize impact
  • Progress tracking with percentage completion

3. Pre-cutover

  • Creates necessary indexes concurrently
  • Sets up constraints for new columns
  • Validates data consistency
  • Don't require any transaction

4. Cutover

  • Switches primary key to BigInt column
  • Updates foreign key references
  • Maintains rollback capability via legacy columns

5. Rollback (if needed)

  • Restores integer primary keys
  • Rebuilds original foreign key relationships
  • Maintains data sync for safety

Usage Example

// Setup phase
npx tsx migration.ts --database=postgresql://... --table=users --step=setup

// Backfill phase
npx tsx migration.ts --database=postgresql://... --table=users --step=backfill --batchSize=5000

// Pre-cutover phase
npx tsx migration.ts --database=postgresql://... --table=users --step=pre-cutover

// Cutover phase
npx tsx migration.ts --database=postgresql://... --table=users --step=cutover

// Rollback (if needed)
npx tsx migration.ts --database=postgresql://... --table=users --step=rollback


## Risk

<!-- Discuss potential risks and how they will be mitigated. Consider the impact and whether the changes are safe to rollback. -->

## Deploy Plan

<!-- Outline the deployment steps. Specify the order of operations and any considerations that should be made before, during, and after deployment/ -->

// prevents silent precision loss when handling large integers from the database.
// Throws an assertion error if a BIGINT value exceeds JavaScript's safe integer
// limits.
types.setTypeParser(builtins.INT8, function (val) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will extract to a dedicated PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shipped here.

Copy link
Contributor

@spolu spolu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Very nice \o/

I presume final clean-up is still missing.

Let's start with connectors 👍

return progress;
}

private async checkTableProgress(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this fly on a very big table?

}
}

private async dropTriggers(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Should be moved below the main functions

@flvndvd flvndvd marked this pull request as ready for review December 3, 2024 08:21
Copy link
Contributor

@JulesBelveze JulesBelveze left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very solid 🔥

Comment on lines +91 to +92
trigger: `${tableName}_pk_sync_${direction}` as const,
function: `${tableName}_pk_sync_${direction}_trigger` as const,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be?

Suggested change
trigger: `${tableName}_pk_sync_${direction}` as const,
function: `${tableName}_pk_sync_${direction}_trigger` as const,
trigger: `${tableName}_pk_sync_${direction}_trigger` as const,
function: `${tableName}_pk_sync_${direction}` as const,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's to surface that the function is only used by a trigger.

currentColumn: MAIN_ID_COLUMN,
newColumn: createColumnName.new(MAIN_ID_COLUMN),
legacyColumn: createColumnName.legacy(MAIN_ID_COLUMN),
sequenceName: `${this.config.tableName}_id_seq`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sequenceName: `${this.config.tableName}_id_seq`,
sequenceName: createSequenceName(this.config.tableName),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

currentColumn: MAIN_ID_COLUMN,
newColumn: createColumnName.new(MAIN_ID_COLUMN),
legacyColumn: createColumnName.legacy(MAIN_ID_COLUMN),
sequenceName: `${this.config.tableName}_id_seq`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
sequenceName: `${this.config.tableName}_id_seq`,
sequenceName: createSequenceName(this.config.tableName),

flvndvd and others added 16 commits December 3, 2024 12:18
…ions for update and delete

 - Added fields to track delete and update actions on foreign key constraints in migration script
 - The migration script now includes additional SQL JOIN to fetch action information from the pg_constraint table
…igration

 - Change foreign key `ON UPDATE` and `ON DELETE` actions to dynamic based on reference settings
 - Set actions to `SET NULL` when reference actions are marked with "r", otherwise keep as `RESTRICT`
…aints

 - Fixed an issue where the update and delete actions for foreign key constraints were incorrectly set to RESTRICT when they should be SET NULL
 - Updated the conditional logic to match 'n' values to SET NULL action instead of 'r' in the migration script
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants