Schema Migration
Also known as: database migration, migrations
The disciplined, versioned process of changing a database's structure — adding columns, tables, or indexes — over time, in step with the application, without losing data or downtime.
- Primary domain
- Information Systems
- Sub-category
- Database Management & Information Storage
In simple terms
A schema migration is a controlled change to the structure of a database — adding a table, adding or renaming a column, creating an index — applied in a repeatable, versioned way. As an application evolves, its database has to evolve with it, but the database is already full of real data you can’t lose. Migrations are how teams change a live schema safely: each change is a small, ordered, reviewable script that can be applied to every environment (a developer’s laptop, staging, production) and reproduced exactly.
The Visual Map
graph LR
subgraph Versions["Migration versions (git-tracked)"]
V1["0001_create_users.sql"]
V2["0002_add_email.sql"]
V3["0003_add_index.sql"]
V4["0004_rename_column.sql"]
end
subgraph Tool["Migration tool"]
Track["schema_migrations table\n(applied versions)"]
Check["Check: which not yet applied?"]
Apply["Apply missing in order"]
end
subgraph DB["Database state"]
S0["(empty)"]
S1["users table"]
S2["+ email column"]
S3["+ index"]
S4["column renamed"]
end
V1 --> Apply --> S1
V2 --> Apply --> S2
V3 --> Apply --> S3
V4 --> Apply --> S4
Track --> Check --> Apply
note["Expand/contract pattern for rename:\n1. Add new column → 2. Write to both → 3. Backfill\n→ 4. Switch reads → 5. Drop old column"]
More detail
The core idea is to treat the schema like code: every change is a numbered, version-controlled migration that moves the database from one known state to the next. A migration tool keeps a record (a table like schema_migrations or flyway_schema_history) of which migrations have already run, so it knows exactly which to apply next.
Key migration practices:
- Forward-only, ordered scripts — migrations run in sequence; the tool tracks the current version. Each migration runs exactly once. Rolling back is usually rolling forward with a corrective migration, not
down. - Additive, backward-compatible changes — to deploy without downtime, make changes the old code can still tolerate:
- ✅ Add a nullable column (old code ignores it)
- ✅ Add a new table (old code doesn’t use it)
- ✅ Add an index (
CONCURRENTLYin PostgreSQL avoids locking) - ⚠️ Add a NOT NULL column without a default (blocks inserts from old code)
- ⚠️ Rename a column (old code uses the old name — it will break)
- ❌ Drop a column while old code still references it
- Backfills — populating a new column for existing rows, typically as a separate batched job that updates rows in small chunks to avoid long table locks.
- Expand/contract pattern for zero-downtime renames: add the new column (expand) → write to both old and new → backfill → switch reads to new → drop old (contract). Each step is a separate deploy.
Common migration tools:
| Tool | Ecosystem | Style |
|---|---|---|
| Django migrations | Python/Django | Auto-generated from model diff |
| Alembic | Python/SQLAlchemy | Auto + manual SQL |
| Rails Active Record | Ruby/Rails | Auto-generated |
| Flyway | Java / any | SQL files numbered by version |
| Liquibase | Java / any | XML/YAML/SQL changeset format |
| Prisma Migrate | TypeScript | Schema DSL → generated SQL |
| golang-migrate | Go | SQL files |
| Sqitch | Any | SQL + deploy/revert/verify scripts |
Dangerous ALTER TABLE operations:
In PostgreSQL, ALTER TABLE ADD COLUMN with a DEFAULT requires rewriting every row (pre-PG 11); PG 11+ stores the default in metadata and avoids the rewrite. ALTER TABLE ADD COLUMN NOT NULL without a default is still dangerous — it requires scanning the table to verify no nulls. RENAME COLUMN is metadata-only but breaks running application code that uses the old name.
Under the Hood
A minimal Python migration runner with schema_migrations table tracking:
#!/usr/bin/env python3
"""Mini migration runner: versioned SQL files applied once, in order."""
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute('''CREATE TABLE IF NOT EXISTS schema_migrations (
version TEXT PRIMARY KEY,
applied_at TEXT DEFAULT (datetime('now'))
)''')
conn.commit()
# Simulated migration history (SQL statements per version)
MIGRATIONS = [
("0001_create_users", '''
CREATE TABLE users (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
created_at TEXT DEFAULT (datetime('now'))
)
'''),
("0002_add_email", '''
ALTER TABLE users ADD COLUMN email TEXT
'''),
("0003_add_index", '''
CREATE INDEX IF NOT EXISTS idx_email ON users(email)
'''),
("0004_add_active_flag", '''
ALTER TABLE users ADD COLUMN active INTEGER DEFAULT 1
'''),
]
def get_applied():
return {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
def migrate():
applied = get_applied()
pending = [(v, sql) for v, sql in MIGRATIONS if v not in applied]
if not pending:
print("All migrations already applied.")
return
for version, sql in pending:
try:
conn.execute(sql)
conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
conn.commit()
print(f" Applied: {version}")
except Exception as e:
conn.rollback()
print(f" FAILED: {version} — {e}")
break
print("First run (from scratch):")
migrate()
print("\nSecond run (idempotent — nothing to do):")
migrate()
# Show current schema state
print("\nCurrent tables:")
for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"):
print(f" {row[0]}")
print("\nApplied migrations:")
for row in conn.execute("SELECT version, applied_at FROM schema_migrations ORDER BY version"):
print(f" {row[0]} ({row[1]})")
# Show the resulting schema
print("\nusers table columns:")
for row in conn.execute("PRAGMA table_info(users)"):
print(f" col={row[1]} type={row[2]} notnull={row[3]} default={row[4]}")
conn.close()
Engineering Trade-offs
Migration atomicity vs. long-running statements
Each migration should be a transaction so a failure is safe to retry. However, some DDL operations can’t run in a transaction (MySQL DDL is auto-committed; some PostgreSQL operations like CREATE INDEX CONCURRENTLY can’t run in a transaction). For these, the migration must be idempotent — designed to be run safely more than once (CREATE TABLE IF NOT EXISTS, CREATE INDEX IF NOT EXISTS).
Zero-downtime vs. migration complexity A single-step column rename is simple but requires downtime (old code breaks immediately). A zero-downtime rename takes 4–5 steps spread across multiple deploys. The question is always: is the downtime window acceptable, or does the business require continuous availability? For most SaaS products, the zero-downtime path is required — which multiplies migration complexity.
ALTER TABLE locking
ALTER TABLE ADD COLUMN in PostgreSQL 11+ (without a NOT NULL constraint and without a volatile default) is fast (metadata-only). But ALTER TABLE ADD COLUMN NOT NULL DEFAULT now() (volatile default) rewrites every row, taking an ACCESS EXCLUSIVE lock for the duration — blocking all reads and writes. The safe pattern: add the column as nullable (fast), backfill in batches, then add the NOT NULL constraint with NOT VALID and validate separately. ADD CONSTRAINT ... NOT VALID + VALIDATE CONSTRAINT splits the lock.
Backfill safety vs. throughput Backfilling a new column across 100M rows: doing it in one UPDATE locks the table. Batched backfills (1000 rows per transaction with a brief sleep between batches) are safe but slow — may take hours. For very large tables, backfills run in a background job during off-peak hours with monitoring; the migration that creates the column lands first, the backfill runs separately, and the NOT NULL constraint is added after all rows have values.
Rollback vs. roll-forward
Most migration tools support a down rollback. In practice, rolling back a production database is risky: the down migration may not be tested, may destroy data, and may not be possible for irreversible changes (dropping a column, truncating). The safer approach is roll-forward: write a new migration that corrects the bad one. This is also why additive changes (never drop until the next deploy) are preferred in production migrations.
Real-world examples
- Stripe’s continuous migration process — Stripe publishes articles on their database migration discipline: each migration lands in a separate deploy from the code that uses the new schema, using expand/contract for all breaking changes. They use custom tooling that enforces backward-compatibility analysis before a migration is approved.
- GitHub’s
gh-ostfor MySQL —gh-ost(GitHub Online Schema Toolkit) performs online schema changes on large MySQL tables without locking: it creates a shadow table, applies changes to the shadow, uses binlog replication to keep it in sync, then atomically swaps the tables. Used by GitHub on tables with 1B+ rows. - Django’s
./manage.py migrate— Django auto-generates migrations from model diffs withmakemigrations. The generated migration is a Python file withoperations = [AddField(...), CreateIndex(...)]. Django’s migration framework handles dependency ordering (if Migration B adds a ForeignKey to a table created by Migration A, B depends on A). - Flyway at financial institutions — many financial institutions use Flyway because it is explicit SQL (no magic ORM-generated DDL) and audit-friendly: every migration is a numbered SQL file that a DBA can review. The
flyway_schema_historytable records the checksum of each applied migration — if a migration file is modified after running, Flyway detects the mismatch and refuses to proceed. - Zero-downtime at Shopify — Shopify runs the Rails
strong_migrationsgem, which raises a development error for any migration that would lock a table in production — e.g., adding a column with a non-null default in MySQL < 8. The gem forces developers to write the safe multi-step version.
Common misconceptions
- “A migration is just running an
ALTER TABLE.” The SQL is the easy part; the discipline — ordering, versioning, zero-downtime sequencing, backfills, and atomicity — is what makes it a migration rather than a risky one-off edit. - “Migrations are reversible.” Some are (adding a column is reversible by dropping it), but many are not (dropping a column destroys data; renaming requires multi-step expand/contract). Production practice leans on rolling forward with a corrective migration rather than assuming you can undo.
- “You can
ALTER TABLEa large production table directly.” On a 1-billion-row table,ALTER TABLE ADD COLUMN NOT NULL DEFAULT xmay lock the table for 10+ minutes. Tools likegh-ost(MySQL),pg_repack(PostgreSQL), or theCONCURRENTLYkeyword for index creation exist specifically to handle large-table DDL without locking.
Try it yourself
Run a sequence of versioned migrations with a tracking table:
python3 - << 'EOF'
import sqlite3
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE migrations_log (version TEXT PRIMARY KEY, applied_at TEXT DEFAULT (datetime('now')))")
conn.commit()
migrations = [
("v001", "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"),
("v002", "ALTER TABLE products ADD COLUMN category TEXT"),
("v003", "CREATE INDEX IF NOT EXISTS idx_cat ON products(category)"),
("v004", "ALTER TABLE products ADD COLUMN active INTEGER DEFAULT 1"),
]
applied = {r[0] for r in conn.execute("SELECT version FROM migrations_log")}
for version, sql in migrations:
if version in applied:
print(f"skip {version} (already applied)")
continue
try:
conn.execute(sql)
conn.execute("INSERT INTO migrations_log(version) VALUES (?)", (version,))
conn.commit()
print(f"apply {version}")
except Exception as e:
conn.rollback()
print(f"ERROR {version}: {e}")
break
# Insert test data to verify schema is correct
conn.execute("INSERT INTO products VALUES (1,'Widget',9.99,'tools',1)")
conn.execute("INSERT INTO products VALUES (2,'Gadget',19.99,'electronics',1)")
conn.commit()
print("\nproducts after all migrations:")
for r in conn.execute("SELECT * FROM products"):
print(f" {dict(zip(['id','name','price','category','active'], r))}")
print("\nMigration log:")
for r in conn.execute("SELECT version, applied_at FROM migrations_log"):
print(f" {r[0]} {r[1]}")
conn.close()
EOF
Learn next
- Normalization — schema design principles that migrations implement over time; understanding 3NF explains why migrations tend to add tables (introducing new entities) rather than adding columns to existing ones.
- Transaction ACID — each migration ideally runs inside a transaction; atomicity guarantees that a failed migration leaves the schema in the previous valid state rather than a half-applied intermediate.
- ORM — ORM migration generators (Django
makemigrations, Alembicautogenerate) automate migration creation from model changes; the ORM migration and the schema migration are tightly coupled.
Read this in a learning path
All paths →This topic is part of a learning path. Start in context to keep prev/next and progress tracking.
Relationships
- Requires
- Related
Neighborhood
A visual companion to the relationships above. Click any node to visit that topic.