# Alembic practical guide: safely evolving a SQLAlchemy schema with zero downtime

> Faithful to the Alembic official documentation, this concretely explains, from a production-operation viewpoint, wiring env.py's target_metadata, accurately discerning the changes autogenerate can and can't detect, stabilizing diffs with naming conventions, upgrade/downgrade operation, batch operations, and zero-downtime schema changes via expand/contract.

- Published: 2026-06-24
- Author: 友田 陽大
- Tags: Python, Alembic, SQLAlchemy, PostgreSQL, データベース, マイグレーション, 本番運用
- URL: https://tomodahinata.com/en/blog/alembic-zero-downtime-migrations-sqlalchemy
- Category: Python backend
- Pillar guide: https://tomodahinata.com/en/blog/fastapi-production-async-pydantic-observability-guide

## Key points

- env.py's target_metadata = Base.metadata becomes the entire starting point of autogenerate.
- autogenerate is a draft; grasp the changes it can't detect (renames, anonymous constraints, sequences, etc.) and always review.
- Make constraint names deterministic and stabilize diffs with MetaData(naming_convention=...) (put it in on day one of the project).
- Verify the round trip of upgrade→downgrade→upgrade in CI and reject migrations that can't be rolled back.
- Assemble a multi-stage deploy that doesn't contradict the running app with the expand/contract pattern, achieving zero downtime.

---

## **Introduction: why "manual DDL" definitely breaks down in production**

Changing the schema by hand with `ALTER TABLE` from `psql` — it works in early development. But the moment you enter production operation, this method can no longer answer the following questions.

- "Do the staging and production schemas really match? No one can guarantee it."
- "I want to roll back the `ALTER` I ran 3 weeks ago. How do I revert it?"
- "To make a newly-joined member's local DB the same state as production, what do I run in what order?"

A schema should be **version-controlled, reproducible, and able to go both forward and backward, just like code.** The de facto standard for realizing this in Python is **Alembic,** developed by the author of SQLAlchemy themselves.

But Alembic has a pitfall beginners definitely fall into — **over-reliance on autogenerate.** `alembic revision --autogenerate` is powerful, but even though the official documentation clearly enumerates "**changes it can't detect,**" the operation of merging the generated script without verification produces accidents. A table rename being generated as a "data-losing drop & create" — this is a specification of autogenerate, not a bug.

This article isn't a repeat of an intro. **While being faithful to the official documentation ([alembic.sqlalchemy.org](https://alembic.sqlalchemy.org/en/latest/)), it's one level clearer than it,** aiming to break through the following walls you definitely face in practice.

- "I don't understand what to connect `env.py`'s `target_metadata` to for autogenerate to work."
- "I want to accurately know what autogenerate detects and what it **doesn't detect.**"
- "Constraint names are scattered per DB, and I can't write `DROP CONSTRAINT`."
- "I attached NOT NULL when adding a column and the app crashed mid-deploy. I want to evolve the schema with zero downtime."

The design of the SQLAlchemy model (`Base.metadata`) handled here is detailed in the sister article [SQLAlchemy 2.0 practical guide](/blog/sqlalchemy-2-typed-orm-production-guide). This article focuses on the phase of **safely reflecting that model into the database.**

The author designed and operated the backend of a METI-Minister's-Award-winning B2B SaaS with **Python / SQLAlchemy 2.0 / PostgreSQL 16,** and ran the schema evolution of the production DB with Alembic. This article organizes the knowledge gained from that implementation, with the backing of the official documentation.

---

## **1. Setup: `alembic init` and wiring `env.py`'s `target_metadata`**

Everything in Alembic starts from hitting one command at the project root.

```bash
alembic init alembic
```

Per the official documentation, this generates "an environment using the generic template." The composition created is as follows.

| Path | Role |
| --- | --- |
| `alembic.ini` | the config file. `script_location` (the environment's path) is the only required setting |
| `alembic/env.py` | the Python script run on each migration execution. **This is the heart of the wiring** |
| `alembic/script.py.mako` | the template that generates new migration files |
| `alembic/versions/` | the directory where generated migration scripts accumulate |
| `alembic/README` | the description file |

### **H3: Connect `target_metadata` to `Base.metadata`**

The one and most important wiring to make autogenerate function is `target_metadata` in `env.py`. **By passing your SQLAlchemy model's `MetaData` here, Alembic knows the "schema that should be."**

```python
# alembic/env.py（上部）
from myapp.models import Base  # 自分たちの DeclarativeBase

target_metadata = Base.metadata
```

And in `env.py`'s `run_migrations_online()`, pass this `target_metadata` to `context.configure()`. The official `env.py` roughly takes the following form.

```python
def run_migrations_online():
    engine = engine_from_config(
        config.get_section(config.config_ini_section),
        prefix="sqlalchemy.",
    )

    with engine.connect() as connection:
        context.configure(
            connection=connection,
            target_metadata=target_metadata,
        )

        with context.begin_transaction():
            context.run_migrations()
```

**Why is this superior?**
autogenerate **diff-compares** `target_metadata` (the "shape that should be" the code defines) and the actual schema of the connected DB (the "current state" read by reflection), and generates a migration that fills the difference. That is, if you forget to pass `Base.metadata`, autogenerate detects nothing; conversely, if you connect it correctly, you get a "draft" of the diff script just by changing the model and hitting one command.

> ⚠️ **Handling connection info**: writing the production connection string in plaintext in `alembic.ini`'s `sqlalchemy.url` is forbidden. Read it from `os.environ` in `env.py` and inject it with `config.set_main_option("sqlalchemy.url", ...)`, or separate the connection info itself from the environment with the offline mode described later. Don't leave credentials in the code or the repo — this is the iron rule.

---

## **2. "Trust but verify" autogenerate — a quick reference of can/can't detect**

This is the most misunderstood theme in this article. autogenerate is convenient, but it's **not omnipotent.** The official documentation explicitly classifies the detection capability into 3 stages. The generation command is this.

```bash
alembic revision --autogenerate -m "add user table"
```

The official adds an important warning. **"The generated migrations are candidates and require human review and adjustment before execution (candidates requiring manual review)."** Operation that takes this lightly becomes the source of production accidents.

### **H3: A "detection capability" quick reference faithful to the official**

The table below organizes the official's "What does Autogenerate Detect (and what does it not detect?)" by practical judgment axes.

| Classification | Target | Practical meaning |
| --- | --- | --- |
| ✅ **Always detects** | Adding/removing a table | new entities are generated as-is |
| ✅ **Always detects** | Adding/removing a column | you can trust column increases/decreases |
| ✅ **Always detects** | A change in a column's `nullable` state | NULL-allowed ↔ NOT NULL is detected |
| ✅ **Always detects** | Basic changes to an index or an **explicitly-named** unique constraint | the reason naming conventions work (§3) |
| ✅ **Always detects** | Basic changes to foreign-key constraints | increases/decreases in reference relationships |
| ⚠️ **Optionally detects** | A column's **type change** | detected only after enabling `compare_type=True` |
| ⚠️ **Optionally detects** | A change to a **server default** | detected only with `compare_server_default=True` |
| ❌ **Doesn't detect** | A table-name change (rename) | generated as **adding + removing a different table** |
| ❌ **Doesn't detect** | A column-name change (rename) | likewise becomes a **column add + remove** pair |
| ❌ **Doesn't detect** | An anonymous (unnamed) constraint | without a name, the diff can't be tracked (§3) |
| ❌ **Doesn't detect** | Special types like ENUM (on a DB that doesn't directly support it) | hand-writing needed |
| ❌ **Doesn't detect** | Some standalone constraints (PRIMARY KEY, EXCLUDE, CHECK) | supplement by hand |
| ❌ **Doesn't detect** | Adding/removing a sequence | not implemented (hand-write) |

### **H3: The most dangerous pitfall — a rename becomes a "drop & create"**

Let me quote the official words accurately. About table-name and column-name changes —

> "These are output as the addition/removal of two different tables. You should rewrite them by hand into a rename."

That is, if you just rename the `name` column to `full_name` in the model and trust autogenerate, what's generated is a **destructive** script like this.

```python
def upgrade() -> None:
    # autogenerate が生成する「危険な」差分（そのまま流すとデータ消失）
    op.add_column("users", sa.Column("full_name", sa.String(), nullable=True))
    op.drop_column("users", "name")  # ← name のデータは全消失する
```

Correctly, use `op.alter_column`'s `new_column_name` and rewrite it by hand into **a rename that preserves data.**

```python
def upgrade() -> None:
    op.alter_column("users", "name", new_column_name="full_name")


def downgrade() -> None:
    op.alter_column("users", "full_name", new_column_name="name")
```

**That's exactly why the review of generated scripts should be built into operation as a "discipline."** In the author's team, autogenerate output was always reviewed by a human as a PR diff, and diffs containing `drop_column` / `drop_table` had a special caution flag raised. autogenerate is "a pair programmer who writes a draft," not "an autopilot you trust without verification."

> 💡 **If you want type changes detected, enable it explicitly**: passing `compare_type=True` and `compare_server_default=True` to `context.configure(...)` in `env.py` includes changes in column types and server defaults in the diff too. But since type comparison also produces false positives from DB-dialect differences, use it on the premise of **always visually checking the generated result** after enabling.

---

## **3. Naming conventions: make constraint names deterministic and stabilize autogenerate diffs**

There was a reason I repeatedly said in §2's table "an **explicitly-named** unique constraint" and "doesn't detect an **anonymous** constraint." It's because Alembic can track a constraint's diff only **when the constraint has a stable name.**

But if you write `UniqueConstraint(...)` or `ForeignKey(...)` without specifying a name, the constraint name is left to the DB backend. In the example the official cites, Oracle generates **a code that can change every time** like `SYS_C0029334`. With this, autogenerate can't recognize it as "the same constraint," the diff becomes unstable, and you can't write `DROP CONSTRAINT` either.

### **H3: Centralize names with `MetaData(naming_convention=...)`**

The solution is to set a **naming convention** on `MetaData` and have all constraint names deterministically auto-generated. The convention dictionary the official recommends is this.

```python
from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase

NAMING_CONVENTION = {
    "ix": "ix_%(column_0_label)s",
    "uq": "uq_%(table_name)s_%(column_0_name)s",
    "ck": "ck_%(table_name)s_%(constraint_name)s",
    "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
    "pk": "pk_%(table_name)s",
}


class Base(DeclarativeBase):
    metadata = MetaData(naming_convention=NAMING_CONVENTION)
```

The meaning of each token is as follows.

| Token | Kind of constraint | Generation example |
| --- | --- | --- |
| `ix` | index | `ix_users_email` |
| `uq` | unique constraint | `uq_users_email` |
| `ck` | check constraint | `ck_users_status` |
| `fk` | foreign key | `fk_orders_user_id_users` |
| `pk` | primary key | `pk_users` |

**Why is this decisively important?**
Setting this convention on `Base.metadata` automatically gives a consistent name even to constraints whose name you didn't explicitly write. As a result,

1. **autogenerate diffs stabilize**: since constraint names are the same every time, Alembic can correctly judge "no change," and noise diffs disappear.
2. **You can write `DROP CONSTRAINT` / `ALTER CONSTRAINT`**: since names are predictable, you can reliably operate on constraints in later migrations.
3. **DB portability rises**: the same name on PostgreSQL and other DBs.

This is a typical practice of "ETC (Easy To Change)" and "DRY." The naming rule is consolidated in one place and applied consistently to all constraints. **This setting is the foundation you should definitely put in first if you put autogenerate into production operation.** Introducing it later causes a flood of rename diffs of existing constraints, so put it in on day one of the project.

---

## **4. Migration operation: `upgrade` / `downgrade`, CI application, batch operations**

A generated migration script is placed in `alembic/versions/` with the following skeleton.

```python
"""add user table

Revision ID: 1975ea83b712
Revises: 9a8b7c6d5e4f
"""
revision = "1975ea83b712"        # このスクリプト自身の ID
down_revision = "9a8b7c6d5e4f"   # 1 つ前のスクリプトの ID（リビジョンの鎖）


def upgrade() -> None:
    op.create_table(
        "users",
        sa.Column("id", sa.Integer(), primary_key=True),
        sa.Column("email", sa.String(255), nullable=False),
    )
    op.create_index("ix_users_email", "users", ["email"], unique=True)


def downgrade() -> None:
    op.drop_index("ix_users_email", "users")
    op.drop_table("users")
```

`revision` and `down_revision` make **a directed chain,** and Alembic traces this chain to decide "in what order and how far to apply." The principle is to write `downgrade()` as the **reverse operation** of `upgrade()`, in the reverse order of application.

### **H3: The main commands used in daily operation**

| Command | Behavior |
| --- | --- |
| `alembic upgrade head` | apply all unapplied migrations up to the latest (head) |
| `alembic upgrade +2` | apply only the next 2 |
| `alembic downgrade -1` | revert by 1 |
| `alembic downgrade base` | roll back all to the initial state |
| `alembic current` | display the revision the DB currently has applied |
| `alembic history` | display the migration history (details with `--verbose`) |
| `alembic heads` | the current head list (the branched state of §6 if there are multiple) |

A revision ID can be partially specified if unambiguous (`ae1` points to `ae1027a6acf`).

### **H3: Verify "the correctness of migrations" in CI**

Before shipping a schema change to production, mechanically verifying the following in CI becomes the crux of resilience (Reliability).

```bash
# 空の使い捨て DB に対して、head まで適用 → 1 つ戻す → 再度適用 が通るか
alembic upgrade head
alembic downgrade -1
alembic upgrade head
```

**Why does this work?**
A migration that forgot to write `downgrade()`, or whose reverse operation is wrong, definitely fails on this round trip. A migration that "can go forward but can't go back" means losing the rollback means at a production failure. Forcing this round trip in CI lets you **detect a non-rollbackable migration before it's merged.** Furthermore, re-running autogenerate with `--autogenerate` and confirming "the diff is empty (= model and DB match)" can also detect write omissions.

### **H3: batch operations — circumventing SQLite's `ALTER` constraints**

When using SQLite in local development or testing, you face a serious constraint. Per the official, SQLite "barely supports the `ALTER` statements that schema migration depends on." An operation like `ALTER TABLE ... DROP COLUMN` doesn't work as-is.

Alembic's solution is the **batch operation.** Internally it automatically performs the "move and copy" workflow — reflect the existing table, create a new table with the changes applied, copy the data with `INSERT ... SELECT`, drop the old table, and rename the new table to the original name.

```python
def upgrade() -> None:
    with op.batch_alter_table("users") as batch_op:
        batch_op.add_column(sa.Column("phone", sa.String(20)))
        batch_op.drop_column("legacy_field")
```

Furthermore, if you want autogenerate to output in this batch syntax, pass `render_as_batch=True` to `context.configure(...)` in `env.py`.

```python
context.configure(
    connection=connection,
    target_metadata=target_metadata,
    render_as_batch=True,
)
```

> 💡 **On PostgreSQL, batch is effectively a no-op**: per the official, "the batch context performs move and copy by default only when SQLite is being used." That is, even if you enable `render_as_batch=True`, on production PostgreSQL a normal `ALTER` is issued. Even with a "test on SQLite, production on PostgreSQL" configuration, the same migration code works on both — this is the value of putting in batch.

---

## **5. Zero-downtime schema changes: the expand/contract pattern (most important)**

From here is the core of production operation. The schema change itself, Alembic flows safely. The problem occurs when you change the schema **while the application keeps running.**

If you naively run "add a NOT NULL new column to the model and `alembic upgrade head`," what happens? In the transition period of a deploy, **new code premised on the new schema** and **old code premised on the old schema** temporarily coexist. If you immediately impose NOT NULL on the new column, the old code's `INSERT` — which doesn't know that column — is wiped out with a constraint violation, and the service crashes.

The standard to avoid this is the **expand / contract pattern (Parallel Change).** Divide the schema change into multiple phases of "**expand**" and "**contract**," and always keep, in each phase, a state where **both old and new code can run.**

### **H3: Safely do "adding and renaming a column" in 4 deploys**

As an example, consider migrating `users.name` to the stricter `users.full_name` (ultimately NOT NULL). As mentioned, if you do this directly with `alter_column(new_column_name=...)`, the old code breaks in the transition by referencing a column that doesn't exist. This is the reason **you must not directly rename.**

| Phase | Migration | App code | Safety |
| --- | --- | --- | --- |
| **① Expand** | **add `full_name` as nullable** | the old code reads/writes `name` (unchanged) | the new column being empty bothers no one |
| **② Dual-write** | (none) | deploy new code: write to **both** `name` and `full_name` | even if old code remains, the two are consistent |
| **③ Backfill** | fill the existing rows' `full_name` with `op.execute` | once all rows are filled, switch reading to `full_name` | past data is also aligned |
| **④ Contract** | **make `full_name` NOT NULL** → drop the old column `name` | only new code runs | the old code is gone |

Each phase becomes an independent deploy. The migration code is this.

```python
# フェーズ①：拡張（nullable で追加するのでロックは一瞬。誰も壊れない）
def upgrade() -> None:
    op.add_column("users", sa.Column("full_name", sa.String(255), nullable=True))


# フェーズ③：バックフィル（スキーマ変更ではなくデータ移行）
def upgrade() -> None:
    op.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")


# フェーズ④：収縮（全行が埋まり、旧コードが消えてから初めて NOT NULL 化）
def upgrade() -> None:
    op.alter_column("users", "full_name", nullable=False)
    op.drop_column("users", "name")
```

**Why is this multi-stage deploy necessary?**
The key is keeping a state where "**at any moment, the running code doesn't contradict the schema.**" In the expand phase you add the column as nullable so the old code is unharmed; with dual-write the data of both lines is aligned so the read switch is safe; with backfill the past portion is filled too so making it NOT NULL holds. Each step proceeds to the next only after satisfying the premise "the previous step is complete" — this is the essence of zero downtime.

> ⚠️ **Beware of long locks on large tables**: on PostgreSQL, `ALTER TABLE ... SET NOT NULL` and a non-concurrent `CREATE INDEX` take a strong lock on the entire table and block `INSERT`/`UPDATE` during that. Doing this on a table of tens of millions of rows times out the app on lock wait. The countermeasures are PostgreSQL-specific techniques like (1) create the index concurrently with `op.create_index(..., postgresql_concurrently=True)` (note that this migration needs to run outside a transaction), and (2) for making it NOT NULL, first apply a `CHECK (col IS NOT NULL) NOT VALID` constraint, `VALIDATE` it, then `SET NOT NULL`. **Always confirm the target DB's behavior of the lock characteristics of schema changes in advance.**

### **H3: Pass DBA review with offline mode (`--sql`)**

There are not a few organizations that must not `upgrade` the production DB directly from the app. Operation requiring a DBA to review the SQL and apply it manually in a maintenance window is demanded.

Alembic's **offline mode** can, without connecting to the DB, just **spit out the SQL that would be applied to standard output.**

```bash
# 接続せず、適用予定の SQL を生成して DBA に渡す
alembic upgrade head --sql > migration.sql

# 特定リビジョン間の差分だけを SQL 化することも可能
alembic upgrade 9a8b7c:1975ea --sql > step.sql
```

**Why is this superior?**
The generated `migration.sql` becomes an artifact a human (DBA) can review before execution. Whether a `DROP` is included, whether there's a strong-lock `ALTER`, what the impact on the execution plan is — these can be verified on paper **before execution.** Without giving the application production-DB DDL privileges, you can carve out only the schema change into a safe path while maintaining the Principle of Least Privilege.

---

## **6. Branching and merging: resolving multiple heads**

In team development, two developers may create migrations **separately from the same parent revision.** For example, if A creates `ae1027` with `1975ea` as the parent and B creates `27c6a3` with the same `1975ea` as the parent, after merging **two heads** are born.

In this state, hitting `alembic upgrade head` errors because the destination is ambiguous. First confirm the heads.

```bash
alembic heads          # 現在の head 一覧
alembic branches --verbose  # 分岐点を表示
```

To resolve it, use `alembic merge` to create a merge revision that bundles both heads into one.

```bash
alembic merge -m "merge ae1027 and 27c6a3" ae1027 27c6a3
# あるいは全 head を一括で
alembic merge -m "merge heads" heads
```

The generated merge revision has `down_revision` as **a tuple pointing to both parents.**

```python
down_revision = ("ae1027a6acf", "27c6a30d7c24")
```

With this, the chain again converges to one head, and `alembic upgrade head` passes. **The exact same idea as a Git branch merge** — a branch isn't evil; you just explicitly merge and unify the history into one. If you understand it this way, it's not scary.

---

## **Conclusion: evolve the schema "just like code"**

Alembic is the standard tool of the SQLAlchemy ecosystem that makes schema changes reproducible, reversible, and reviewable. Let me re-list the key points of this article.

1. **`env.py`'s `target_metadata = Base.metadata`** is the starting point of everything in autogenerate.
2. **autogenerate is a "draft."** Accurately grasp the changes it can't detect (renames, anonymous constraints, sequences, etc.) and always review the generated script.
3. Make constraint names deterministic with **`MetaData(naming_convention=...)`** and stabilize diffs (put it in on day one).
4. Verify the round trip of **`upgrade → downgrade → upgrade` in CI** and reject non-rollbackable migrations.
5. With **batch operations (`render_as_batch=True`),** reconcile SQLite testing and PostgreSQL production with one code.
6. With the **expand/contract pattern,** design a multi-stage deploy that doesn't contradict the running app, achieving zero downtime.
7. Pass DBA review with **offline mode (`--sql`)** and maintain least privilege.

The difference between "a working schema change" and "a schema that can safely evolve without downtime" lies in the accumulation of each of these design judgments. The design of the model Alembic migrates is explained in [SQLAlchemy 2.0 practical guide](/blog/sqlalchemy-2-typed-orm-production-guide), and the operation of the API layer that carries it in [FastAPI production-operation guide](/blog/fastapi-production-async-pydantic-observability-guide).

For further exploration, I recommend re-reading the following of the official documentation from this article's operation viewpoint.

- [Alembic Tutorial](https://alembic.sqlalchemy.org/en/latest/tutorial.html) (basic commands and script structure)
- [Auto Generating Migrations](https://alembic.sqlalchemy.org/en/latest/autogenerate.html) (the accurate range of detection capability)
- [The Importance of Naming Constraints](https://alembic.sqlalchemy.org/en/latest/naming.html) (naming conventions)
- [Running "Batch" Migrations for SQLite and Other Databases](https://alembic.sqlalchemy.org/en/latest/batch.html) (batch operations)
- [Working with Branches](https://alembic.sqlalchemy.org/en/latest/branches.html) (branching, merging)
- [Operation Reference](https://alembic.sqlalchemy.org/en/latest/ops.html) (the full reference of `op.*`)

---

### **Consultation on production-DB schema migration / zero-downtime-ization**

The author implemented and operated the schema evolution with Alembic explained here in the production environment of a METI-Minister's-Award-winning B2B SaaS (SQLAlchemy 2.0 / PostgreSQL 16). Also, on a serverless payment platform, I achieved **0 double charges in production** with zero-downtime migration and idempotency design. I build, fast and at high quality leveraging generative AI, **the database foundation directly tied to the business's reliability** — non-downtime schema changes via expand/contract, developing naming conventions, round-trip verification in CI, and designing a DBA-review path. Please feel free to consult me about production-DB schema migration, zero-downtime-ization, and improving existing migration operation.