# Flask Testing Practical Guide: Writing Production-Quality Automated Tests with pytest fixtures, test_client, and test_cli_runner

> A complete guide to writing Flask 3.1-series tests at production quality. Explained with real code faithful to the official documentation: why the application factory and test_client make testing easy, the effect of TESTING=True, the app/client/runner pytest-fixture trio, request verification with response.data/json/text, follow_redirects, and session_transaction, test_request_context and app_context, and CLI testing with test_cli_runner.

- Published: 2026-06-23
- Author: 友田 陽大
- Tags: Python, Flask, pytest, Testing, Backend, Architecture design
- URL: https://tomodahinata.com/en/blog/flask-testing-pytest-test-client-fixtures-guide
- Category: Flask in production
- Pillar guide: https://tomodahinata.com/en/blog/flask-production-guide

## Key points

- Flask is easy to test thanks to the application factory and test_client(). You can round-trip requests without starting a real server, and build a fresh app with test-only settings for every test
- The canonical fixtures are the app/client/runner trio. pytest injects a fixture when its name matches a test argument, and code after yield handles post-test cleanup (disposing of the DB, etc.)
- Verify the body with response.data (bytes) / response.json / response.text, and follow redirects with follow_redirects and response.history. Verify a redirect target with the status code and the Location header
- To verify session/g, read them after the request inside `with client:`; to set up a logged-in precondition, inject the session beforehand with client.session_transaction()
- Use test_request_context to unit-test a function that reads request, and app_context for DB-layer tests. Verify CLI commands with test_cli_runner + monkeypatch

---

## **Introduction: Flask testing is "the reward of design"**

A backend that's hard to test almost always has a **design flaw**: a global `app` at module top level whose settings can't be swapped, a hardcoded DB connection that can't be pointed at a test DB, a view function with business logic fused into it so the logic can't be exercised on its own. Code that is hard to test is, for the same reasons, hard to change.

Conversely, **whether tests can be written straightforwardly is the litmus test for whether a Flask app is production quality.** And Flask, once it adopts the application factory (`create_app`) explained in the [pillar article](/blog/flask-production-guide), is a remarkably easy framework to test. The `test_client()` that round-trips requests without starting a real server, the factory that builds an app with test-only settings every time, and pytest fixtures: when these three mesh, the design of "fixing the boundary's contract with tests" is achieved with minimal effort.

This article is a spoke that expands §9 of the pillar into a dedicated deep dive. The author has **designed and implemented, in Python / Flask / SQLAlchemy / PostgreSQL, the backend of a B2B SaaS that won the Minister of Economy, Trade and Industry Award, and operated it in production on API Gateway → ALB → ECS (Fargate).** What's shown here are the test patterns that kept preventing regressions in that production system.

> 💡 **The version covered in this article**: this article assumes the **Flask 3.1 series** (minimum Python 3.9) and **pytest**. All code follows the patterns of the official documentation's testing guide ([flask.palletsprojects.com](https://flask.palletsprojects.com/en/stable/testing/)) and tutorial. E2E tests (browser-driven flows that include the frontend) are out of scope; that layer is covered in the companion [Playwright E2E test design guide](/blog/playwright-e2e-testing-production-design-guide). This article concentrates on **server-side Flask tests.**

---

## **1. Why Flask testing is easy: `test_client` and `TESTING=True`**

### 1.1 `test_client()`: round-trip requests without standing up a server

A Flask app is a WSGI application built on Werkzeug. Thanks to this, Flask provides **a test client that can send requests to the app without launching an actual HTTP server.**

```python
client = app.test_client()
response = client.get("/health")
```

`test_client()` calls the app directly at the WSGI level, without going through the network or a dev server. That makes it fast, free of port conflicts, and stable in CI. With `use_cookies=True` (the default), the client retains cookies between requests, so you can write multi-request flows that carry over login state.

### 1.2 What `TESTING=True` changes

Before writing tests, always **set `app.testing = True` (or the config key `TESTING=True`).** In the words of the official documentation, `TESTING` tells Flask that the app is in test mode, and Flask changes some internal behavior so it's easier to test. The most important effect is the following.

- **Exceptions propagate instead of being swallowed.** Normally, an unhandled exception raised in a view is caught by Flask and converted to a 500 response. In test mode, it **propagates to the test as an exception**, so you can trace "why a 500 is returned" **with the stack trace.** Without this, you can't see the real cause of a failed assertion.

```python
def test_config():
    # Without TESTING passed to the factory, testing is False
    assert not create_app().testing
    # Pass it and testing becomes True; pin down first that the setting takes effect
    assert create_app({"TESTING": True}).testing
```

> ⚠️ **The trap of forgetting to set `TESTING`**: without `TESTING=True`, a real bug in a view (an attribute error, a type error) turns into a 500 response, and the test only tells you "the status isn't 200." That is a classic way to burn time on debugging. Make **always setting `TESTING=True` where the fixture creates the app** the first discipline of test design.

---

## **2. The fixture trio: `app` / `client` / `runner`**

### 2.1 The canonical form: 3 fixtures injected by name

pytest **looks for a fixture whose name matches a test function's argument name and injects it automatically.** In Flask testing, the `app` / `client` / `runner` trio built on this mechanism is the canonical form. Place it in `tests/conftest.py` and every test file under that directory can use the fixtures just by naming them as arguments.

```python
# tests/conftest.py
import pytest

from my_project import create_app


@pytest.fixture()
def app():
    app = create_app()
    app.config.update({"TESTING": True})

    # Setup (creating the DB, etc.) can go here
    yield app
    # Code after yield is teardown; put cleanup here


@pytest.fixture()
def client(app):
    return app.test_client()


@pytest.fixture()
def runner(app):
    return app.test_cli_runner()
```

Each fixture's role is summarized below.

| fixture | What it produces | What it tests |
|---|---|---|
| `app` | The app built by `create_app()` with `TESTING=True` | The app itself, settings, the DB layer needing `app_context` |
| `client` | `app.test_client()` | The round-trip of HTTP request/response (views, routing) |
| `runner` | `app.test_cli_runner()` | CLI commands defined with `@app.cli.command` |

Note that `client` and `runner` take `app` as an argument. pytest automatically does the **dependency resolution** of "for a test that uses `client`, first resolve the `app` fixture, then pass it." A fresh `app` is made per test, and state doesn't leak between tests.

> 💡 **Why this can't be written without the factory**: being able to build the app with test settings inside the fixture (as in `create_app({"TESTING": True})`) is the biggest dividend of the application factory. In a design with a global `app` at module top level, the app is fixed at import time, and there's no opening to inject test settings. "Testability" is not a future requirement but a **current requirement**, and the factory is the separation that meets it. For details, see the [large-structure guide](/blog/flask-application-factory-blueprints-large-app-structure-guide).

### 2.2 The temporary-DB version: confine setup and teardown to the fixture

Tests of a real app involve a DB. The official tutorial's conftest shows a pattern of **creating a temporary SQLite file per test and reliably deleting it after the test.** The pytest fixture structure, with setup before `yield` and teardown after it, cleanly contains the resource lifecycle.

```python
import os
import tempfile

import pytest

from flaskr import create_app
from flaskr.db import get_db, init_db

# Load the SQL for the initial test data ahead of time
with open(os.path.join(os.path.dirname(__file__), "data.sql"), "rb") as f:
    _data_sql = f.read().decode("utf8")


@pytest.fixture
def app():
    # Create a temporary file and pass its path as the DATABASE setting
    db_fd, db_path = tempfile.mkstemp()

    app = create_app({"TESTING": True, "DATABASE": db_path})

    with app.app_context():
        init_db()                                  # create the schema
        get_db().executescript(_data_sql)          # load the initial data

    yield app

    # Teardown: close and delete the temporary DB so no state carries over between tests
    os.close(db_fd)
    os.unlink(db_path)
```

The `with app.app_context():` in this fixture is important. Because `init_db()` and `get_db()` reference `current_app`, they fail with `RuntimeError: Working outside of application context` unless they run inside an application context (for details, see the [in-depth guide to contexts](/blog/flask-application-request-context-g-current-app-guide)). Pushing the context in the fixture lets DB initialization run safely.
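
The failure mode is easy to reproduce in isolation. In this sketch, `database_path()` is a hypothetical stand-in for any helper that reads `current_app`, and the `DATABASE` value is an arbitrary placeholder string.

```python
from flask import Flask, current_app

app = Flask(__name__)
app.config["DATABASE"] = "/tmp/example.sqlite"  # placeholder value, never opened


def database_path():
    # Hypothetical helper: like get_db(), it depends on current_app.
    return current_app.config["DATABASE"]


# Outside any context, current_app cannot be resolved.
try:
    database_path()
    outside_error = None
except RuntimeError as exc:
    outside_error = exc

# Inside an application context, the same call works.
with app.app_context():
    inside_value = database_path()
```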

> 💡 **The trade-off between fixture scope and isolation**: the fixture above uses the default `function` scope. Because it **re-creates the DB for every test function**, it gives the cleanest isolation, but it gets slow as the test count grows. To speed things up, you can create the DB less often with `@pytest.fixture(scope="session")` and isolate each test by rolling back a transaction. But if you sacrifice independence between tests for speed, you get the worst kind of flakiness: order-dependent tests. First ensure correctness with `function` scope, and optimize only after slowness becomes a measured problem. The rule is to never optimize on speculation.

---

## **3. Request tests: distinguish `response.data` / `json` / `text`**

### 3.1 GET and body verification

`client.get(path)` returns a response object. For the body, choose among three properties according to the expected format.

```python
def test_hello(client):
    response = client.get("/hello")
    # response.data is bytes; compare it with a bytes literal (b"...")
    assert response.data == b"Hello, World!"
```

| Accessor | Type | Use |
|---|---|---|
| `response.data` | `bytes` | The raw response body (byte string) |
| `response.json` | `dict` / `list` | The Python object parsed from a JSON response |
| `response.text` | `str` | The body decoded as text (synonymous with `get_data(as_text=True)`) |
| `response.status_code` | `int` | The status code |
| `response.headers` | dict-like | The response headers (`Location`, etc.) |

In REST API tests, `response.json` is the workhorse. A response returned with `jsonify` arrives as a parsed dict, so you can compare the body's structure directly against a dictionary.

```python
def test_health_returns_json(client):
    response = client.get("/health")
    assert response.status_code == 200
    assert response.json == {"status": "ok"}      # compare the parsed dict directly
```

### 3.2 Verifying POST, headers, and `Location`

For POST, pass `data=` for a form submission and `json=` for a JSON body. For an endpoint returning a redirect, verify **the status and the `Location` header.**

```python
from flaskr.db import get_db


def test_register(client, app):
    # Confirm that GET renders the form page
    assert client.get("/auth/register").status_code == 200

    # Register with POST → on success it redirects to the login page
    response = client.post(
        "/auth/register", data={"username": "a", "password": "a"}
    )
    # Verify the redirect target through the Location header
    assert response.headers["Location"] == "/auth/login"

    # Check the side effect (a row was created) directly inside app_context
    with app.app_context():
        assert (
            get_db()
            .execute("SELECT * FROM user WHERE username = 'a'")
            .fetchone()
            is not None
        )
```

The point of this test is that it **verifies both the HTTP response (`Location`) and the side effect (the DB's state).** A correct redirect is worthless if nothing was written to the DB, and vice versa. Round-tripping a request with `client` while inspecting the DB directly inside `app.app_context()` is the standard pattern for testing endpoints that create records.

### 3.3 `follow_redirects`: follow redirects

When you want the verification to include the redirect destination, pass `follow_redirects=True` and the client **follows the redirects to the final page.** The intermediate responses are collected in `response.history`, and the final URL is available as `response.request.path`.

```python
def test_logout_redirects_to_index(client):
    response = client.get("/logout", follow_redirects=True)
    # Confirm that exactly one redirect happened
    assert len(response.history) == 1
    # Confirm that the request finally landed on the index page
    assert response.request.path == "/"
```

> 💡 **`Location` verification or `follow_redirects`**: if you want to fix only "where it sends" as the contract, not using `follow_redirects` and directly asserting `response.headers["Location"]` is lightweight and clear in intent (the 3.2 example). On the other hand, if you want to see up to "whether the destination page renders correctly," track with `follow_redirects=True`. The two **verify different contracts**, so choose to match the test's intent.

---

## **4. Session testing: the tool differs between reading and injecting**

In an app involving login, verifying `session` is unavoidable. Flask provides separate tools for the case of "**reading the session after the request**" and "**injecting the session before the request**." Confuse these and you get stuck with a `RuntimeError`.

### 4.1 Read: peek at `session` / `g` after the request with `with client:`

Normally, `session` and `g` can't be touched outside a request (because there's no context). Send a request inside a `with client:` block, and **that request's context is retained until the block ends**, and you can read the `session` after the request.

```python
from flask import session


def test_access_session(client):
    with client:
        client.post("/auth/login", data={"username": "flask"})
        # Right after login, the value the server wrote to the session can be verified
        assert session["user_id"] == 1
    # Once the with block exits, session is no longer accessible
```

The value of this technique is that a **server-internal side effect**, namely whether the login process correctly set `session["user_id"]`, can be confirmed directly without going through the response body. A value stored in `g` can be inspected the same way.

### 4.2 Inject: write the session before the request with `client.session_transaction()`

Conversely, when you want to test an endpoint premised on "an already-logged-in user" (e.g. `/users/me`), **inject the session before the request.** Write the session inside the `client.session_transaction()` block, and **it's saved as a signed cookie at block end** and carried over to subsequent requests.

```python
def test_users_me_with_logged_in_user(client):
    # Instead of going through the login flow every time, plant the session directly
    with client.session_transaction() as session:
        session["user_id"] = 1

    response = client.get("/users/me")
    assert response.status_code == 200
    assert response.json["id"] == 1
```

This is extremely useful as a **login shortcut.** Round-tripping a login POST every time to test 20 endpoints requiring authentication is slow and verbose. Build the session directly with `session_transaction()` and you can prepare the premise of "authenticated" in 3 lines.

### 4.3 Reuse: make an authenticated client a fixture

Carve this pattern out into a fixture and the whole authentication test becomes dramatically more readable.

```python
# tests/conftest.py (addition)
@pytest.fixture()
def auth_client(client):
    """Return a client already logged in as user_id=1."""
    with client.session_transaction() as session:
        session["user_id"] = 1
    return client
```

```python
# Endpoints that require authentication just take auth_client as an argument
def test_dashboard_requires_auth(client):
    # An anonymous request is rejected (redirect or 401)
    assert client.get("/dashboard").status_code in (302, 401)


def test_dashboard_shows_for_authed_user(auth_client):
    # Verify the business logic with login as a precondition
    assert auth_client.get("/dashboard").status_code == 200
```

The access-control contract of "reject the unauthenticated, pass the authenticated" can be written declaratively just by switching fixtures. This is the crux of production-quality tests that prevent authorization regressions.

---

## **5. `test_request_context`: unit-test a function that reads `request`**

A **helper function** called from a view often references `request` directly (form values, query parameters, JSON). To unit-test such a function without going through full HTTP dispatch, use `app.test_request_context()`.

`test_request_context()` takes the arguments of Werkzeug's `EnvironBuilder` (`path` / `method` / `data` / `json` / `query_string` / `headers` …) and **pushes a request context without dispatching a request**, so `request` is usable inside the block.

```python
def test_validate_user_edit(app):
    # Simulate a POST to /user/2/edit with an empty name
    with app.test_request_context(
        "/user/2/edit", method="POST", data={"name": ""}
    ):
        # validate_edit_user() reads request.form internally
        messages = validate_edit_user()

    assert messages["name"][0] == "Name cannot be empty."
```

You can test only the validation logic, **narrowly and fast**, rather than hitting the whole view with `client.post`. This works well as a unit test of boundary validation, and it makes the cause of a failure easy to isolate. The idea of designing the boundary with a schema matches the [boundary-design guide for marshmallow × Flask × SQLAlchemy](/blog/marshmallow-flask-sqlalchemy-rest-api-production-guide): the same philosophy of fixing the boundary's contract with tests.

> ⚠️ **`before_request` isn't called**: because `test_request_context()` **doesn't run the dispatch code**, the preprocessing registered with `@app.before_request` (the auth check, loading a value into `g`, etc.) **doesn't run.** If the test target function depends on those premises, explicitly call `app.preprocess_request()` inside the context.

```python
def test_handler_that_depends_on_before_request(app):
    with app.test_request_context("/orders", json={"qty": 3}):
        app.preprocess_request()   # run the before_request handlers manually
        result = handle_create_order()
    assert result["qty"] == 3
```

In a test via `client`, `before_request` naturally runs, so this caution is **specific to unit tests using `test_request_context`.** Choose `client` if you want to reproduce the full request lifecycle, and `test_request_context` if you want to target only the logic.

---

## **6. `app_context`: test the DB layer not tied to a request**

There are times you want to test a pure **data-access layer** (`get_db()` or ORM query functions) that doesn't use `request` at all. These reference `current_app`, so they need an **application context**, but **a request context is unneeded.** The tool for that is `app.app_context()`.

```python
import sqlite3

import pytest


def test_get_db_returns_same_connection(app):
    # There is no request, but current_app is needed, so push an app context
    with app.app_context():
        db = get_db()
        # Within one context the same connection is reused (cached on g)
        assert db is get_db()


def test_close_db_after_context(app):
    with app.app_context():
        db = get_db()

    # Leaving the context closes the connection via teardown_appcontext.
    # Assumes the SQLite-backed get_db() from §2.2: a closed sqlite3
    # connection raises ProgrammingError.
    with pytest.raises(sqlite3.ProgrammingError):
        db.execute("SELECT 1")
```

The distinction between `test_request_context` (both request and app contexts) and `app_context` (the app context only) is summarized below.

| Tool | Pushed context | Can use `request`? | Main use |
|---|---|---|---|
| `client.get(...)` | Request + app (full dispatch) | Yes | Views, routing, E2E-ish round-trips |
| `app.test_request_context()` | Request + app (no dispatch) | Yes | Unit-testing a helper function that reads `request` |
| `app.app_context()` | App only | No | DB layer / CLI-ish processing needing only `current_app` |

Choose the tool by "what context the test target needs" — this is the core of the design judgment of Flask testing.
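
For readers without the tutorial project at hand, here is a self-contained sketch of the same app-context behavior. The `get_db()` and `close_db()` below are simplified assumptions modeled on the tutorial's pattern, using an in-memory SQLite connection.

```python
import sqlite3

from flask import Flask, g

app = Flask(__name__)
app.config["TESTING"] = True


def get_db():
    # Cache one connection per application context on g.
    if "db" not in g:
        g.db = sqlite3.connect(":memory:")
    return g.db


@app.teardown_appcontext
def close_db(exc=None):
    # Runs when the application context is popped.
    db = g.pop("db", None)
    if db is not None:
        db.close()


with app.app_context():
    db = get_db()
    reused = db is get_db()   # same connection within one context

# After the context exits, the connection has been closed.
try:
    db.execute("SELECT 1")
    closed_error = None
except sqlite3.ProgrammingError as exc:
    closed_error = exc
```

No request context is pushed anywhere in this example, which is the point: the data-access layer only needs the application context.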

---

## **7. Testing CLI commands: `test_cli_runner` + `monkeypatch`**

A Flask app usually accumulates **admin commands** defined with `@app.cli.command` (DB init, data migration, batch jobs). These can be tested too, with `app.test_cli_runner()` (Click's `CliRunner` extended for Flask). Leaving the production DB-init script or a data-migration command untested is a breeding ground for operational accidents.

### 7.1 Verify the output

```python
import click


@app.cli.command("hello")
@click.option("--name", default="World")
def hello_command(name):
    click.echo(f"Hello, {name}!")
```

```python
def test_hello_command(runner):
    # Called with no arguments → the default "World"
    result = runner.invoke(args="hello")
    assert "World" in result.output

    # Pass the option → its value appears in the output
    result = runner.invoke(args=["hello", "--name", "Flask"])
    assert "Flask" in result.output
```

The `output` attribute of the result returned by `runner.invoke()` holds the command's output as a string, so you can assert directly on what `click.echo` printed.
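
Besides the output, the result object also exposes `exit_code`, which is worth asserting for commands that can fail. The sketch below uses a hypothetical `import-users` command (an assumption for illustration) that rejects a non-positive count with `click.ClickException`.

```python
import click
from flask import Flask

app = Flask(__name__)


@app.cli.command("import-users")
@click.argument("count", type=int)
def import_users(count):
    # Hypothetical admin command: fails loudly on invalid input.
    if count <= 0:
        raise click.ClickException("count must be positive")
    click.echo(f"Imported {count} users")


runner = app.test_cli_runner()

succeeded = runner.invoke(args=["import-users", "3"])
failed = runner.invoke(args=["import-users", "0"])
```

Asserting a non-zero exit code matters operationally, because schedulers and CI pipelines decide success or failure from it rather than from the printed message.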

### 7.2 `monkeypatch`: swap out heavy-side-effect processing

For **a command with heavy side effects** like `init-db`, you sometimes want to verify only "whether the correct function is called and the correct message is output" without running the real DB init. Swap the internal function with pytest's `monkeypatch` and record the call.

```python
def test_init_db_command(runner, monkeypatch):
    # A stub that only records whether init_db was called
    class Recorder:
        called = False

    def fake_init_db():
        Recorder.called = True

    # Swap init_db for the fake (the real DB initialization does not run)
    monkeypatch.setattr("flaskr.db.init_db", fake_init_db)

    result = runner.invoke(args=["init-db"])

    assert "Initialized" in result.output   # the user-facing message is printed
    assert Recorder.called                  # init_db was actually called
```

The point here is separation of concerns: **test only the command's responsibility (report the result to the user, call the correct function), and test the DB initialization itself separately.** The command's test doesn't need to re-create the real DB every time. Because `monkeypatch` automatically undoes the swap when the test finishes, the patch can't leak between tests.

---

## **8. Production test strategy: fix the happy path and error path of API endpoints**

Now that the parts are in place, let me assemble **the tests you actually write in production.** The subject is a typical API of "receive JSON, validate, save, and return 201." The test's purpose is to fix that **contract** — "valid input is 201, invalid input is an error, the unauthenticated is rejected."

```python
# tests/test_orders.py
def test_create_order_returns_201(auth_client):
    """Happy path: valid input returns 201 and the created resource."""
    response = auth_client.post(
        "/api/orders",
        json={"product_id": 10, "quantity": 3},
    )
    assert response.status_code == 201
    body = response.json
    assert body["quantity"] == 3
    assert "id" in body                       # the server-assigned id is returned
    assert "internal_cost" not in body        # internal fields don't leak into the output


def test_create_order_rejects_invalid_quantity(auth_client):
    """Error path: a negative quantity is rejected with 400/422."""
    response = auth_client.post(
        "/api/orders",
        json={"product_id": 10, "quantity": -1},
    )
    assert response.status_code in (400, 422)
    # Return the field name so the frontend can tell which input to flag
    assert "quantity" in response.json["errors"]


def test_create_order_requires_auth(client):
    """Authorization: an anonymous request is rejected before reaching the handler."""
    response = client.post(
        "/api/orders",
        json={"product_id": 10, "quantity": 3},
    )
    assert response.status_code in (302, 401)


def test_get_missing_order_returns_404(auth_client):
    """A nonexistent resource returns 404."""
    assert auth_client.get("/api/orders/99999").status_code == 404
```

These four tests cover nearly every kind of failure that occurs at an API's boundary: **the happy path, input-validation errors, authorization, and missing resources.** Note too that thanks to the `auth_client` fixture (§4.3), the tests that assume authentication are written declaratively.

> 💡 **Fix the boundary's contract with tests.** This is fully consistent with the philosophy emphasized throughout the [marshmallow × Flask × SQLAlchemy REST API guide](/blog/marshmallow-flask-sqlalchemy-rest-api-production-guide). That guide covers the design side ("protect the entrance with `load()` and the exit with `dump()`, using schema declarations"), and this one covers the test side ("round-trip the boundary with `test_client` and pin its behavior down"). Protecting the boundary **twice, by declaration (schema) and by test (fixing the contract),** is the condition of a production-quality API. How to shape error responses and return 422 is covered in the [error-handling / observability guide](/blog/flask-error-handling-logging-observability-guide).

### 8.1 Coverage and CI: keep tests "always green"

Tests aren't done once written; they become a fortress against regressions only when they **keep running on every commit in CI.**

```bash
# Run with coverage (pytest-cov)
pytest --cov=src/myapp --cov-report=term-missing

# In CI, keep failure output easy to read
pytest -q --maxfail=1
```

Use coverage **as a diagnosis, not a goal.** Blindly chase 100% and you fall into the inverted situation of padding the number with meaningless tests. What matters is "**whether the boundaries (entrance, exit, authorization, errors) are covered**," not the line-coverage number itself. Look at "untested lines" with `--cov-report=term-missing`, and if they're boundaries or error paths, fill them — this is a healthy way to use it.

---

## **9. Choosing the test DB: SQLite, or real PostgreSQL**

Finally, a realistic point on which opinions divide: **which DB engine to run tests against.**

The official tutorial uses **a temporary-file SQLite**, as seen in §2.2. This is fast, needs no additional infrastructure, and is handy in CI too. For small-to-medium-scale apps or apps where SQL stays within a standard range, this has plenty of value.

But if production is **PostgreSQL**, testing with SQLite causes oversights stemming from **engine differences.**

| Aspect | SQLite (temp file / in-memory) | The same PostgreSQL as production |
|---|---|---|
| Speed / handiness | ◎ No additional infrastructure, fast | △ Need to prepare a container, etc. |
| Behavior of types / constraints | △ Types are loose, some constraints don't work | ◎ Matches production |
| Concurrency control / transaction isolation | △ Behavior differs | ◎ Matches production |
| Postgres-specific features (JSONB, arrays, `ON CONFLICT`, partial indexes) | ✗ Unsupported / differs | ◎ Can verify as-is |

The author's conclusion from production experience is to **"choose by layer."**

- For the vast majority of tests that don't depend on the DB engine — validation logic, schema units, view branching — run fast with **SQLite.**
- For tests involving migrations, Postgres-specific features, transaction boundaries, and unique-constraint conflicts, run against **the same PostgreSQL container as production** in CI.

> ⚠️ **The trap of overconfidence in "do it all with SQLite"**: even when every SQLite-backed test is green locally, it's not rare for production PostgreSQL to fail on a unique-constraint conflict or a JSONB query. **Having at least a minimal integration test in CI against the same engine as production** is the insurance that fills this gap. Speed (SQLite) and production parity (Postgres) are a trade-off; rather than going all-in on one, the realistic answer is to allocate tests by their nature. For the behavior of the persistence layer itself, see the [SQLAlchemy 2.0 practical guide](/blog/sqlalchemy-2-typed-orm-production-guide).

A minimal example of using Postgres in CI is as follows (a GitHub Actions service container).

```yaml
# .github/workflows/test.yml (excerpt)
services:
  postgres:
    image: postgres:17
    env:
      POSTGRES_PASSWORD: postgres
      POSTGRES_DB: test  # create the "test" database that the connection URI targets
    ports:
      - 5432:5432
    options: >-
      --health-cmd pg_isready --health-interval 10s
      --health-timeout 5s --health-retries 5
```

```bash
# Point the tests at this DB (keep real secrets out of code)
export FLASK_SQLALCHEMY_DATABASE_URI='postgresql+psycopg://postgres:postgres@localhost:5432/test'
pytest
```

With this, a two-layer setup is complete: fast SQLite runs locally, and rigorous runs in CI against the same Postgres as production.
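
One way to wire the "choose by layer" idea into a test suite is a small helper that picks the database URI from the environment. This is a sketch under stated assumptions: the `TEST_DATABASE_URI` variable name and both helper functions are hypothetical, not a Flask or pytest convention.

```python
import os

# Fast default for engine-independent tests.
SQLITE_URI = "sqlite:///:memory:"


def resolve_test_database_uri(environ=None):
    """Return the URI the test suite should use (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    return environ.get("TEST_DATABASE_URI", SQLITE_URI)


def is_postgres(uri):
    """Tell whether Postgres-specific tests can run against this URI."""
    return uri.startswith("postgresql")


local_uri = resolve_test_database_uri({})
ci_uri = resolve_test_database_uri(
    {"TEST_DATABASE_URI": "postgresql+psycopg://postgres:postgres@localhost:5432/test"}
)
```

Tests that depend on Postgres-specific behavior could then be guarded with something like `pytest.mark.skipif(not is_postgres(resolve_test_database_uri()), reason="needs PostgreSQL")`, so they are skipped locally and run in CI.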

---

## **Summary: testability is the quality of design itself**

Flask testing is easy not by coincidence but as **the result of three designs meshing: the application factory, `test_client`, and pytest fixtures.** This article's key points are as follows.

1. **Always set `TESTING=True`** so exceptions propagate to the test. Forget it and a bug turns into an opaque 500 that is painful to debug.
2. **Place the `app` / `client` / `runner` fixture trio in `conftest.py`.** pytest injects fixtures by argument name, and code after `yield` handles cleanup (disposing of the temp DB, etc.).
3. Verify the request body with **`response.data` (bytes) / `response.json` / `response.text`**, `Location` with the status and header, and if needed follow with `follow_redirects` + `response.history`.
4. For sessions, **read with `with client:`, inject with `session_transaction()`.** The latter, made into a fixture for an authenticated client, becomes a login shortcut.
5. Use **`test_request_context`** for unit-testing a function that reads `request` (`before_request` isn't called / call `preprocess_request` if needed), and **`app_context`** for DB-layer tests.
6. For CLI, with **`test_cli_runner` + `monkeypatch`**, verify the output and "whether the correct processing was called."
7. **Fix the boundary's contract (happy path, validation error, authorization, 404) with tests** and keep it always green in CI. For the test DB, **distinguish by layer** the speed of SQLite and the production match of Postgres.

What divides "a working Flask app" from "a Flask app you can operate for 10 years" is not the number of features but **how firmly the boundary's contract is fixed with tests.** That those tests can be written straightforwardly is itself a dividend of designing with the application factory. Combined with the [Playwright E2E test design guide](/blog/playwright-e2e-testing-production-design-guide), which verifies browser-driven flows that include the frontend, the tests in this article form a test pyramid that protects everything from the server in isolation to the end-to-end user flow. For the overall map of each design topic, go back to the [Flask production operation guide](/blog/flask-production-guide).