# Pydantic v2 Practical Guide: Protect the System Boundary with Types and Pass Only Trustworthy Data

> Following the official Pydantic v2 documentation, this guide covers, from a practical boundary-validation perspective, declarative models with BaseModel/Field, field_validator/model_validator, model_dump, ConfigDict and strict mode, pydantic-settings, and migration from v1.

- Published: 2026-06-24
- Author: 友田 陽大
- Tags: Python, Pydantic, Type Safety, Validation, FastAPI, Data Modeling, Architecture Design
- URL: https://tomodahinata.com/en/blog/pydantic-v2-production-validation-type-safety
- Category: Pydantic & type-safe validation

## Key points

- Declare the shape of correct data with BaseModel and Field, and validate external input at the system boundary with model_validate
- Put business rules in field_validator for a single field, and in model_validator for invariants spanning multiple fields
- For fields where implicit type coercion can cause an incident, such as amounts and quantities, set strict=True per field, choosing convenience or safety case by case
- Consolidate configuration into a typed model with pydantic-settings, and fail fast: a missing required item raises a ValidationError at startup
- Mechanically migrate v1's @validator / .dict() / class Config to the model_-prefixed API, and validate non-model types with TypeAdapter

---

## **Introduction: why you should relearn "Pydantic v2" now**

The design philosophy of a robust backend can be condensed into a single sentence: **"never trust data coming from outside the system boundary."** HTTP request bodies, external API responses, environment variables, and message-queue payloads are all "unvalidated data with no type guarantee." The moment you let them pass straight into the application, they turn into `KeyError`, `AttributeError`, and in the worst case a security hole.

**Pydantic is the gatekeeper standing at this boundary.** Now that FastAPI has become the de facto standard framework, the core of boundary validation in Python has nearly converged on Pydantic. But there's a problem. Many web articles, Stack Overflow answers, and pieces of code produced by generative AI are still written in the **legacy v1 style** (`@validator`, `.dict()`, `class Config`).

**Pydantic v2**, officially released at the end of June 2023, was not a mere version bump. The core validation engine was **rewritten in Rust as `pydantic-core`** and split into a separate package. Along with performance that the official documentation describes as "much faster vs v1," the API was renamed around the `model_*` prefix. Code written from v1 knowledge often still runs, but it relies on deprecated APIs, emits deprecation warnings, and accumulates as technical debt.

This article is not another introductory tutorial. **It stays faithful to the official documentation ([pydantic.dev/docs/validation/latest](https://pydantic.dev/docs/validation/latest/)) while aiming to be one level clearer**, and it uses concrete code to get past the following walls that you will almost certainly hit in practice.

- "I've written with `@validator`, but I don't understand what changed with v2's `@field_validator` / `@model_validator`"
- "When to use `Field()` constraints, `alias`, and `default_factory` is still vague to me"
- "Where should I write **validation spanning multiple fields**, like password confirmation?"
- "`.dict()` now triggers a deprecation warning. How is `model_dump(mode='json')` different?"
- "`strict` mode and type coercion — which should I choose in production?"
- "I've been reading environment variables all over with `os.environ['...']`, but I want to consolidate them type-safely"

The author designed and implemented the backend of a B2B SaaS that won the METI Minister's Award in **Python 3.11 / Flask / SQLAlchemy 2.0 / PostgreSQL 16**, and operated it in production with a strict layer separation of `Router → UseCase → Repository → Model`. That project's boundary validation adopted **Marshmallow 3**, but the discipline itself of "always validate external input at the boundary before passing inward" is completely identical to this article. **In a FastAPI-based stack, Pydantic is exactly what plays that role.** This article organizes the knowledge of that boundary design, together with the backing of the Pydantic v2 official documentation.

> 💡 This article is part of a series on Python backend design. Read the web-framework layer in [FastAPI Production-Operation Guide](/blog/fastapi-production-async-pydantic-observability-guide) and the persistence layer in [SQLAlchemy 2.0 Practical Guide](/blog/sqlalchemy-2-typed-orm-production-guide) together, and you can survey consistent type-safe design from the boundary to the DB.

---

## **1. `BaseModel` and `Field`: declaratively define "the shape of correct data"**

Pydantic's starting point is inheriting `BaseModel`. Just write type annotations on class attributes, and it becomes the **single source of truth for schema, validation, and serialization.**

```python
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = "Jane Doe"  # Has a default value, so the field is optional


# Validate and build from a dict (type coercion applies: the string "42" becomes int 42)
user = User.model_validate({"id": "42"})
print(user.id)    # 42  ← converted to int
print(user.name)  # "Jane Doe"
```

Validation also runs when you call the constructor directly, as in `User(id="42")`, but **for creating objects from external input (dict / JSON), `model_validate()` / `model_validate_json()` is the standard approach.** It makes the boundary's intent, "data becomes a typed object only after validation," explicit in the code.
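
As a minimal sketch of what the boundary looks like when input is bad, the following validates JSON text with `model_validate_json()` and inspects a `ValidationError`. The sample payloads are illustrative assumptions, not data from a real system.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = "Jane Doe"


# Valid JSON text is parsed and validated in a single step
user = User.model_validate_json('{"id": 42, "name": "alice"}')
print(user)  # id=42 name='alice'

# Invalid input raises ValidationError; errors() lists every failing field
try:
    User.model_validate({"id": "not-a-number"})
except ValidationError as exc:
    first_error = exc.errors()[0]
    print(first_error["loc"], first_error["type"])  # ('id',) int_parsing
```

Catching `ValidationError` at the boundary and mapping `errors()` to an error response keeps unvalidated data from ever reaching the inner layers.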

### **Declare constraints, aliases, and defaults with `Field()`**

Type annotations alone can't express **business constraints** like "a positive integer" or "3–30 characters." What handles that is `Field()`.

```python
from typing import Annotated
from pydantic import BaseModel, Field


class Product(BaseModel):
    # Annotated pattern (recommended in v2): keeps type and constraints separate and readable
    name: Annotated[str, Field(min_length=1, max_length=120)]
    price: Annotated[int, Field(gt=0)]                 # positive integers only
    discount_rate: Annotated[float, Field(ge=0, le=1)] # 0.0 to 1.0
    sku: Annotated[str, Field(pattern=r"^[A-Z]{3}-\d{4}$")]

    # Tags get a fresh empty list each time (avoids the mutable-default trap)
    tags: list[str] = Field(default_factory=list)
```

The main constraint parameters are, per the official documentation, as follows.

| Parameter | Meaning | Applicable type |
| --- | --- | --- |
| `gt` / `ge` / `lt` / `le` | Greater than / at least / less than / at most | Numbers |
| `min_length` / `max_length` | Min/max length | Strings, collections |
| `pattern` | Regex match | Strings |
| `default` | A static default value | All |
| `default_factory` | A callable that generates the default | All |

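> ⚠️ **The mutable-default trap**: in a plain Python class or function signature, a default like `tags: list[str] = []` is the classic bug in which every instance shares the same list object. Pydantic avoids this by copying a mutable default for each instance, so the model itself is safe, but **prefer `default_factory=list` / `default_factory=dict`** for list and dict defaults: the intent is explicit, and the habit carries over to code outside Pydantic.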
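
A small sketch of the per-instance behavior of `default_factory`. The `Cart` model is a hypothetical name used only for illustration.

```python
from pydantic import BaseModel, Field


class Cart(BaseModel):
    # default_factory is called once per instance, so each cart owns its list
    items: list[str] = Field(default_factory=list)


first = Cart()
second = Cart()
first.items.append("apple")

print(first.items)   # ['apple']
print(second.items)  # []
```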

### **Separate "external naming" and "internal naming" with `alias`**

The external API is `camelCase`, and you want to unify the internal code in `snake_case` — a common requirement. `Field(alias=...)` handles this translation.

```python
from pydantic import BaseModel, ConfigDict, Field


class ApiPayload(BaseModel):
    # The input JSON uses "userName", but internally we want user_name
    model_config = ConfigDict(populate_by_name=True)

    user_name: str = Field(alias="userName")
    is_active: bool = Field(alias="isActive")


# Validate using the external camelCase keys
payload = ApiPayload.model_validate({"userName": "alice", "isActive": True})
print(payload.user_name)  # "alice"  ← snake_case internally
```

`alias` is used for validation and, when you serialize with `by_alias=True`, for output as well. To use different names at validation time and serialization time, specify `validation_alias` / `serialization_alias` individually. With `populate_by_name=True`, you can **supply the value with either the alias or the field name**, which helps backward compatibility during a migration period.
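
The split between input and output names can be sketched as follows. The model name and the alias strings (`userName`, `displayName`) are assumptions chosen for illustration.

```python
from pydantic import BaseModel, Field


class LegacyUser(BaseModel):
    # Accept "userName" on input, emit "displayName" on output (illustrative names)
    user_name: str = Field(
        validation_alias="userName",
        serialization_alias="displayName",
    )


user = LegacyUser.model_validate({"userName": "alice"})

print(user.user_name)                   # alice
print(user.model_dump())                # {'user_name': 'alice'}
print(user.model_dump(by_alias=True))   # {'displayName': 'alice'}
```

Note that the serialization alias only appears when `by_alias=True` is passed; without it, the internal field name is used.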

**Why is this superior?**
`alias` **confines the translation layer to the boundary**, so the external schema's naming convention doesn't erode the quality of the application's internal code. Even if the external API suddenly renames `userName` to `userId`, the only place to fix is the one `Field(alias=...)` line. This is the "ETC (Easy To Change)" principle in practice: the impact of a change is localized to the boundary.

---

## **2. Validators: verify business rules that can't be expressed with types**

`Field()` constraints cover only "static rules on a single field." For **dynamic rules**, such as "normalize the email address," and for **validation spanning multiple fields**, such as "the password and confirmation password match," use validator decorators.

### **`@field_validator`: validate/transform a single field**

`@field_validator` receives a specific field's value and returns the validated or transformed value. In v2, combining it with `@classmethod` is canonical.

```python
from pydantic import BaseModel, field_validator


class SignupForm(BaseModel):
    email: str
    age: int

    @field_validator("email", mode="after")
    @classmethod
    def normalize_email(cls, value: str) -> str:
        # mode="after": runs after Pydantic's internal validation, so value is guaranteed to be a str
        return value.strip().lower()

    @field_validator("age", mode="after")
    @classmethod
    def must_be_adult(cls, value: int) -> int:
        if value < 18:
            raise ValueError("Must be at least 18 years old")
        return value
```

Choosing the right `mode` is the key. The table below follows the official definitions.

| `mode` | Execution timing | Value received | Main use |
| --- | --- | --- | --- |
| `"after"` (default) | **After** Pydantic's internal validation | A type-guaranteed value | Type-safe validation/normalization (the first choice) |
| `"before"` | **Before** internal validation/type coercion | The raw input (`Any`) | Pre-shaping the input form (e.g., wrapping a single value in a list) |
| `"wrap"` | Control before/after validation yourself | `Any` + `handler` | The most flexible, for exception catching, fallback, etc. |

`mode="before"` is useful for pre-processing that normalizes the assorted shapes arriving from a DB or a form into the expected shape.

```python
from typing import Any
from pydantic import BaseModel, field_validator


class Article(BaseModel):
    tags: list[str]

    @field_validator("tags", mode="before")
    @classmethod
    def ensure_list(cls, value: Any) -> Any:
        # Also accept a single string like "python,rust" as a list
        if isinstance(value, str):
            return [t.strip() for t in value.split(",")]
        return value
```

> 💡 **Prefer `mode="after"`**: the official documentation describes after validators as "generally more type-safe." Because a before validator receives its input as `Any` with no type guarantee, limit it to the pre-processing you actually need and place most of the validation logic in after validators.
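
The table lists `mode="wrap"` without an example, so here is a minimal sketch of the fallback use case: run the inner validation through `handler`, and recover from one specific error. The `Profile` model and the truncate-to-5 rule are assumptions made for illustration.

```python
from typing import Annotated, Any

from pydantic import (
    BaseModel,
    Field,
    ValidationError,
    ValidatorFunctionWrapHandler,
    field_validator,
)


class Profile(BaseModel):
    nickname: Annotated[str, Field(max_length=5)]

    @field_validator("nickname", mode="wrap")
    @classmethod
    def truncate_if_too_long(
        cls, value: Any, handler: ValidatorFunctionWrapHandler
    ) -> str:
        try:
            # handler() runs Pydantic's own validation for this field
            return handler(value)
        except ValidationError as exc:
            # Fall back only for the "too long" case; re-raise everything else
            if exc.errors()[0]["type"] == "string_too_long":
                return handler(value[:5])
            raise


print(Profile(nickname="bob").nickname)        # bob
print(Profile(nickname="alexander").nickname)  # alexa
```

Because a wrap validator can swallow errors, it is worth keeping the recovered case as narrow as possible, as the explicit error-type check does here.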

### **`@model_validator`: validation spanning multiple fields**

To validate **a relationship between fields**, like "the password and confirmation password match," use `@model_validator`. With `mode="after"`, define it as an instance method and return the validated `self`.

```python
from typing import Self
from pydantic import BaseModel, model_validator


class PasswordChange(BaseModel):
    password: str
    password_repeat: str

    @model_validator(mode="after")
    def check_passwords_match(self) -> Self:
        # At this point password / password_repeat have already passed type validation
        if self.password != self.password_repeat:
            raise ValueError("Passwords do not match")
        return self
```

On the other hand, `mode="before"` receives the whole raw input (dict) **before** the model is instantiated. It's suited to an input guard like "forbid the existence of a specific key."

```python
from typing import Any
from pydantic import BaseModel, model_validator


class Account(BaseModel):
    username: str

    @model_validator(mode="before")
    @classmethod
    def forbid_raw_card_number(cls, data: Any) -> Any:
        # Reject immediately if a raw credit card number has slipped in
        if isinstance(data, dict) and "card_number" in data:
            raise ValueError("card_number must not be included directly")
        return data
```

**Why is this superior?**
If cross-field validation is scattered as hand-written `if` statements in the router or service layer, validation logic mixes into business logic and breaks SRP (the single responsibility principle). When it is consolidated in `@model_validator`, **"the invariant this model represents" is complete within the model definition.** The guarantee that a validated `PasswordChange` instance means the passwords match is upheld at the code level, and all downstream code can rely on it.

---

## **3. Serialization: safely return a typed object to "the outward shape"**

Once the data has been validated and turned into an object, you next need to turn it back into response JSON or a form suitable for DB storage. In Pydantic v2, `model_dump()` / `model_dump_json()` handle this (v1's `.dict()` / `.json()` are deprecated: they still run but emit a deprecation warning).

```python
from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    password: str


user = User(id=1, name="alice", password="secret")

# dict of Python objects (values such as tuple stay as Python types)
user.model_dump()          # {'id': 1, 'name': 'alice', 'password': 'secret'}

# JSON string (values such as datetime are converted to JSON-compatible forms like ISO strings)
user.model_dump_json()     # '{"id":1,"name":"alice","password":"secret"}'
```

The difference between `model_dump(mode='json')` and `mode='python'` (the default) appears frequently in practice. **`mode='python'` keeps `tuple` and `datetime` as Python types**, while **`mode='json'` converts to JSON-compatible types (lists, ISO strings, etc.).** You can organize it by thinking of `model_dump_json()` as the latter directly turned into a JSON string.
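
The `User` model above has no field where the two modes differ, so the following sketch uses a hypothetical `Meeting` model with a `datetime` and a `tuple` to make the difference visible.

```python
from datetime import datetime

from pydantic import BaseModel


class Meeting(BaseModel):
    starts_at: datetime
    room: tuple[int, int]


meeting = Meeting(starts_at=datetime(2024, 1, 2, 9, 30), room=(3, 14))

python_dump = meeting.model_dump()           # Python types are kept
json_dump = meeting.model_dump(mode="json")  # JSON-compatible types only

print(python_dump)  # {'starts_at': datetime.datetime(2024, 1, 2, 9, 30), 'room': (3, 14)}
print(json_dump)    # {'starts_at': '2024-01-02T09:30:00', 'room': [3, 14]}
print(meeting.model_dump_json())  # {"starts_at":"2024-01-02T09:30:00","room":[3,14]}
```

`mode="json"` is the one to reach for when the dict will be handed to something that only understands JSON types, such as `json.dumps` or a document store.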

The main control parameters are as follows.

| Parameter | Effect | Typical use |
| --- | --- | --- |
| `exclude={'password'}` | Exclude the specified field | Remove confidential info from the response |
| `include={'id', 'name'}` | Output only the specified fields | Partial exposure |
| `by_alias=True` | Output by alias, not the field name | Return to an external camelCase API |
| `exclude_none=True` | Exclude fields whose value is `None` | A sparse response |
| `exclude_unset=True` | Exclude fields not explicitly passed | PATCH diff updates |

> ⚠️ **Preventing confidential-info leakage**: accidentally including a password hash or token in a response is a typical incident. Writing `model_dump(exclude={"password"})` at every call site invites omissions, so a more robust design is to define a separate outward-only response model that never has the field, or to control the output with `field_serializer`, described below.

### **Transform the output with `field_serializer`**

To customize the output format of a specific field, use `@field_serializer`.

```python
from datetime import datetime
from pydantic import BaseModel, field_serializer


class Event(BaseModel):
    name: str
    starts_at: datetime

    @field_serializer("starts_at")
    def serialize_starts_at(self, value: datetime) -> str:
        # Return Unix epoch seconds (as a string) to match the front end's display convention
        return str(int(value.timestamp()))
```

### **Include a derived value in serialization with `computed_field`**

When you want to include "a value computed from other fields" in the output, stack `@computed_field` with `@property`.

```python
from pydantic import BaseModel, computed_field


class Box(BaseModel):
    width: float
    height: float
    depth: float

    @computed_field
    @property
    def volume(self) -> float:
        return self.width * self.height * self.depth


box = Box(width=2, height=3, depth=4)
box.model_dump()  # {'width': 2.0, 'height': 3.0, 'depth': 4.0, 'volume': 24.0}
```

A value declared with `computed_field` is included in `model_dump()` output and in the serialization-mode JSON Schema (marked `readOnly: true`). As the official documentation states explicitly, **Pydantic applies no additional validation logic to a computed_field**; it is purely a mechanism for "outputting a derived value."

**Why is this superior?**
If callers compute `volume` themselves each time, the same formula is scattered across multiple places, which violates DRY. `computed_field` confines that knowledge to **a single place, the model,** and exposes it consistently in the serialized result. The data and its derivation logic stay together, so there is only one place to change.

---

## **4. `strict` mode and type coercion: the trade-off between safety and convenience**

By default, Pydantic performs **type coercion (lax mode).** It converts the string `"123"` to the `int` `123`, and `"true"` to the `bool` `True`, which is where its convenience comes from. But there are situations where this "cleverness" backfires.

```python
from pydantic import BaseModel


class Order(BaseModel):
    quantity: int


# lax (default): the string is silently converted to int
Order.model_validate({"quantity": "5"})   # quantity=5  ← accepted
```

For a field where **type strictness directly connects to business risk**, like a payment amount or inventory count, this implicit conversion is a hotbed for bugs. `strict` mode disables type coercion and **requires an exact type match.**

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Order(BaseModel):
    quantity: int


# (1) Strict for a single call
try:
    Order.model_validate({"quantity": "5"}, strict=True)
except ValidationError as exc:
    print(exc)  # a str is not accepted as an int in strict mode


# (2) Strict for a single field
class StrictOrder(BaseModel):
    quantity: int = Field(strict=True)
    note: str  # stays lax


# (3) Strict for the whole model
class FullyStrictOrder(BaseModel):
    model_config = ConfigDict(strict=True)
    quantity: int
    amount: int
```

The behavior of strict mode is summarized below.

| Input | lax (default) | strict |
| --- | --- | --- |
| `{"quantity": "5"}` (string) | Converted to `5` | `ValidationError` |
| `{"quantity": 5}` (integer) | `5` | `5` |
| `{"is_active": "true"}` | Converted to `True` | `ValidationError` |

> 💡 **Where to use strict**: for input from humans or loose clients at the API's outermost edge, it is realistic to stay lax and let Pydantic handle conversions such as string to `int`. On the other hand, **for service-internal domain models and for fields where implicit conversion can cause an incident, such as amounts and quantities, set `strict=True` per field.** This split is the practical compromise between convenience and safety. Note that when validating JSON (`model_validate_json`), strict mode still accepts conversions for types JSON cannot represent natively, such as a date-time string becoming a `datetime`.
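
The JSON-mode exception is easy to miss, so here is a minimal sketch of it. The `Shipment` model and its payloads are hypothetical; the point is only to contrast Python input with JSON input under `strict=True`.

```python
from datetime import datetime

from pydantic import BaseModel, ConfigDict, ValidationError


class Shipment(BaseModel):
    model_config = ConfigDict(strict=True)
    shipped_at: datetime
    quantity: int


# JSON input: an ISO string is accepted for datetime even in strict mode
from_json = Shipment.model_validate_json(
    '{"shipped_at": "2024-01-02T09:30:00", "quantity": 5}'
)

# Python input: the same string is rejected, because a datetime instance is required
try:
    Shipment.model_validate({"shipped_at": "2024-01-02T09:30:00", "quantity": 5})
    python_string_rejected = False
except ValidationError:
    python_string_rejected = True

# JSON has a native number type, so a quoted number is still rejected for int
try:
    Shipment.model_validate_json(
        '{"shipped_at": "2024-01-02T09:30:00", "quantity": "5"}'
    )
    quoted_number_rejected = False
except ValidationError:
    quoted_number_rejected = True

print(from_json.shipped_at, python_string_rejected, quoted_number_rejected)
```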

---

## **5. Configuration management: realize 12-factor type-safely with `pydantic-settings`**

Code that reads environment variables each time with `os.environ["DATABASE_URL"]` carries a triple burden of **the type fixed to `str`, no existence guarantee, and default values scattered.** `pydantic-settings` consolidates configuration into **a single typed model** and auto-loads it from environment variables.

> ⚠️ **Note it's a separate package**: in v2, `BaseSettings` was separated from the main body into a separate package. `pip install pydantic-settings` is needed, and the import is `from pydantic_settings import ...` (`from pydantic import BaseSettings` is the v1 way of writing and doesn't work in v2).

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Read .env and map fields to environment variables with the APP_ prefix
    model_config = SettingsConfigDict(
        env_file=".env",
        env_prefix="APP_",
        case_sensitive=False,
    )

    database_url: str                              # APP_DATABASE_URL (required)
    debug: bool = False                            # APP_DEBUG (coerced: "1" → True)
    allowed_hosts: list[str] = Field(default_factory=list)  # parsed as JSON
    max_connections: int = Field(default=10, gt=0)


# Create once at application startup; a missing required item fails immediately here
settings = Settings()
```

The key points, per the official documentation, are as follows.

- **A missing required field raises a `ValidationError` at startup**, so a configuration mistake is caught **as soon as the process starts** rather than being noticed for the first time mid-request in production.
- `debug: bool` converts an environment-variable string like `"1"` / `"true"` to `bool` by type coercion.
- Complex types like `list` / `dict` **parse the environment variable as JSON** (`APP_ALLOWED_HOSTS='["a.com","b.com"]'`).
- Prevent name collisions with `env_prefix`, and with `env_nested_delimiter` (e.g., `__`) you can express nested configuration in `FOO__BAR` form.

**Why is this superior?**
By consolidating configuration into a typed model, the application can only start in a state where "the configuration is complete" (Fail Fast). `settings.max_connections` is statically known to be `int`, and a typo like `settings.databse_url` is detected by the type checker. This is the standard way to realize the 12-factor App principle "store config in the environment" with **both type safety and no hardcoded secrets.** Secrets are not written in code but flow into this model through environment variables.

---

## **6. `v1` → `v2` migration: a quick-reference of the changes**

When you encounter an existing v1 codebase, or the v1-style code that generative AI tends to output, you can replace it mechanically using the following correspondence table, which lists the main renames from the official migration guide.

| v1 (old) | v2 (new) | Category |
| --- | --- | --- |
| `@validator` | `@field_validator` | Single-field validation |
| `@root_validator` | `@model_validator` | Whole-model / cross-field validation |
| `.dict()` | `.model_dump()` | Serialize to dict |
| `.json()` | `.model_dump_json()` | Serialize to JSON string |
| `.copy()` | `.model_copy()` | Duplicate an instance |
| `.construct()` | `.model_construct()` | Generation without validation |
| `.parse_obj()` | `.model_validate()` | Validate-generate from dict / object |
| `.parse_raw()` | `.model_validate_json()` | Validate-generate from JSON string |
| `class Config:` | `model_config = ConfigDict(...)` | Model configuration |
| `.update_forward_refs()` | `.model_rebuild()` | Resolve forward references |
| `__fields__` | `model_fields` | Reference field metadata |
| `from pydantic import BaseSettings` | `from pydantic_settings import BaseSettings` | Settings (separated into a package) |
| `.from_orm(obj)` | `.model_validate(obj, ...)` (`from_attributes=True`) | Generation from an ORM object |

> 💡 **The crux of migration**: v2's methods consistently carry the **`model_` prefix.** This is a design decision to **avoid name collisions with user-defined fields**, for example when a `User` model wants a field or business method named `dict`. When converting `@validator` → `@field_validator`, don't forget to add `@classmethod` and to state `mode=` explicitly. For bulk conversion, you can also use the official migration tool (`bump-pydantic`).
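
To make the table concrete, here is a small before/after sketch. The v1 version appears only as comments for comparison; `Member` and `MemberRow` are hypothetical names, and `MemberRow` merely stands in for an ORM object.

```python
from pydantic import BaseModel, ConfigDict, field_validator

# v1 style, shown only as comments for comparison:
#
#   class Member(BaseModel):
#       name: str
#
#       class Config:
#           orm_mode = True
#
#       @validator("name")
#       def strip_name(cls, v):
#           return v.strip()


class Member(BaseModel):
    model_config = ConfigDict(from_attributes=True)  # v1: orm_mode = True

    name: str

    @field_validator("name", mode="after")  # v1: @validator("name")
    @classmethod
    def strip_name(cls, value: str) -> str:
        return value.strip()


class MemberRow:
    """Stand-in for an ORM object (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name


member = Member.model_validate(MemberRow("  alice "))  # v1: Member.from_orm(...)
print(member.model_dump())  # v1: member.dict()  → {'name': 'alice'}
```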

### **`TypeAdapter` that validates non-model types**

`TypeAdapter`, new in v2, is a convenient mechanism that takes over the role of v1 helpers such as `parse_obj_as`. It can validate and serialize, on the spot, a type like `list[int]` or `dict` that doesn't warrant defining a `BaseModel`.

```python
from pydantic import TypeAdapter

# Validate list[int] without a BaseModel
adapter = TypeAdapter(list[int])
adapter.validate_python(["1", "2", "3"])  # [1, 2, 3]  ← each element is coerced
adapter.validate_json("[1, 2, 3]")        # [1, 2, 3]
adapter.dump_json([1, 2, 3])              # b'[1,2,3]'  ← note that it returns bytes
```

In cases like an external API returning "an array of user objects" at the top level, where the root element is not a model, `TypeAdapter(list[User])` shows its power.
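
A minimal sketch of that top-level-array case follows. The JSON payload is an illustrative assumption rather than a real API response.

```python
from pydantic import BaseModel, TypeAdapter


class User(BaseModel):
    id: int
    name: str


# Build the adapter once (for example at module level) and reuse it
users_adapter = TypeAdapter(list[User])

raw = '[{"id": 1, "name": "alice"}, {"id": "2", "name": "bob"}]'
users = users_adapter.validate_json(raw)

print([u.id for u in users])             # [1, 2]
print(users_adapter.dump_python(users))  # [{'id': 1, 'name': 'alice'}, {'id': 2, 'name': 'bob'}]
```

Each element goes through the same validation as `User.model_validate`, so one malformed item makes the whole list fail with a `ValidationError` that points at its index.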

---

## **Conclusion: make boundary validation "part of the type system"**

Pydantic v2 is a modern validation library with the Rust-made `pydantic-core` at its core, deeply integrated with type annotations. To restate this article's key points:

1. With **`BaseModel` + `Field()`**, declaratively define "the shape of correct data," and boundary-validate with `model_validate()`.
2. Consolidate business rules into **`@field_validator` (single) / `@model_validator` (cross-field)** and guarantee the model's invariants.
3. Safely control the outward shape with **`model_dump()` / `model_dump_json()` / `field_serializer` / `computed_field`.**
4. Apply **`strict` mode** where incidents cannot be tolerated, such as amounts and quantities, choosing convenience or safety field by field.
5. With **`pydantic-settings`**, consolidate configuration type-safely, achieving both Fail Fast and not hardcoding secrets.
6. With the **`v1 → v2` quick-reference**, mechanically migrate to the `model_`-prefix API, and validate non-model types too with `TypeAdapter`.

The difference between "code that works" and "code you can operate for 10 years" lies in the accumulation of boundary design of **where and how you dam up untrustworthy data.** Pydantic is the best tool to declaratively express that boundary as part of the type system.

For further exploration, I recommend re-reading the following from the official documentation, with this article's design viewpoint in mind.

- [Models](https://pydantic.dev/docs/validation/latest/concepts/models/)
- [Fields](https://pydantic.dev/docs/validation/latest/concepts/fields/)
- [Validators](https://pydantic.dev/docs/validation/latest/concepts/validators/)
- [Serialization](https://pydantic.dev/docs/validation/latest/concepts/serialization/)
- [Configuration](https://pydantic.dev/docs/validation/latest/concepts/config/)
- [Strict Mode](https://pydantic.dev/docs/validation/latest/concepts/strict_mode/)
- [Settings Management](https://pydantic.dev/docs/validation/latest/concepts/pydantic_settings/)
- [Migration Guide](https://pydantic.dev/docs/validation/latest/get-started/migration/)

---

### **Consultation on type-safe backend design**

The author has implemented and operated the discipline explained here of "always validate external input at the system boundary" in the production environment of a B2B SaaS that won the METI Minister's Award (as boundary validation with Marshmallow 3). In a FastAPI-based stack, Pydantic v2 plays that role. I build, fast and at high quality leveraging generative AI, foundations directly tied to business reliability — type-safe input validation, configuration management, API schema design, and boundary defense of external integration. Feel free to consult us about backend development with Python and making existing systems type-safe.
