Pydantic v2 Practical Guide: Protect the System Boundary with Types and Pass Only Trustworthy Data

Introduction: why you should relearn "Pydantic v2" now

The design philosophy of a robust backend can be condensed into a single sentence — "never trust data coming from outside the system boundary." HTTP request bodies, external API responses, environment variables, message-queue payloads. These are all "unvalidated data with no type guarantee," and the moment you let them pass straight inside the application, they morph into KeyError, AttributeError, and in the worst case a security hole.

Pydantic is the gatekeeper standing at this boundary. Now that FastAPI has become the de facto standard framework, boundary validation in Python has largely converged on Pydantic. But there is a problem: many web articles, Stack Overflow answers, and code that generative AI outputs are still written in the legacy v1 style (@validator, .dict(), class Config).

Pydantic v2, officially released at the end of June 2023, was not a mere version bump. The core validation engine was rewritten in Rust as pydantic-core and split into a separate package. Along with a substantial performance gain over v1, according to the official documentation, the API was renamed around the model_* prefix. Code written with v1 knowledge often still runs, but it emits deprecation warnings and becomes obsolete style and technical debt.

This article is not a rehash of an introductory tutorial. It stays faithful to the official documentation (pydantic.dev/docs/validation/latest) while aiming to be one level clearer, and it uses concrete code to break through the following walls you are likely to hit in practice.

"I've written with @validator, but I don't understand what changed with v2's @field_validator / @model_validator"
"The use distinction of Field()'s constraints, alias, and default_factory is vague"
"Where should I write validation spanning multiple fields, like password confirmation?"
".dict() doesn't work. What's the difference from model_dump(mode='json')?"
"strict mode and type coercion — which should I choose in production?"
"I've been reading environment variables all over with os.environ['...'], but I want to consolidate them type-safely"

The author designed and implemented the backend of a B2B SaaS that won the METI Minister's Award in Python 3.11 / Flask / SQLAlchemy 2.0 / PostgreSQL 16, and operated it in production with a strict layer separation of Router → UseCase → Repository → Model. That project's boundary validation adopted Marshmallow 3, but the discipline itself of "always validate external input at the boundary before passing inward" is completely identical to this article. In a FastAPI-based stack, Pydantic is exactly what plays that role. This article organizes the knowledge of that boundary design, together with the backing of the Pydantic v2 official documentation.

💡 This article is part of a series on Python backend design. Read the web-framework layer in FastAPI Production-Operation Guide and the persistence layer in SQLAlchemy 2.0 Practical Guide together, and you can survey consistent type-safe design from the boundary to the DB.

1. `BaseModel` and `Field`: declaratively define "the shape of correct data"

Pydantic's starting point is inheriting BaseModel. Just write type annotations on class attributes, and it becomes the single source of truth for schema, validation, and serialization.

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = "Jane Doe"  # Has a default value, so the field is optional


# Validate and build from a dict (type coercion applies: the string "42" becomes int 42)
user = User.model_validate({"id": "42"})
print(user.id)    # 42  ← converted to int
print(user.name)  # "Jane Doe"

Validation also runs when you call the constructor directly, as in User(id="42"), but for building objects from external input (dict / JSON), model_validate() / model_validate_json() is the standard approach. It makes the boundary's intent explicit in the code: data becomes a typed object only after validation.

H3: declare constraints, aliases, and defaults with `Field()`

Type annotations alone can't express business constraints like "a positive integer" or "3–30 characters." What handles that is Field().

from typing import Annotated
from pydantic import BaseModel, Field


class Product(BaseModel):
    # Annotated pattern (recommended in v2): keeps type and constraints separate and readable
    name: Annotated[str, Field(min_length=1, max_length=120)]
    price: Annotated[int, Field(gt=0)]                 # positive integers only
    discount_rate: Annotated[float, Field(ge=0, le=1)] # 0.0 to 1.0
    sku: Annotated[str, Field(pattern=r"^[A-Z]{3}-\d{4}$")]

    # Tags get a fresh empty list each time (avoids the mutable-default trap)
    tags: list[str] = Field(default_factory=list)

The main constraint parameters are, per the official documentation, as follows.

Parameter	Meaning	Applicable type
`gt` / `ge` / `lt` / `le`	Greater than / at least / less than / at most	Numbers
`min_length` / `max_length`	Min/max length	Strings, collections
`pattern`	Regex match	Strings
`default`	A static default value	All
`default_factory`	A callable that generates the default	All
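
To see these constraints in action, the following sketch feeds a deliberately invalid payload to a Product-like model and collects the error type reported for each field. The bad_input values are made up for illustration.

```python
from typing import Annotated

from pydantic import BaseModel, Field, ValidationError


class Product(BaseModel):
    name: Annotated[str, Field(min_length=1, max_length=120)]
    price: Annotated[int, Field(gt=0)]
    discount_rate: Annotated[float, Field(ge=0, le=1)]
    sku: Annotated[str, Field(pattern=r"^[A-Z]{3}-\d{4}$")]


# Every field violates its constraint on purpose.
bad_input = {"name": "", "price": 0, "discount_rate": 1.5, "sku": "abc-1"}

try:
    Product.model_validate(bad_input)
except ValidationError as exc:
    # Pydantic reports all violations together instead of stopping at the first one.
    failed_fields = {error["loc"][0]: error["type"] for error in exc.errors()}
else:
    failed_fields = {}

print(failed_fields)
```

Because all violations arrive in one ValidationError, an API can return the complete list of problems to the client in a single response.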

⚠️ The mutable-default trap: in a plain Python class or function signature, a default such as tags: list[str] = [] is the classic bug in which every instance shares the same list object. Pydantic guards against this by copying such a default for each instance, but using default_factory=list / default_factory=dict for a list or dict default keeps the intent explicit and does not rely on that behavior.
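
A quick way to convince yourself that default_factory hands each instance its own list is the check below. The Cart model is a hypothetical example.

```python
from pydantic import BaseModel, Field


class Cart(BaseModel):
    # default_factory is called once per instance, so no list is shared.
    items: list[str] = Field(default_factory=list)


first = Cart()
second = Cart()
first.items.append("apple")

print(first.items)   # only the first cart was modified
print(second.items)  # the second cart still has its own empty list
```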

H3: separate "external naming" and "internal naming" with `alias`

The external API is camelCase, and you want to unify the internal code in snake_case — a common requirement. Field(alias=...) handles this translation.

from pydantic import BaseModel, ConfigDict, Field


class ApiPayload(BaseModel):
    # The input JSON uses "userName", but internally we want to handle it as user_name
    model_config = ConfigDict(populate_by_name=True)

    user_name: str = Field(alias="userName")
    is_active: bool = Field(alias="isActive")


# Validate using the external camelCase keys
payload = ApiPayload.model_validate({"userName": "alice", "isActive": True})
print(payload.user_name)  # "alice"  ← snake_case internally

alias applies to validation, and to serialization when you pass by_alias=True (model_dump() outputs field names by default). If you want different names at validation time and serialization time, specify validation_alias / serialization_alias individually. With populate_by_name=True, the value can be supplied by either the alias or the field name, which helps backward compatibility during a migration period.
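
The sketch below exercises these alias options side by side. LegacyPayload and its output key "user" are hypothetical, chosen only to show validation_alias and serialization_alias differing.

```python
from pydantic import BaseModel, ConfigDict, Field


class ApiPayload(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    user_name: str = Field(alias="userName")


class LegacyPayload(BaseModel):
    # Hypothetical case: read "userName" on input, emit "user" on output.
    user_name: str = Field(validation_alias="userName", serialization_alias="user")


payload = ApiPayload.model_validate({"userName": "alice"})
internal = payload.model_dump()               # keyed by field name
external = payload.model_dump(by_alias=True)  # keyed by alias

# populate_by_name=True also allows the snake_case field name as input.
by_field_name = ApiPayload(user_name="bob")

legacy = LegacyPayload.model_validate({"userName": "carol"})
legacy_out = legacy.model_dump(by_alias=True)
```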

Why is this superior? alias confines the translation layer to the boundary, so the external schema's naming convention does not erode the internal code. Even if the external API suddenly renames userName to userId, the only place to fix is the single Field(alias=...) line. This is the "ETC (Easy To Change)" principle in practice: the impact of a change is localized to the boundary.

2. Validators: verify business rules that can't be expressed with types

Field()'s constraints cover only static rules on a single field. For dynamic processing such as normalizing an email address, or for checks spanning multiple fields such as "the password and confirmation password match," use validator decorators.

H3: `@field_validator`: validate/transform a single field

@field_validator receives a specific field's value and returns the validated or transformed value. In v2, combining it with @classmethod is canonical.

from pydantic import BaseModel, field_validator


class SignupForm(BaseModel):
    email: str
    age: int

    @field_validator("email", mode="after")
    @classmethod
    def normalize_email(cls, value: str) -> str:
        # mode="after": runs after Pydantic's internal validation, so value is already guaranteed to be a str
        return value.strip().lower()

    @field_validator("age", mode="after")
    @classmethod
    def must_be_adult(cls, value: int) -> int:
        if value < 18:
            raise ValueError("must be 18 or older")
        return value

The use distinction of mode is the key. Let me organize it faithfully to the official definition.

`mode`	Execution timing	Value received	Main use
`"after"` (default)	After Pydantic's internal validation	A type-guaranteed value	Type-safe validation/normalization (the first choice)
`"before"`	Before internal validation/type coercion	The raw input (`Any`)	Pre-shaping the input form (e.g., wrapping a single value in a list)
`"wrap"`	Control before/after validation yourself	`Any` + `handler`	The most flexible, for exception catching, fallback, etc.

mode="before" is effective for pre-processing that shapes "the miscellaneous forms coming from a DB or form" into the proper form.

from typing import Any
from pydantic import BaseModel, field_validator


class Article(BaseModel):
    tags: list[str]

    @field_validator("tags", mode="before")
    @classmethod
    def ensure_list(cls, value: Any) -> Any:
        # Also accept a single string such as "python,rust" as a list
        if isinstance(value, str):
            return [t.strip() for t in value.split(",")]
        return value

💡 Prefer mode="after": the official documentation describes after validators as generally more type-safe. Because a before validator receives its input as Any with no type guarantee, limit it to the pre-processing you actually need and keep most of the validation logic in after validators.
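
The "wrap" mode from the table has no example above, so here is a minimal sketch. The RetryPolicy model and the fallback value 3 are assumptions for illustration; the handler argument is what runs Pydantic's own validation for the field.

```python
from typing import Any

from pydantic import (
    BaseModel,
    ValidationError,
    ValidatorFunctionWrapHandler,
    field_validator,
)


class RetryPolicy(BaseModel):
    max_retries: int

    @field_validator("max_retries", mode="wrap")
    @classmethod
    def fall_back_to_default(
        cls, value: Any, handler: ValidatorFunctionWrapHandler
    ) -> int:
        try:
            # handler() runs the normal int validation, including lax coercion.
            return handler(value)
        except ValidationError:
            # Assumed policy for this sketch: replace unparsable input with 3.
            return 3


coerced = RetryPolicy(max_retries="5")      # "5" passes through the inner validation
fallback = RetryPolicy(max_retries="many")  # inner validation fails, fallback applies
```

Silently substituting a default hides bad input, so a fallback like this fits best where a wrong value is harmless, such as tuning knobs.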

H3: `@model_validator`: validation spanning multiple fields

To validate a relationship between fields, like "the password and confirmation password match," use @model_validator. With mode="after", define it as an instance method and return the validated self.

from typing import Self
from pydantic import BaseModel, model_validator


class PasswordChange(BaseModel):
    password: str
    password_repeat: str

    @model_validator(mode="after")
    def check_passwords_match(self) -> Self:
        # At this point password / password_repeat have already passed type validation
        if self.password != self.password_repeat:
            raise ValueError("passwords do not match")
        return self

On the other hand, mode="before" receives the whole raw input (dict) before the model is instantiated. It's suited to an input guard like "forbid the existence of a specific key."

from typing import Any
from pydantic import BaseModel, model_validator


class Account(BaseModel):
    username: str

    @model_validator(mode="before")
    @classmethod
    def forbid_raw_card_number(cls, data: Any) -> Any:
        # Reject immediately if a raw credit card number has slipped into the input
        if isinstance(data, dict) and "card_number" in data:
            raise ValueError("card_number must not be included directly")
        return data

Why is this superior? Scatter cross-field validation as hand-written ifs in the router or service layer, and the validation logic mixes into the business logic, breaking SRP (single responsibility). Consolidate it into @model_validator, and "the invariant this model represents" completes within the model definition. The guarantee that the existence of a PasswordChange instance = the passwords match is upheld at the code level, and all downstream code can trust that premise.
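
To illustrate how that invariant surfaces to callers, the sketch below triggers the cross-field check and inspects the resulting error. The model mirrors the PasswordChange example, with the return type written as a string annotation so the snippet does not depend on typing.Self.

```python
from pydantic import BaseModel, ValidationError, model_validator


class PasswordChange(BaseModel):
    password: str
    password_repeat: str

    @model_validator(mode="after")
    def check_passwords_match(self) -> "PasswordChange":
        if self.password != self.password_repeat:
            raise ValueError("passwords do not match")
        return self


try:
    PasswordChange(password="s3cret", password_repeat="s3cr3t")
except ValidationError as exc:
    # The ValueError is wrapped into a ValidationError entry for the whole model.
    first_error = exc.errors()[0]
else:
    first_error = None

print(first_error)
```

Because the check belongs to the model rather than one field, the error location is empty, which is worth knowing when mapping errors back to form fields.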

3. Serialization: safely return a typed object to "the outward shape"

Once data has been validated into an object, you need to turn it back into a response JSON or a form suitable for DB storage. In Pydantic v2, model_dump() / model_dump_json() handle this (v1's .dict() / .json() are deprecated).

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str
    password: str


user = User(id=1, name="alice", password="secret")

# A dict of Python objects (values such as tuples are kept as Python types)
user.model_dump()          # {'id': 1, 'name': 'alice', 'password': 'secret'}

# A JSON string (converted to JSON-compatible types, e.g. datetime → ISO string)
user.model_dump_json()     # '{"id":1,"name":"alice","password":"secret"}'

The difference between model_dump(mode='json') and mode='python' (the default) appears frequently in practice. mode='python' keeps tuple and datetime as Python types, while mode='json' converts to JSON-compatible types (lists, ISO strings, etc.). You can organize it by thinking of model_dump_json() as the latter directly turned into a JSON string.
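
The three output forms can be compared directly. The Booking model below is a hypothetical example with a datetime and a tuple, the two types whose representation changes between modes.

```python
from datetime import datetime

from pydantic import BaseModel


class Booking(BaseModel):
    starts_at: datetime
    seat: tuple[int, int]


booking = Booking(starts_at=datetime(2024, 1, 2, 3, 4, 5), seat=(12, 3))

as_python = booking.model_dump()                 # datetime and tuple are kept
as_json_ready = booking.model_dump(mode="json")  # ISO string and list
as_json_text = booking.model_dump_json()         # the same data as a JSON string

print(as_python)
print(as_json_ready)
print(as_json_text)
```

mode="json" is handy when another layer (for example a queue client) does the final JSON encoding but needs JSON-safe values first.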

The main control parameters are as follows.

Parameter	Effect	Typical use
`exclude={'password'}`	Exclude the specified field	Remove confidential info from the response
`include={'id', 'name'}`	Output only the specified fields	Partial exposure
`by_alias=True`	Output by alias, not the field name	Return to an external camelCase API
`exclude_none=True`	Exclude fields whose value is `None`	A sparse response
`exclude_unset=True`	Exclude fields not explicitly passed	PATCH diff updates

⚠️ Preventing confidential-info leakage: accidentally including a password hash or token in a response is a typical incident. Writing model_dump(exclude={"password"}) at every call site invites omissions, so a more robust design is to use field_serializer, described below, or to split off a dedicated outward-facing response model in the first place.

H3: transform the output with `field_serializer`

To customize the output format of a specific field, use @field_serializer.

from datetime import datetime
from pydantic import BaseModel, field_serializer


class Event(BaseModel):
    name: str
    starts_at: datetime

    @field_serializer("starts_at")
    def serialize_starts_at(self, value: datetime) -> str:
        # Return Unix epoch seconds (as a string) to match the front end's display convention
        return str(int(value.timestamp()))

H3: include a derived value in serialization with `computed_field`

When you want to include "a value computed from other fields" in the output, stack @computed_field with @property.

from pydantic import BaseModel, computed_field


class Box(BaseModel):
    width: float
    height: float
    depth: float

    @computed_field
    @property
    def volume(self) -> float:
        return self.width * self.height * self.depth


box = Box(width=2, height=3, depth=4)
box.model_dump()  # {'width': 2.0, 'height': 3.0, 'depth': 4.0, 'volume': 24.0}

A value declared with computed_field is included in model_dump()'s output and in the JSON Schema (readOnly: True). As the official documentation explicitly states, Pydantic applies no additional validation logic to a computed_field; it is purely a mechanism for outputting a derived value.

Why is this superior? Compute volume on the caller side each time, and the same formula scatters across multiple places, a DRY violation. computed_field confines that knowledge to the single place called the model and exposes it consistently as a serialization result. The data and its derivation logic cohere, and the reason for change consolidates to a single point.

4. `strict` mode and type coercion: the trade-off between safety and convenience

Pydantic by default does type coercion (the lax mode). It converts the string "123" to int 123, and "true" to bool True — this is the source of the convenience. But there are situations where this "cleverness" backfires.

from pydantic import BaseModel


class Order(BaseModel):
    quantity: int


# lax (default): the string is silently converted to int
Order.model_validate({"quantity": "5"})   # quantity=5  ← accepted

For a field where type strictness directly connects to business risk, like a payment amount or inventory count, this implicit conversion is a hotbed for bugs. strict mode disables type coercion and requires an exact type match.

from pydantic import BaseModel, ConfigDict, Field


# ① Make a single call strict
Order.model_validate({"quantity": "5"}, strict=True)
# → ValidationError: a str is not accepted as an int

# ② Make a single field strict
class StrictOrder(BaseModel):
    quantity: int = Field(strict=True)
    note: str  # this field stays lax


# ③ Make the whole model strict
class FullyStrictOrder(BaseModel):
    model_config = ConfigDict(strict=True)
    quantity: int
    amount: int

Let me organize strict's behavior.

Input	lax (default)	strict
`{"quantity": "5"}` (string)	Converted to `5`	`ValidationError`
`{"quantity": 5}` (integer)	`5`	`5`
`{"is_active": "true"}`	Converted to `True`	`ValidationError`

💡 Where to use strict: at the API's outermost edge, input from humans or loose clients is realistically left lax, prioritizing convenience and leaving int conversion to Pydantic. For service-internal domain models, and for fields such as amounts and quantities where implicit conversion can cause an incident, set strict=True per field. This split is a practical compromise between convenience and safety. Note that when validating JSON, strict mode still accepts conversions for types JSON cannot represent natively, such as a date-time string into a datetime.
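
The JSON exception is easiest to see in a small experiment. The Shipment model and the is_accepted helper below are hypothetical, written only to compare Python input with JSON input under strict mode.

```python
from datetime import date
from typing import Any, Callable

from pydantic import BaseModel, ConfigDict, ValidationError


class Shipment(BaseModel):
    model_config = ConfigDict(strict=True)

    quantity: int
    ship_date: date


def is_accepted(validate: Callable[[Any], Any], payload: Any) -> bool:
    """Hypothetical helper: True if validation succeeds, False otherwise."""
    try:
        validate(payload)
        return True
    except ValidationError:
        return False


# Python input: a str is not a date, so strict mode rejects it.
python_str_date = is_accepted(
    Shipment.model_validate, {"quantity": 5, "ship_date": "2024-01-31"}
)
# JSON input: JSON has no date type, so the string is still accepted.
json_str_date = is_accepted(
    Shipment.model_validate_json, '{"quantity": 5, "ship_date": "2024-01-31"}'
)
# JSON input: JSON does have numbers, so a quoted integer is rejected.
json_str_int = is_accepted(
    Shipment.model_validate_json, '{"quantity": "5", "ship_date": "2024-01-31"}'
)
```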

5. Configuration management: realize 12-factor type-safely with `pydantic-settings`

Code that reads environment variables each time with os.environ["DATABASE_URL"] carries a triple burden of the type fixed to str, no existence guarantee, and default values scattered. pydantic-settings consolidates configuration into a single typed model and auto-loads it from environment variables.

⚠️ Note it's a separate package: in v2, BaseSettings was separated from the main body into a separate package. pip install pydantic-settings is needed, and the import is from pydantic_settings import ... (from pydantic import BaseSettings is the v1 way of writing and doesn't work in v2).

from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Read .env and map fields to environment variables carrying the APP_ prefix
    model_config = SettingsConfigDict(
        env_file=".env",
        env_prefix="APP_",
        case_sensitive=False,
    )

    database_url: str                              # APP_DATABASE_URL (required)
    debug: bool = False                            # APP_DEBUG (type coercion turns "1" into True)
    allowed_hosts: list[str] = Field(default_factory=list)  # parsed as JSON
    max_connections: int = Field(default=10, gt=0)


# Create once at application startup. A missing required setting fails immediately here
settings = Settings()

The points are, per the official spec, as follows.

A missing required field becomes a ValidationError at startup, so you can detect a configuration mistake before deploy rather than "noticing it for the first time in production."
debug: bool converts an environment-variable string like "1" / "true" to bool by type coercion.
Complex types like list / dict parse the environment variable as JSON (APP_ALLOWED_HOSTS='["a.com","b.com"]').
Prevent name collisions with env_prefix, and with env_nested_delimiter (e.g., __) you can express nested configuration in FOO__BAR form.

Why is this superior? By consolidating configuration into a typed model, the application can only start when the configuration is complete (Fail Fast). settings.max_connections is statically known to be int, and a typo like settings.databse_url is caught by the type checker. This is the standard way to realize the 12-factor App principle "store config in the environment" while keeping type safety and avoiding hardcoded secrets: secrets are not written in code but flow into this model via environment variables.

6. `v1` → `v2` migration: a quick-reference of the changes

Encounter an existing v1 codebase, or v1-style code that generative AI tends to output, and you can mechanically replace it with the following correspondence table. Let me organize the main renames listed in the official migration guide.

v1 (old)	v2 (new)	Category
`@validator`	`@field_validator`	Single-field validation
`@root_validator`	`@model_validator`	Whole-model / cross-field validation
`.dict()`	`.model_dump()`	Serialize to dict
`.json()`	`.model_dump_json()`	Serialize to JSON string
`.copy()`	`.model_copy()`	Duplicate an instance
`.construct()`	`.model_construct()`	Generation without validation
`.parse_obj()`	`.model_validate()`	Validate-generate from dict / object
`.parse_raw()`	`.model_validate_json()`	Validate-generate from JSON string
`class Config:`	`model_config = ConfigDict(...)`	Model configuration
`.update_forward_refs()`	`.model_rebuild()`	Resolve forward references
`__fields__`	`model_fields`	Reference field metadata
`from pydantic import BaseSettings`	`from pydantic_settings import BaseSettings`	Settings (separated into a package)
`.from_orm(obj)`	`.model_validate(obj, ...)` (`from_attributes=True`)	Generation from an ORM object

💡 The crux of migration: v2's methods consistently have the model_ prefix. This is the design decision to avoid name collisions with user-defined fields, like "the User model wants to have a business method named dict()." In the conversion @validator → @field_validator, don't forget to add @classmethod and make mode= explicit. For bulk conversion, you can also use the officially-provided migration-support tool (bump-pydantic).
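
As a rough illustration of the table, here is a small model written in v2 style with the v1 equivalents noted in comments. UserRow is a hypothetical stand-in for an ORM object.

```python
from pydantic import BaseModel, ConfigDict, field_validator


class UserRow:
    """Hypothetical stand-in for an ORM row: plain attributes, not a dict."""

    def __init__(self, id: int, name: str) -> None:
        self.id = id
        self.name = name


class UserOut(BaseModel):
    # v1: class Config: orm_mode = True
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str

    @field_validator("name", mode="after")  # v1: @validator("name")
    @classmethod
    def strip_name(cls, value: str) -> str:
        return value.strip()


user = UserOut.model_validate(UserRow(1, " alice "))  # v1: UserOut.from_orm(...)
as_dict = user.model_dump()                           # v1: user.dict()
renamed = user.model_copy(update={"name": "bob"})     # v1: user.copy(update=...)
```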

H3: `TypeAdapter` that validates non-model types

A convenient mechanism that didn't exist in v1 is TypeAdapter. It can validate/serialize, on the spot, a type like list[int] or dict that doesn't warrant defining a BaseModel.

from pydantic import TypeAdapter

# Validate list[int] without a BaseModel
adapter = TypeAdapter(list[int])
adapter.validate_python(["1", "2", "3"])  # [1, 2, 3]  ← each element is coerced
adapter.validate_json("[1, 2, 3]")        # [1, 2, 3]
adapter.dump_json([1, 2, 3])              # b'[1,2,3]'  ← note that it returns bytes

In cases like an external API returning "an array of user objects" at the top level, where the root element is not a model, TypeAdapter(list[User]) shows its power.
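
A minimal sketch of that top-level-array case follows. The JSON text stands in for an external API response and is invented for illustration.

```python
from pydantic import BaseModel, TypeAdapter


class User(BaseModel):
    id: int
    name: str


# Build the adapter once and reuse it; it wraps validation for list[User].
user_list_adapter = TypeAdapter(list[User])

# Stand-in for a response body whose root element is an array.
response_text = '[{"id": 1, "name": "alice"}, {"id": "2", "name": "bob"}]'

users = user_list_adapter.validate_json(response_text)
plain = user_list_adapter.dump_python(users)  # back to a list of dicts

print(users)
print(plain)
```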

Conclusion: make boundary validation "part of the type system"

Pydantic v2 is a modern validation library with the Rust-made pydantic-core at its core, deeply integrated with type annotations. Let me restate this article's key points.

With BaseModel + Field(), declaratively define "the shape of correct data," and boundary-validate with model_validate().
Consolidate business rules into @field_validator (single) / @model_validator (cross-field) and guarantee the model's invariants.
Safely control the outward shape with model_dump() / model_dump_json() / field_serializer / computed_field.
Apply strict mode to places where accidents aren't permitted, like amount/quantity, using convenience and safety by distinction.
With pydantic-settings, consolidate configuration type-safely, achieving both Fail Fast and not hardcoding secrets.
With the v1 → v2 quick-reference, mechanically migrate to the model_-prefix API, and validate non-model types too with TypeAdapter.

The difference between "code that works" and "code you can operate for 10 years" lies in the accumulation of boundary design of where and how you dam up untrustworthy data. Pydantic is the best tool to declaratively express that boundary as part of the type system.

For further exploration, I recommend re-reading the official documentation with this article's design viewpoint in mind.

Consultation on type-safe backend design

The author has implemented and operated the discipline explained here of "always validate external input at the system boundary" in the production environment of a B2B SaaS that won the METI Minister's Award (as boundary validation with Marshmallow 3). In a FastAPI-based stack, Pydantic v2 plays that role. I build, fast and at high quality leveraging generative AI, foundations directly tied to business reliability — type-safe input validation, configuration management, API schema design, and boundary defense of external integration. Feel free to consult us about backend development with Python and making existing systems type-safe.


