# marshmallow Practical Guide: Robustly Designing Python Object Serialization / Validation at the Boundary (v4-Compatible)

> Following the official marshmallow documentation (v4.3), this guide explains from a practical standpoint: bidirectional serialization with Schema/fields, boundary validation with load(), @validates/@validates_schema, Nested, safe design with load_only/dump_only, marshmallow-sqlalchemy integration, the 3→4 migration, and how to choose between marshmallow and Pydantic.

- Published: 2026-06-26
- Author: 友田 陽大
- Tags: Python, marshmallow, serialization, validation, type safety, Flask, SQLAlchemy, architecture design
- URL: https://tomodahinata.com/en/blog/marshmallow-python-serialization-validation-production-guide
- Category: marshmallow

## Key points

- marshmallow declares the bidirectional conversion 'Python object ⇄ dict/JSON' as a Schema, and load() stops invalid data at the boundary by raising a ValidationError
- Use load_only / dump_only / unknown=RAISE to prevent security accidents such as password leakage and mass assignment (unauthorized privilege escalation) through the schema's structure
- Design validation in three layers with separate responsibilities: a field's validate=, per-field @validates (one or more fields), and inter-field @validates_schema
- Map to a domain object with @post_load, assemble composite data with fields.Nested / Pluck, and connect to the ORM type-safely with marshmallow-sqlalchemy
- marshmallow 3 → 4 has breaking changes like missing/default → load_default/dump_default; you can migrate mechanically with the cheat sheet in this article

---

## **Introduction: why dare to use marshmallow now**

The design philosophy of a robust backend can be condensed into one sentence: **"Never trust data coming from outside the system boundary."** An HTTP request body, an external API's response, form input, a message-queue payload: all of these are unvalidated data with no type guarantee. The moment you pass them straight into the application, they turn into `KeyError`, `AttributeError`, and in the worst case **mass assignment (unauthorized privilege escalation) or data leakage.**

**marshmallow is the "bidirectional gatekeeper" standing at this boundary.** If Pydantic is "a type-first model," marshmallow has matured since 2013 as **a dedicated serialization / deserialization library independent of any ORM / framework.** In the Flask + SQLAlchemy stack, it's still the de facto standard. Its essence condenses into just 2 directions.

- **`load()` (deserialization)**: validate untrustworthy external input (dict / JSON), **normalize it**, and convert it to a form usable internally. If validation fails, it raises a `ValidationError`.
- **`dump()` (serialization)**: shape an internal Python object (an ORM model, etc.) **into a safe outward representation (dict / JSON).**

The author designed and implemented the backend of a B2B SaaS that won the Minister of Economy, Trade and Industry Award in **Python / Flask / SQLAlchemy / PostgreSQL**, and operated it in production with the strict layer separation `Router → UseCase → Repository → Model`. That project's boundary validation was handled by **marshmallow.** This article organizes that field-tested knowledge, **staying faithful to the latest official marshmallow documentation ([marshmallow.readthedocs.io](https://marshmallow.readthedocs.io/en/stable/)) while making it easier to digest.**

> 💡 **The version covered in this article**: it assumes marshmallow **4.3.0** (the stable version as of April 2026). Much of the code in web articles and that generative AI outputs is still written in the **3.x-family legacy style** (`missing=` / `default=` / `Schema.context`). This article is written in the v4 canonical style, and a **3 → 4 migration cheat sheet** is prepared at the end of the article.

> 💡 This article is part of a series on Python backend design. The type-first paired option is the [Pydantic v2 practical guide](/blog/pydantic-v2-production-validation-type-safety), the web-framework layer is the [FastAPI production operation guide](/blog/fastapi-production-async-pydantic-observability-guide), and the persistence layer is the [SQLAlchemy 2.0 practical guide](/blog/sqlalchemy-2-typed-orm-production-guide); reading them together gives a view of a consistent design from the boundary to the DB.

---

## **1. `Schema` and `fields`: declaratively define bidirectional conversion**

marshmallow's starting point is subclassing `Schema`. Just line up `fields` objects as class attributes, and the class becomes the **single source of truth for validation, serialization, and deserialization.**

```python
from datetime import datetime
from marshmallow import Schema, fields


class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()


user = {"name": "友田", "email": "tomoda@example.com", "created_at": datetime.now()}
schema = UserSchema()

schema.dump(user)   # → dict: {'name': '友田', 'email': 'tomoda@example.com', 'created_at': '2026-06-26T...'}
schema.dumps(user)  # → JSON string; json.dumps escapes non-ASCII by default ('友田' becomes '\u53cb\u7530'), pass ensure_ascii=False to keep it
```

`dump()` returns a **Python dict**, and `dumps()` a **JSON string** (the trailing `s` is the `s` of string). A JSON-incompatible type like `datetime` is auto-converted to an ISO 8601 string by `fields.DateTime()` — this is `dump`'s job of "internal type → outward representation."

Note that instead of a class definition, you can also **generate a schema at runtime** with `Schema.from_dict()`. It's useful when you want to assemble a schema dynamically from configuration values.

```python
UserSchema = Schema.from_dict(
    {"name": fields.Str(required=True), "email": fields.Email(required=True)}
)
```

### **Handle collections with `many=True`**

For bulk conversion of multiple objects, specify `many=True`. You can specify it at schema generation or at the `dump()`/`load()` call.

```python
UserSchema(many=True).dump(users)      # ① per schema instance
UserSchema().dump(users, many=True)    # ② per call
```

**Why is this superior?**
The knowledge of "which items, in what type, with what conversion, to input/output" lives in **one place**: the `Schema`. Because the response-shaping logic isn't scattered across view functions and the service layer, a change in the output spec is confined to the schema definition. This is DRY and ETC (Easy To Change) in practice.

---

## **2. `load()`: validate and dam the boundary's invalid data**

The inverse of `dump()` is `load()`. It **validates untrustworthy external input and returns a normalized dict**, and raises a `ValidationError` on failure. This is marshmallow's most important feature.

```python
from marshmallow import ValidationError

try:
    data = UserSchema().load({"name": "友田", "email": "not-an-email"})
except ValidationError as err:
    print(err.messages)    # {'email': ['Not a valid email address.']}
    print(err.valid_data)  # {'name': '友田'}  ← only the fields that passed validation
```

A `ValidationError` holds 2 pieces of information.

- **`err.messages`**: a dict of error messages keyed by field name. You can make it the body of a 422 response as-is.
- **`err.valid_data`**: a dict containing only the fields that passed validation. Usable for partial processing.

With `many=True`, errors are returned **keyed by the element's index**, so you can see at a glance which element of the array failed, on which field, and why.

```python
# {1: {'email': ['Not a valid email address.']},
#  3: {'name': ['Missing data for required field.']}}
```

### **`required` / `allow_none` / `load_default` / `dump_default`**

Declare an input's required/optional/default with the field's arguments. **In v4, `missing=`/`default=` were removed and unified into `load_default` (the default at input) / `dump_default` (the default at output).** It's an excellent change where "the meaning differs between input and output" is clear from the name.

```python
import uuid
from marshmallow import Schema, fields


class AccountSchema(Schema):
    # required=True: a missing value fails with "Missing data for required field." unless overridden
    email = fields.Email(required=True, error_messages={"required": "Email address is required."})

    # Explicitly allow None (None is rejected by default)
    nickname = fields.Str(allow_none=True)

    # If the value is absent on load, fill in this default (a callable is also accepted)
    id = fields.UUID(load_default=uuid.uuid4)

    # If the value is absent on dump, output this default
    role = fields.Str(dump_default="member")
```

### **`data_key`: separate external naming from internal naming**

External API is `camelCase`, internal code is `snake_case` — a common collision. `data_key` confines this translation to the boundary.

```python
class UserSchema(Schema):
    user_name = fields.Str(data_key="userName")
    email = fields.Email(data_key="emailAddress")

# load:  {"userName": "友田", "emailAddress": "a@b.com"} → {'user_name': '友田', 'email': 'a@b.com'}
# dump:  internal snake_case is converted back to external camelCase
```

### **`partial`: partial validation for PATCH**

In update APIs, you often want to validate only the fields that were sent. `partial=True` skips every `required` check, and `partial=("name",)` skips it only for the named fields.

```python
UserSchema().load({"email": "a@b.com"}, partial=("name",))  # waive the required check for name
```

**Why is this superior?**
The discipline of "becoming an internal type only after validation" appears in the code in the form of the function call `load()`. Data that passed the boundary is guaranteed in code to be "validated," and all downstream processing can trust that premise. It **prevents hand-written validation if statements from mixing into the business logic**, like `if not data.get("email"): ...`, protecting SRP (single responsibility).

---

## **3. The Schema as a security boundary: 3 safety devices**

marshmallow's true value is that **you can control "what to let in and what not to let out" through the schema's structure.** It prevents the typical vulnerabilities OWASP warns about through **the structure of the code** rather than operational vigilance.

### **① `dump_only`: a field you "don't let the client write"**

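If you accept a **value the server should decide**, such as `id` / `created_at` / `role`, straight from `request.json`, you get a **mass-assignment vulnerability** where an attacker sends `{"role": "admin"}`. A field with `dump_only=True` is **output-only** and is never loaded: under the default `unknown=RAISE`, sending it fails with an unknown-field `ValidationError`, and under `unknown=EXCLUDE` it is silently discarded. Be careful with `unknown=INCLUDE`, which passes such keys through unvalidated.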

```python
class UserSchema(Schema):
    id = fields.Int(dump_only=True)              # output only; never accepted by load()
    role = fields.Str(dump_only=True)            # privileges are decided by the server
    created_at = fields.DateTime(dump_only=True)
    email = fields.Email(required=True)          # accepted by load()
```

### **② `load_only`: a field you "absolutely don't put in the response"**

Accidentally including a password or token in a response is a typical information-leakage accident. A field with `load_only=True` becomes **input-only** and is never included in `dump()`'s output.

```python
from marshmallow import Schema, fields, validate


class SignupSchema(Schema):
    email = fields.Email(required=True)
    password = fields.Str(load_only=True, required=True, validate=validate.Length(min=12))
    # password is accepted by load() but never emitted by dump(), which structurally prevents leakage
```

### **③ `unknown=RAISE`: reject unknown keys (the default)**

marshmallow **by default rejects unknown fields with a `ValidationError`** (`unknown=RAISE`). This is a safe-side-leaning design, the polar opposite of a defenseless expansion like `Model(**request.json)`. The behavior can be switched explicitly.

```python
from marshmallow import Schema, fields, EXCLUDE, INCLUDE, RAISE


class StrictSchema(Schema):
    class Meta:
        unknown = RAISE   # default: unknown keys are an error (safest)
        # EXCLUDE → silently drop unknown keys / INCLUDE → pass them through as-is

    name = fields.Str()


# Can also be overridden per call
StrictSchema().load(payload, unknown=EXCLUDE)
```

| Option | Treatment of unknown keys | Main use |
| --- | --- | --- |
| `RAISE` (default) | Raises a `ValidationError` | Strict input validation (first choice) |
| `EXCLUDE` | Silently discards | When you want to ignore an external API's extra keys |
| `INCLUDE` | Passes without validation | When passing through schema-less items (caution needed) |

**Why is this superior?**
If security relies on "noticing it in review," something will eventually leak. `dump_only` / `load_only` / `unknown=RAISE` make "carelessly writable / carelessly emitted" **structurally impossible** by **declaring the safety constraint in the schema definition itself.** This is a concrete implementation of the security principle "validate and sanitize all external input at the boundary, and apply least privilege."

---

## **4. Validation: the three layers of `fields` → `@validates` → `@validates_schema`**

Type validation alone can't express **business rules** like "18 or older" or "the end time is after the start time." marshmallow provides validation in **three layers according to responsibility.**

### **The first layer: the `validate=` argument (a single-field static rule)**

The most lightweight validation is the method of passing a `marshmallow.validate` validator to `validate=`.

```python
from marshmallow import Schema, fields, validate


class UserSchema(Schema):
    name = fields.Str(validate=validate.Length(min=1, max=120))
    age = fields.Int(validate=validate.Range(min=18, max=120))
    permission = fields.Str(validate=validate.OneOf(["read", "write", "admin"]))
    sku = fields.Str(validate=validate.Regexp(r"^[A-Z]{3}-\d{4}$"))
    # Multiple validators can be combined in a list
    slug = fields.Str(validate=[validate.Length(min=3), validate.Regexp(r"^[a-z0-9-]+$")])
```

The main built-in validators are, per the official docs, `Length` / `Range` / `OneOf` / `Regexp` / `Email` / `Equal` / `ContainsOnly`, and so on.

> ⚠️ **A v4 breaking change**: in 3.x you could use a function that **returns `False`**, like `validate=lambda x: x == "ok"`, but **in v4 a validator must always `raise` a `ValidationError`.** The form of returning `False` stops working.

### **The second layer: `@validates` (custom validation per field)**

Logic that can't be expressed with the built-ins is written with the `@validates` decorator. **In v4 you can pass multiple field names, and the method receives `data_key`.**

```python
from marshmallow import Schema, fields, validates, ValidationError


class ItemSchema(Schema):
    quantity = fields.Integer(required=True)
    reserved = fields.Integer(required=True)

    @validates("quantity", "reserved")   # validate several fields with one method (v4)
    def validate_non_negative(self, value, data_key):
        if value < 0:
            # data_key lets the message say which field the error belongs to
            raise ValidationError(f"{data_key} must be 0 or greater.")
```

### **The third layer: `@validates_schema` (inter-field relationship validation)**

An **invariant spanning multiple fields**, like "the end time is after the start time" or "the discounted price is below the list price," is consolidated in `@validates_schema`.

```python
from marshmallow import Schema, fields, validates_schema, ValidationError


class BookingSchema(Schema):
    start_at = fields.DateTime(required=True)
    end_at = fields.DateTime(required=True)

    @validates_schema
    def validate_period(self, data, **kwargs):
        # By this point start_at / end_at have passed type validation (see skip_on_field_errors below)
        if data["start_at"] >= data["end_at"]:
            # The second argument attaches the error to a specific field
            raise ValidationError("The end time must be after the start time.", "end_at")
```

When you want to assign errors to multiple fields, pass a **dict keyed by field name.**

```python
    @validates_schema
    def validate_bounds(self, data, **kwargs):
        errors = {}
        if data["field_b"] <= data["field_a"]:
            errors["field_b"] = ["field_b must be greater than field_a."]
        if errors:
            raise ValidationError(errors)
```

> 💡 **`skip_on_field_errors` defaults to `True`**: `@validates_schema` is **skipped if individual field validation has already failed** (the default since v3.0). This prevents the accident of a schema validator raising a `KeyError` even though `data["start_at"]` doesn't exist (it failed type validation). A schema-wide error with no field name is stored in the **`_schema` key** of `err.messages`.

**Why is this three-layer split superior?**
If you write validation logic as "a service-layer if statement for now," business logic and validation get mixed, making testing and reuse difficult. marshmallow's three layers separate responsibility clearly: **static rules in `validate=`, field-specific logic in `@validates`, and inter-field invariants in `@validates_schema`.** The guarantee "data returned by `BookingSchema().load()` has a valid period" is established entirely inside the schema, and downstream code can trust that premise (SRP applied thoroughly).

---

## **5. Pre- and post-processing: map to the domain with `@pre_load` / `@post_load`**

Being able to insert hooks before and after validation is another of marshmallow's strengths. Processing flows in the order **`@pre_load` → field deserialization and validation (`validate=`, `@validates`) → `@validates_schema` → `@post_load`.**

### **`@pre_load`: normalize input before validation**

**Normalization** like "strip leading/trailing whitespace" or "lowercase the email" should be done **before validation.**

```python
from marshmallow import Schema, fields, pre_load


class UserSchema(Schema):
    email = fields.Email(required=True)

    @pre_load
    def normalize(self, data, **kwargs):
        if isinstance(data.get("email"), str):
            data["email"] = data["email"].strip().lower()
        return data
```

### **`@post_load`: turn the validated dict into a domain object**

`load()` returns a dict by default, but with `@post_load` you can **convert the validated data into an instance of your domain class.** This realizes the design of "once you cross the boundary, it's no longer a dict but a typed object."

```python
from dataclasses import dataclass
from marshmallow import Schema, fields, post_load


@dataclass
class User:
    name: str
    email: str


class UserSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email(required=True)

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)


UserSchema().load({"name": "友田", "email": "a@b.com"})  # → User(name='友田', email='a@b.com')
```

### **`@post_dump`: wrap the output in an envelope**

When you want to wrap an API response in a common envelope like `{"result": ...}` / `{"results": [...]}`, use `@post_dump`. To receive `many`, **specify `pass_collection=True` in v4** (renamed from 3.x's `pass_many`).

```python
from marshmallow import Schema, post_dump


class EnvelopeSchema(Schema):
    @post_dump(pass_collection=True)
    def wrap(self, data, many, **kwargs):
        key = "results" if many else "result"
        return {key: data}
```

> 💡 **A shortcut in the latest version (4.3.0)**: in marshmallow 4.3.0, **`pre_load` / `post_load` arguments** were added to `fields.Field`, letting you declare per-field pre/post processing without a decorator. It's useful in cases where you want to shape only a specific field rather than the whole schema (for details, see the [official changelog](https://marshmallow.readthedocs.io/en/stable/changelog.html)).

---

## **6. Nesting: assemble composite data structures**

Real-world data isn't flat. With `fields.Nested`, you can **nest schemas.**

```python
class AuthorSchema(Schema):
    id = fields.Int(dump_only=True)
    name = fields.Str(required=True)


class BookSchema(Schema):
    title = fields.Str(required=True)
    author = fields.Nested(AuthorSchema)                       # a single nested object
    reviewers = fields.List(fields.Nested(AuthorSchema))       # a collection of nested objects
```

### **`only` / `exclude`: use only part of the nesting**

In a list API, "the author is just the name" is often enough. You can narrow only part of the nesting with `only` / `exclude`. You can also specify **multiple levels with dot notation.**

```python
class BookListSchema(Schema):
    title = fields.Str()
    author = fields.Nested(AuthorSchema(only=("name",)))       # only the author's name


class SiteSchema(Schema):
    book = fields.Nested(BookListSchema)

# Extract only a field two levels down
SiteSchema(only=("book.author.name",)).dump(site)
```

### **`fields.Pluck`: flatten the nesting into 1 attribute**

When you want "not the author object, but only the author's name," `fields.Pluck` is the shortest route.

```python
class BookSchema(Schema):
    title = fields.Str()
    author = fields.Pluck(AuthorSchema, "name")   # → {"title": "...", "author": "友田"}
```

### **Resolve circular / self-references with `lambda`**

Mutually referencing schemas (author ⇄ book) or a self-reference (employee → manager) avoid definition-order / cycle problems by **lazily evaluating with `lambda`.** There's also a way of passing the class name as a string, effective for avoiding circular imports.

```python
class UserSchema(Schema):
    name = fields.Str()
    # Self-reference: exclude employer in the nested schema to prevent infinite recursion
    employer = fields.Nested(lambda: UserSchema(exclude=("employer",)))
```

**Why is this superior?**
From the **single definition** of the same `AuthorSchema`, just by switching `only` / `exclude` / `Pluck` you can make a "detail view," "list view," and "embedded view." You don't need to add a model per view, avoiding definition duplication (a DRY violation). This is the core of why marshmallow is said to be strong at "presentation-layer serialization."

---

## **7. Practical application: connect to the ORM with `marshmallow-sqlalchemy`**

In the Flask + SQLAlchemy stack, using `marshmallow-sqlalchemy` lets you **auto-generate a schema from a SQLAlchemy model**, drastically reducing boilerplate.

```python
from marshmallow_sqlalchemy import SQLAlchemyAutoSchema, auto_field


class AuthorSchema(SQLAlchemyAutoSchema):
    class Meta:
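        model = Author              # auto-generate fields from this model's columns
        load_instance = True        # load() returns an Author instance instead of a dict
        include_relationships = True # include relationships in the output
        include_fk = True           # include foreign-key columns too

    # Declare individually with auto_field only the columns whose generated field you want to override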
    email = auto_field(required=True)
```

`load_instance = True` is the crux. Without writing `@post_load` yourself, **`load()` returns a validated ORM instance**, so you can `session.add()` it as-is.

### **A Flask endpoint: protect the "entrance" and "exit" with one schema**

Combine the elements so far and an API endpoint becomes astonishingly concise and robust. **`load()` handles the input boundary and `dump()` the output boundary**, and validation errors are returned with 422.

```python
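from flask import jsonify, request
from marshmallow import ValidationError

# Assumes `app` (a Flask app), `db` (e.g. Flask-SQLAlchemy), and AuthorSchema are defined elsewhere


@app.post("/authors")
def create_author():
    schema = AuthorSchema()
    try:
        # ① Entrance: only an ORM instance that passed unknown-key rejection, type validation, and business rules remains
        author = schema.load(request.get_json(), session=db.session)
    except ValidationError as err:
        # ② Return errors still structured, with 422 (err.messages is a dict of field name → messages)
        return jsonify(errors=err.messages), 422

    db.session.add(author)
    db.session.commit()

    # ③ Exit: thanks to dump_only / load_only, only a safely shaped representation leaves
    return jsonify(schema.dump(author)), 201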
```

**Why is this superior?**
From the view function, all of validation, type conversion, and shaping disappears, and what remains is just the essential processing of "save." The safety of input (mass-assignment prevention) and the safety of output (confidentiality-leak prevention) are delegated to **the declaration that is the schema definition.** This is the design principle of "separate I/O, validation, and business logic by layer" itself.

---

## **8. `marshmallow 3` → `4` migration: a cheat sheet of the changes you'll hit in production**

When you encounter existing 3.x code, or the 3.x style that generative AI tends to output, you can replace it mechanically using the following mapping table. These are the major breaking changes based on the official [upgrade guide](https://marshmallow.readthedocs.io/en/stable/upgrading.html).

| 3.x (old) | 4.x (new) | Kind |
| --- | --- | --- |
| `fields.Str(missing=...)` | `fields.Str(load_default=...)` | The default at input |
| `fields.Int(default=...)` | `fields.Int(dump_default=...)` | The default at output |
| Direct use of `fields.Number()` / `fields.Mapping()` / `fields.Field()` | `fields.Integer()` / `Float()` / `Decimal()` / `fields.Dict()` / `fields.Raw()` | Banning instantiation of abstract base classes |
| `validate=` **returns `False`** | **`raise`** a `ValidationError` | The validator's return value |
| `@post_dump(pass_many=True)` | `@post_dump(pass_collection=True)` | The decorator argument name |
| `class Meta: fields = (...)` / `additional` | **Explicitly declare** fields | Abolition of implicit field generation |
| `schema.context = {...}` | `contextvars.ContextVar` / `experimental.Context` | Context passing |
| Defining `@validates("name")` multiple times individually | `@validates("name", "nickname")` + the method receives `data_key` | Multiple-field support |
| `class MyField(fields.Field)` | `class MyField(fields.Field[T])` | Genericizing a custom field |
| `marshmallow.utils.from_iso_date` etc. | The standard library (`date.fromisoformat` etc.) | Removal of date utilities |
| `_bind_to_schema(self, field_name, schema)` | `_bind_to_schema(self, field_name, parent)` | The custom field's argument name |

> 💡 **The crux of migration**: the most frequent are **`missing` / `default` → `load_default` / `dump_default`** and **`pass_many` → `pass_collection`.** The surest approach is to surface them mechanically with `grep -rn "missing=\|default=\|pass_many=\|\.context" .`. Note that the `default=` pattern also matches already-migrated `load_default=` / `dump_default=`, so filter those out when reviewing the hits. Direct use of `fields.Number()` is also a good chance to make the intended type (integer or decimal) explicit.

---

## **9. `marshmallow` or `Pydantic`: the axis of selection**

The two are often compared, but **they are not mutually exclusive; you choose by role.** Their design starting points are fundamentally different.

| Aspect | marshmallow | Pydantic v2 |
| --- | --- | --- |
| Schema definition | `Schema` class + `fields` descriptors (explicit) | Type annotations + `BaseModel` (type-first) |
| Main focus | Bidirectional serialization / deserialization (presentation) | Type-driven domain model & validation |
| Speed | Pure Python implementation | Fast with the Rust `pydantic-core` |
| Ecosystem | Flask / SQLAlchemy (`marshmallow-sqlalchemy`) is mature | Integrated with FastAPI, outputs JSON Schema by default |
| Multiple views of the same data | Easy with `only` / `exclude` / `Pluck` | Tends to define a separate model per view |
| Type-checker integration | Slightly weak, being descriptor-based | Powerful, directly tied to annotations |

The **selection guideline** is clear.

- **Choose marshmallow**: you have existing **Flask / SQLAlchemy** assets, you want to flexibly make **multiple representations of the same data (list / detail / admin)**, or you want to explicitly separate serialization / validation logic from the ORM model.
- **Choose Pydantic v2**: you're building newly with **FastAPI**, you want to maximize IDE completion and static analysis **type-first**, **speed** is a requirement at high QPS, or you want to auto-generate JSON Schema.

The author adopts marshmallow in the award-winning B2B SaaS built with Flask/SQLAlchemy, and Pydantic in FastAPI-based new projects. **The discipline of "don't trust outside the boundary" is completely identical in both**, and whichever you choose, the essence doesn't change. The Pydantic-side design is detailed in the [Pydantic v2 practical guide](/blog/pydantic-v2-production-validation-type-safety).

---

## **Conclusion: elevate serialization and validation into "boundary design"**

marshmallow is a mature serialization / deserialization library independent of any ORM or framework. Let me re-list this article's key points.

1. With **`Schema` + `fields`**, declare bidirectional conversion, output with `dump()`, and **validate the boundary's input** with `load()`.
2. With **`dump_only` / `load_only` / `unknown=RAISE`**, prevent mass assignment and confidentiality leakage with **the schema's structure.**
3. Separate validation responsibility in three layers: **`validate=` (static) → `@validates` (field-specific) → `@validates_schema` (inter-field).**
4. **Normalize with `@pre_load`, turn into a domain object with `@post_load`**, and assemble composite structures with `fields.Nested` / `Pluck`.
5. Connect to the ORM with **`marshmallow-sqlalchemy`**, protecting the API's entrance and exit simultaneously with one schema.
6. The **`3 → 4` migration** centers on replacing with `load_default` / `dump_default` / `pass_collection`. Handle it mechanically with the cheat sheet.

The difference between "working code" and "code you can operate for 10 years" lies in the accumulation of boundary design: **where and how you stop untrustworthy data, and how you safely emit internal values outward.** marshmallow is a proven tool that expresses that boundary as a declarative schema.

For further exploration, I recommend re-reading the following of the official documentation with this article's design viewpoint in mind.

- [Quickstart](https://marshmallow.readthedocs.io/en/stable/quickstart.html)
- [Nesting Schemas](https://marshmallow.readthedocs.io/en/stable/nesting.html)
- [Custom Fields](https://marshmallow.readthedocs.io/en/stable/custom_fields.html)
- [Extending Schemas (pre/post processing, schema validation)](https://marshmallow.readthedocs.io/en/stable/extending/schema_validation.html)
- [Upgrading to newer releases (3→4 migration)](https://marshmallow.readthedocs.io/en/stable/upgrading.html)
- [marshmallow-sqlalchemy](https://marshmallow-sqlalchemy.readthedocs.io/en/latest/)

---

### **Consultation on type-safe backend design**

The author has implemented and operated the discipline explained here ("always validate external input at the system boundary, and safely shape and return internal values") as boundary validation with marshmallow in the production environment of a B2B SaaS that won the Minister of Economy, Trade and Industry Award. In a FastAPI-based stack, Pydantic v2 handles that role. With the help of generative AI, I quickly build to a high standard **the foundation directly connected to a business's reliability**: type-safe input validation, response shaping, mass-assignment countermeasures, and boundary defense for ORM integration. For backend development in Python or for making existing systems type-safe, feel free to consult me.
