# Python Data Types Complete Guide: The 'Right Use' of Numbers, Strings, and Collections, and Designs That Don't Break in Production

> Systematizing Python's built-in data types (int / float / Decimal, str, bool, None, list / tuple / dict / set) from CPython's internal structure, mutability, and complexity to production design. From float error and handling money, mutable default arguments, is vs ==, type hints, to boundary validation with Pydantic / marshmallow, explained as 'axes for deciding which to use' with practical knowledge from real projects.

- Published: 2026-06-28
- Author: 友田 陽大
- Tags: Python, Type safety, Architecture design, Performance, Pydantic, marshmallow
- URL: https://tomodahinata.com/en/blog/python-data-types-complete-guide
- Category: Python backend
- Pillar guide: https://tomodahinata.com/en/blog/fastapi-production-async-pydantic-observability-guide

## Key points

- In Python 'everything is an object,' and a variable binds not a value but a reference—understand this one point and bugs around mutability, is vs ==, and copying can all be explained systematically
- float is IEEE 754 binary floating-point, and 0.1 + 0.2 doesn't become 0.3. Always hold money as Decimal or an integer in the smallest unit (a principle enforced on a payment foundation with zero double charges in production)
- Choose list / tuple / dict / set by 'mutable or not, ordered or not, hashable or not, and what's the complexity.' For membership tests, set / dict (O(1)), not list (O(n))
- Mutable default arguments, shared references, and shallow copies are the most common production bug sources in dynamically-typed Python. Prevent them structurally with the None sentinel and immutable types / deepcopy
- With type hints + mypy / pyright, and dataclass / Enum / TypedDict, make 'invalid states unrepresentable,' and validate system boundaries with Pydantic / marshmallow—giving static safety to a dynamic language

---

## **Introduction: a data type is not a "list to memorize" but a "vocabulary of design"**

Search "Python data types" and you usually arrive at a **list table**: `int` / `float` / `str` / `list` / `dict` … That's correct as a starting point. But what separates people on the front line is not knowing "what types exist" but **whether you hold "when, why, and which type to choose" as a decision axis.**

Why does using `float` for money calculations become a production accident? Why does an `in` test on a `list` not scale? Why does "a function given an empty list as the initial value" drag past data every time it's called? These are all bugs that can be prevented in advance **if you understand the type's "internal behavior."** Conversely, write dynamically-typed Python while leaving this vague, and it slips past tests and breaks only in production.

This article **builds on** the range covered by the widely read [Real Python "Basic Data Types in Python"](https://realpython.com/python-data-types/) (numbers, strings, booleans, None, and collections) and digs end-to-end from there into the "implementation knowledge that pays off in production." Specifically:

- CPython's **object model** (why confusing `is` and `==` hurts)
- `float`'s **IEEE 754 trap** and the right answer for handling money (`Decimal` / integer smallest unit)
- Choosing collections based on **complexity (Big-O)**
- Type hints that give **static safety** to a dynamic language, and "type design" via `dataclass` / `Enum` / `TypedDict`
- **Validation at the system boundary** (Pydantic / marshmallow)

I have designed and implemented the backend of a Minister of Economy, Trade and Industry Award-winning B2B SaaS in **Python / Flask / SQLAlchemy / PostgreSQL**, and led the payment-reliability layer on a serverless payment platform that achieved **zero double charges in production.** The principles that run through this article ("don't hold money in `float`," "make invalid states unrepresentable") all come from that production experience. I wrote it to be read not as a mere syntax walkthrough but as **design decisions you can rely on with confidence.**

> 💡 **Target versions**: this article assumes Python **3.12 / 3.13** (most of the description is valid on 3.10+). For version-dependent features, I note the version they were introduced in.

---

## **0. The starting point of everything: in Python, "everything is an object"**

Start the data-type discussion from "the list of types" and you'll get stuck somewhere. The correct starting point is **Python's object model.** Grasp this in 5 minutes and all the later talk of mutability, copying, and `is`/`==` connects on a single line.

In Python, integers, strings, functions, and classes are all **"objects."** And every object has three attributes.

1. **Identity**: a unique ID in memory. Obtained with `id()`, unchanging while alive.
2. **Type**: what the object is. Obtained with `type()`.
3. **Value**: the contents.

What's decisively important here is that **a variable is not a box that holds a "value" but a "name tag (reference) attached to an object."** `x = 1` is not "put 1 into the box `x`" but the operation "attach the name tag `x` to the object `1`."

```python
a = [1, 2, 3]
b = a            # b is just another name tag on the same list as a (not a copy)

b.append(4)
print(a)         # → [1, 2, 3, 4]   a changes too, because it is the same object

print(a is b)    # → True           same object (same id)
print(id(a) == id(b))  # → True
```

This model of "assignment is sharing a reference, not copying" is **the single biggest cause** of bugs around Python's mutable objects. Conversely, understand this and all the pitfalls described later become visible as "obvious consequences."
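
The same rule shows up at function boundaries, which is where it tends to bite in practice. The following is a minimal sketch; `append_in_place` and `append_to_copy` are hypothetical names chosen for illustration. One mutates the list it receives, the other rebinds its local name to a new list.

```python
def append_in_place(items: list[int], value: int) -> None:
    # Mutates the object the caller passed in; the caller sees the change.
    items.append(value)


def append_to_copy(items: list[int], value: int) -> list[int]:
    # Rebinds the local name to a brand-new list; the caller's list is untouched.
    items = items + [value]
    return items


data = [1, 2]
append_in_place(data, 3)
extended = append_to_copy(data, 4)

print(data)              # → [1, 2, 3]
print(extended)          # → [1, 2, 3, 4]
print(extended is data)  # → False
```

Returning a new object instead of mutating an argument is usually the safer default, because the caller can see from the call site what changes.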

> 💡 **For people who write JavaScript / Java**: think of the feel as close to "everything, including primitives, is a reference type." But `int` and `str` are **immutable**, so even when shared, they can't be "rewritten and broken"—that's why they merely look safe.

---

## **1. Numeric types: `int` / `float` / `Decimal` / `Fraction` / `complex`**

### **1-1. `int`: arbitrary-precision integers that don't overflow**

Many languages' integers are fixed-width (32-bit / 64-bit) and **overflow** when they exceed the limit. Python's `int` is different. It's an **arbitrary-precision integer that grows as large as memory allows.**

```python
2 ** 100        # → 1267650600228229401496703205376  (no overflow)
factorial = 1
for i in range(1, 101):
    factorial *= i
len(str(factorial))   # → 158  (100! has 158 digits)
```

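This is a big advantage: the integer-overflow bugs behind many C security vulnerabilities can't occur in pure-Python `int` arithmetic. (Fixed-width integers from NumPy, `ctypes`, or `struct` can still overflow or wrap.) The cost is speed and memory, but it's not a problem in normal app development.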

Literals can change base, and `_` can be used as a digit separator (Python 3.6+, PEP 515). Use it actively for readability.

```python
0b1010        # binary → 10
0o17          # octal → 15
0xFF          # hexadecimal → 255
1_000_000     # → 1000000   (underscores are ignored; they are there for readability)
```

`int` also supports bit operations (`&` `|` `^` `~` `<<` `>>`) and bit inspection (`(255).bit_length()` → 8; `(7).bit_count()` → 3, Python 3.10+), which is enough for flag management and low-level processing.

### **1-2. `float`: fast but "not exact"**

`float` is a **64-bit IEEE 754 double-precision floating-point number.** The CPU handles it directly so it's fast, but **it can't exactly represent most decimal fractions.** This isn't a Python bug but the mathematical fact that `0.1` can't be represented in finite digits in binary (values such as `0.5` or `0.25`, which are sums of powers of two, are exact).

```python
0.1 + 0.2          # → 0.30000000000000004   (not 0.3!)
0.1 + 0.2 == 0.3   # → False
```

This behavior that always surprises beginners **becomes a fatal bug as-is in domains where exactness is a requirement, like finance, billing, and inventory.** When comparing floats, compare with a tolerance, not strict equality (Python 3.5+, PEP 485).

```python
import math
math.isclose(0.1 + 0.2, 0.3)   # → True   (compares within a relative/absolute tolerance)
```

And another trap is `round()`. Python's `round()` is not the "round half up" you learned in school but **banker's rounding (round half to even).** It's the correct behavior for suppressing statistical bias, but without knowing it you get confused: "why doesn't it round up?"

```python
round(0.5)    # → 0   (0.5 goes to the even number 0)
round(1.5)    # → 2
round(2.5)    # → 2   (2.5 goes to the even number 2, not 3)
```
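
If the business rule really is "round half up," one option is `Decimal.quantize()` with an explicit rounding mode (the `decimal` module is covered in section 1-3). The sketch below also shows that `round()` on a `float` is affected by the binary representation of the value, not only by the rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# 2.675 is stored as a binary float slightly below 2.675, so it rounds down.
print(round(2.675, 2))   # → 2.67

# Built from strings, the Decimal values are exact and the mode is explicit.
half_up = Decimal("2.675").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
whole = Decimal("2.5").quantize(Decimal("1"), rounding=ROUND_HALF_UP)
print(half_up)           # → 2.68
print(whole)             # → 3
```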

Remember the special values too. There are `float('inf')` (infinity) and `float('nan')` (not a number), and **`nan` isn't even equal to itself** (`nan == nan` is `False`)—a property that tends to break conditional branches. Always use `math.isnan()` to test for `nan`.

### **1-3. [Most important] Don't hold money in `float` — `Decimal` is the right answer**

This is the most practically relevant section in this article. **The moment you handle amounts, currency, tax rates, or billing in `float`, your system is fated to "drift."**

```python
# Anti-pattern: accumulating an amount in float
total = 0.0
for _ in range(10):
    total += 0.1
print(total)        # → 0.9999999999999999   (not 1.0)
```

A 1-yen drift, piled up over 100,000 payments, stops adding up in accounting and becomes a breeding ground for double charges and missed refunds. The right answer is `decimal.Decimal`. **It exactly holds decimal numbers and lets you explicitly control the rounding mode.**

```python
from decimal import Decimal, ROUND_HALF_UP

# (1) Always construct from a string (constructing from a float inherits its error)
price = Decimal("19.99")
tax_rate = Decimal("0.10")

tax = (price * tax_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)  # 1.9990 → 2.00
total = price + tax
print(total)        # → 21.99   (exact)

# (2) Constructing from a float carries the error in; avoid it
Decimal(0.1)        # → Decimal('0.1000000000000000055511151231257827021181583404541015625')
Decimal("0.1")      # → Decimal('0.1')   constructing from a string is the iron rule
```

In practice, there are two choices at the design level.

| Method | Internal representation | Suited case | Caveat |
| --- | --- | --- | --- |
| `Decimal` | Decimal floating-point with configurable precision | Forex, tax calculation, complex rounding | Be strict with type conversion on DB round-trips |
| Integer smallest unit | Yen, sen, or cents as `int` | High-frequency, high-performance payment processing | Agree across the team on the per-currency scale factor applied at display time (100 for cents, 1 for yen) |

For example, the design of "always hold amounts as `int` of cents (the smallest currency unit), and convert to currency only at the moment of display" is the standard adopted by many payment systems including Stripe. On the platform where I led the payment-reliability layer, I **guarded balance updates with atomic transactions and idempotency keys, and completely eliminated `float` from amount representation**, achieving zero double charges in production. "Don't hold money in `float`" is not a slogan but **a rule written in the blood of the front line.**
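
A minimal sketch of the integer-smallest-unit approach, assuming amounts arrive as decimal strings. `to_minor_units` and `format_minor_units` are hypothetical helper names, and `exponent` is the number of decimal places of the currency (2 for cents, 0 for yen); this is an illustration, not the API of any payment provider.

```python
from decimal import Decimal


def to_minor_units(amount: str, exponent: int = 2) -> int:
    """Convert a decimal string such as "19.99" into integer minor units."""
    scaled = Decimal(amount).scaleb(exponent)
    if scaled != scaled.to_integral_value():
        # More decimal places than the currency allows: refuse instead of rounding.
        raise ValueError(f"{amount!r} has more than {exponent} decimal places")
    return int(scaled)


def format_minor_units(minor: int, exponent: int = 2) -> str:
    """Convert integer minor units back to a display string."""
    return str(Decimal(minor).scaleb(-exponent))


cart = ["19.99", "0.01", "5.00"]
total_minor = sum(to_minor_units(price) for price in cart)  # plain int arithmetic

print(total_minor)                      # → 2500
print(format_minor_units(total_minor))  # → 25.00
```

All arithmetic between the two helpers happens on `int`, so no rounding can occur until the single, explicit conversion at display time.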

> 💡 **`Fraction` as a choice**: if you want to exactly hold a rational number as "numerator/denominator," `fractions.Fraction("1/3")` works. `Fraction(1, 3) * 3 == 1` (zero error). It shines in cumulative calculations of probabilities and ratios.

### **1-4. `complex`: complex numbers for scientific computing**

Python has complex numbers **built into the language.** The imaginary unit is `j`. Used in signal processing, electrical circuits, Fourier transforms, etc.

```python
z = 3 + 4j
z.real          # → 3.0
z.imag          # → 4.0
abs(z)          # → 5.0   (the absolute value, i.e. magnitude, of the complex number)
```

### **1-5. `bool`: actually a subclass of `int` (a hidden trap)**

`True` / `False` are booleans, but in Python **`bool` is a subclass of `int`**, with `True == 1` and `False == 0`. This is convenient but breeds silent bugs.

```python
True + True         # → 2          (bool is an int, so arithmetic works)
sum([True, False, True])  # → 2    (idiom for counting the True values in an iterable)

isinstance(True, int)     # → True  this is the pitfall
```

The last line is the problem. Even if you intend `isinstance(x, int)` to "let through only integers," **`True` / `False` slip through.** When writing integer validation at a boundary, check strictly with `type(x) is int`, or use a validator in strict mode (for example Pydantic's `StrictInt`), which rejects `bool` for integer fields. Pydantic's default lax mode coerces `True` to `1`.
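
A minimal sketch of such a strict check; `require_int` is a hypothetical helper name used only for illustration.

```python
def require_int(value: object) -> int:
    # type(...) is int rejects bool, because type(True) is bool, not int.
    if type(value) is not int:
        raise TypeError(f"expected int, got {type(value).__name__}")
    return value


print(require_int(42))  # → 42

try:
    require_int(True)
except TypeError as exc:
    print(exc)          # → expected int, got bool
```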

"Truthiness" is an important concept too. The condition of an `if` is evaluated even if it isn't `bool`, and **empty collections, `0`, `""`, and `None` are treated as falsy.**

```python
items = []
if not items:           # an empty list is falsy → Pythonic
    print("empty")

# Anti-pattern: if len(items) == 0:  redundant
# Anti-pattern: if items == []:      brittle, depends on the concrete type
```

> ⚠️ **The truthiness pitfall**: judging "was a value passed" with `if value:` mis-judges even **legitimate values** like `0`, `""`, or an empty list as "unspecified." When you want to distinguish "unspecified," use the `if value is None:` described later.
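
A short sketch of the pitfall, using a hypothetical retry setting where `0` ("never retry") is a legitimate value. The function names and the default of 3 are illustrative.

```python
DEFAULT_RETRIES = 3  # illustrative default


def retries_buggy(retries=None):
    # `or` treats 0 as "not given" and silently replaces it.
    return retries or DEFAULT_RETRIES


def retries_fixed(retries=None):
    # Only None means "not given"; 0 is passed through unchanged.
    return DEFAULT_RETRIES if retries is None else retries


print(retries_buggy(0))  # → 3
print(retries_fixed(0))  # → 0
print(retries_fixed())   # → 3
```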

---

## **2. Strings `str`: an immutable sequence of Unicode code points**

### **2-1. The essence of str: immutable, Unicode, a sequence**

`str` is an **immutable sequence of Unicode code points.** Three keywords explain everything.

- **Immutable**: a string once made can't be changed. `s[0] = "X"` is an error. "Changing" is always **creating a new string.**
- **Unicode**: `len("こんにちは")` is 5 (the character count, not the byte count). `len()` counts code points, so emoji and combining characters that span several code points count as more than one.
- **Sequence**: indexable, sliceable, iterable.

```python
s = "Python"
s[0]            # → 'P'
s[-1]           # → 'n'
s[1:4]          # → 'yth'   (slice: returns a new str without touching the original)
s[::-1]         # → 'nohtyP' (idiom for reversing)
len(s)          # → 6
```

String literals are varied. What to grasp in practice is the following.

```python
name = "友田"
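# f-string (3.6+): the most recommended way to format strings
greeting = f"Hello, {name}"
# f-string "=" debugging (3.8+): prints the variable name and value together
value = 42
print(f"{value=}")          # → value=42

raw = r"C:\Users\path"      # raw string: \ is not treated as an escape (essential for regexes and paths)
multi = """Multiple lines
written as-is"""            # triple quotes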
```

### **2-2. Why you must not "concatenate in a loop with +="**

Because strings are immutable, `+=` in a loop **rebuilds a new string each time**, becoming O(n²) at worst. The right answer is `str.join()`.

```python
# Anti-pattern: can become O(n²)
result = ""
for word in words:
    result += word          # creates a new object each time

# Correct: O(n), and more readable
result = "".join(words)
```

> 💡 CPython has an implementation that optimizes `+=`, but **it's implementation-dependent and not guaranteed.** `join` is fast as a spec and clear in intent. "Choosing the right idiom" is a matter not only of performance but of **readability and portability.**

Group the main methods as "search, transform, split, join, test" and they're easy to memorize and apply: `find()` / `strip()` / `lower()` / `upper()` / `replace()` / `format()` / `split()` / `join()` / `startswith()` / `endswith()`. For case-insensitive comparison, the right answer is `casefold()` (a stricter Unicode folding), not `lower()`.
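
The difference between `lower()` and `casefold()` is easiest to see with the German "ß". A minimal sketch; `same_text` is a hypothetical helper name.

```python
def same_text(a: str, b: str) -> bool:
    # casefold() applies full Unicode case folding, e.g. "ß" becomes "ss".
    return a.casefold() == b.casefold()


print("Straße".lower())                            # → straße
print("Straße".casefold())                         # → strasse
print("Straße".lower() == "STRASSE".lower())       # → False
print(same_text("Straße", "STRASSE"))              # → True
```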

### **2-3. `str` and `bytes`: the boundary of text and binary**

This is a wall you always hit when handling networks, files, or crypto.

- **`str`**: text humans read (Unicode code points).
- **`bytes`**: a raw byte sequence machines handle (an immutable sequence of 0–255). Written with `b"..."`.

Conversion between the two goes through an **explicit encoding.** `str → bytes` is `encode()`, `bytes → str` is `decode()`. Most mojibake is caused by **leaving the encoding implicit** here. Always specify `utf-8`.

```python
text = "日本語"
data = text.encode("utf-8")   # → b'\xe6\x97\xa5\xe6\x9c\xac\xe8\xaa\x9e' (9 bytes)
data.decode("utf-8")          # → '日本語'

len(text)                     # → 3   (number of characters)
len(data)                     # → 9   (number of bytes; these characters take 3 bytes each in UTF-8)
```

If you need a mutable byte sequence, use `bytearray`; to peek at a byte sequence without a memory copy, `memoryview`. In a backend handling large binaries, this distinction affects memory efficiency.
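
A small sketch of how the two fit together: a `memoryview` slice is a window onto the underlying `bytearray`, so no bytes are copied and writes go through to the original buffer.

```python
buffer = bytearray(b"hello world")
view = memoryview(buffer)

head = view[:5]          # a window onto the first 5 bytes; nothing is copied
head[0] = ord("H")       # writes through to the underlying bytearray

print(bytes(head))       # → b'Hello'
print(buffer)            # → bytearray(b'Hello world')
```

Slicing `bytes` or `bytearray` directly would instead allocate a new object for every slice, which is what adds up when parsing large payloads.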

---

## **3. `None`: the sole existence representing "no value"**

`None` is a special object representing "no value," "unset," "not applicable," with type `NoneType`, and **only one of it exists in the entire program (a singleton).** That's exactly why the iron rule for comparing `None` is to **use `is`, not `==`.**

```python
result = None

if result is None:        # correct: compare by identity
    ...

# Anti-pattern: == can misbehave with types that override __eq__
# if result == None:
```

Functions returning `None` (no matching record in the DB, etc.) are frequent. In type hints, express "an `X`, or `None` if absent" with `X | None` (Python 3.10+; before that `Optional[X]`). This is a **contract** that "forces a `None` check on the caller," and a way to catch the most frequent runtime error, `AttributeError: 'NoneType' object has no attribute ...`, with static analysis.

```python
def find_user(user_id: int) -> "User | None":
    ...

user = find_user(1)
user.name             # mypy / pyright warn "this may be None" → prevents the accident
if user is not None:
    user.name         # safe here
```

> 💡 **The sentinel pattern (advanced)**: in an API where `None` itself can be a legitimate value (e.g., "the key exists but the value is `None`"), you can't use `None` as the mark of "unspecified." In that case, make a **dedicated sentinel object** like `_MISSING = object()` and distinguish with `if value is _MISSING:`. `dict.pop(key[, default])` needs the same distinction: with no default it raises `KeyError`, while an explicit `None` default is returned as-is. The standard library's `dataclasses.MISSING` is another example.
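
A minimal sketch of the sentinel pattern, assuming a hypothetical profile update where passing `None` means "clear the nickname" and omitting the argument means "leave it alone." `update_nickname` is an illustrative name.

```python
_MISSING = object()  # unique marker; no caller can pass it by accident


def update_nickname(profile: dict, nickname=_MISSING) -> dict:
    if nickname is _MISSING:
        return dict(profile)                   # argument omitted: leave as-is
    return {**profile, "nickname": nickname}   # explicit None clears the value


profile = {"id": 1, "nickname": "tomo"}
print(update_nickname(profile))        # → {'id': 1, 'nickname': 'tomo'}
print(update_nickname(profile, None))  # → {'id': 1, 'nickname': None}
```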

---

## **4. Collections: choose by mutability, order, hashability, and complexity**

This is the domain where design skill shows most. Don't choose `list` / `tuple` / `dict` / `set` "by feel"—judge by **four axes.**

1. **Mutable or immutable**: do you change the contents after making it.
2. **Does it preserve order**: does the ordering have meaning.
3. **Hashable**: can it be a `dict` key or a `set` element (for built-in types this effectively means being immutable).
4. **Complexity (Big-O)**: does that operation scale.

### **4-1. `list`: a mutable dynamic array**

Ordered, mutable, allows duplicates—the most general-purpose collection. Internally a dynamic array, so **appending to the tail is fast (amortized O(1)), but inserting/removing at the head is slow (O(n)).**

```python
nums = [3, 1, 4, 1, 5]
nums.append(9)          # append to tail: amortized O(1)
nums.insert(0, 2)       # insert at head: O(n) (shifts every element)
nums.sort()             # in-place sort: O(n log n)
9 in nums               # membership test: O(n), slow when the list is large

squares = [x * x for x in range(5)]   # list comprehension: fast and readable
```

> 💡 **If head operations are frequent, `collections.deque`**: a double-ended queue. `appendleft` / `popleft` are O(1). Code using `list.pop(0)` for a FIFO queue or sliding window gets dramatically faster just by switching to `deque`. The Real Python intro article doesn't touch it, but it's a frequent optimization on the front line.
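
A sketch of the sliding-window case with `deque`. `moving_average` is a hypothetical helper; `maxlen` makes the deque drop the oldest element automatically when a new one is appended.

```python
from collections import deque


def moving_average(values: list[float], window: int) -> list[float]:
    recent: deque[float] = deque(maxlen=window)  # oldest item falls off the left in O(1)
    averages = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            averages.append(sum(recent) / window)
    return averages


print(moving_average([1, 2, 3, 4, 5], window=3))  # → [2.0, 3.0, 4.0]

queue = deque(["job-1", "job-2"])
queue.append("job-3")
print(queue.popleft())                            # → job-1  (FIFO, O(1))
```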

### **4-2. `tuple`: immutable and lightweight, so it can be a "key"**

A `tuple` is like an immutable `list`, but its use is clearly different. Use it to represent **"a set of data that doesn't change / mustn't be changed."** Being immutable, it's **hashable** and can be a `dict` key or a `set` element.

```python
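point = (35.6895, 139.6917)     # latitude and longitude: a meaningful fixed pair
# point[0] = 0  → TypeError (immutable, so it's safe)

# Multiple return values are actually a tuple
def divmod_(a, b):
    return a // b, a % b        # a (quotient, remainder) tuple
q, r = divmod_(17, 5)           # unpacking assignment

cache = {(35.68, 139.69): "Tokyo"}   # a tuple as a dict key (not possible with a list)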
```

When you want to "declare read-only via the type" or "have meaning as a set like coordinates or a key," choose `tuple` over `list`. This alone makes the intent clear: a type checker flags mistaken writes before the code runs, and at runtime they raise `TypeError` instead of silently changing data.

### **4-3. `dict`: a key-value mapping (preserves insertion order)**

`dict` is an average-O(1) mapping from key to value. **Since Python 3.7, insertion-order preservation is guaranteed as a language spec** (in 3.6 it was a CPython implementation detail). Keys must be **hashable** (in practice, immutable types such as `str`, `int`, or `tuple`).

```python
user = {"id": 1, "name": "友田", "role": "engineer"}

user.get("email")               # None if the key is missing (no KeyError)
user.get("email", "not set")    # lookup with a default
user.setdefault("tags", []).append("python")  # create if missing, then operate on it

# Comprehension and merge (the | operator, 3.9+)
squared = {k: v * v for k, v in {"a": 2, "b": 3}.items()}
merged = {"a": 1} | {"b": 2}    # → {'a': 1, 'b': 2}
```

`.get()` / `.setdefault()` to avoid `KeyError`, the aggregation standard `collections.Counter`, and `collections.defaultdict` that auto-generates on a missing key make front-line `dict` operations one notch cleaner.

```python
from collections import Counter, defaultdict

Counter("mississippi")          # → Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})

groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)   # the 'a' / 'b' keys are created automatically
```

### **4-4. `set` / `frozenset`: deduplication and fast membership**

A `set` is an **unordered, non-duplicating collection whose elements are hashable.** Its biggest value is that **membership tests are O(1)** and **set operations** can be written at the language level.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

a & b           # intersection (common elements) → {3, 4}
a | b           # union → {1, 2, 3, 4, 5, 6}
a - b           # difference → {1, 2}
a ^ b           # symmetric difference → {1, 2, 5, 6}

# Deduplication idiom
unique = list(set([1, 1, 2, 2, 3]))   # → [1, 2, 3] (note: the order is not guaranteed)

3 in a          # membership: O(1), decisively different from list's O(n)
```

**Code that "repeats `x in some_list` on a large amount of data" turns O(n²) into O(n) just by changing `some_list` to a `set`.** This is one of the most cost-effective optimizations on the front line. `frozenset` is the immutable version and can be a `dict` key or a `set` element.
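
A minimal sketch of that rewrite, plus an order-preserving deduplication that relies on `dict` keeping insertion order. `filter_known` is a hypothetical function name.

```python
def filter_known(candidates: list[int], known_ids: list[int]) -> list[int]:
    known = set(known_ids)  # one O(n) pass up front
    # Each `in` test is now O(1) on average instead of scanning a list.
    return [c for c in candidates if c in known]


print(filter_known([5, 1, 9, 3], [1, 2, 3]))   # → [1, 3]

# Deduplicate while keeping first-seen order (dict preserves insertion order).
print(list(dict.fromkeys([3, 1, 3, 2, 1])))    # → [3, 1, 2]
```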

### **4-5. Complexity quick reference (this is the core of "which to use")**

The average complexity of major operations. Type selection ultimately consolidates into this table.

| Operation | list | deque | dict | set |
| --- | --- | --- | --- | --- |
| Append to tail | Amortized O(1) | O(1) | — | — |
| Append to head | O(n) | O(1) | — | — |
| Index access | O(1) | O(n) | — | — |
| Key / element search (in) | O(n) | O(n) | O(1) | O(1) |
| Get value by key | — | — | O(1) | — |
| Remove element (arbitrary) | O(n) | O(n) | O(1) | O(1) |

The decision guideline is simple. **"For ordered iteration, `list`; for both-end operations, `deque`; for lookup by key, `dict`; for existence tests and deduplication, `set`; for an unchanging set, `tuple`."**

---

## **5. The top 3 production bugs caused by mutability (this is where you differentiate)**

This is the section converting data-type knowledge into accident prevention. The bulk of dynamically-typed Python bugs consolidate into these three.

### **5-1. Mutable default arguments (Python's worst trap)**

A function's default argument is **evaluated only once at function-definition time and shared across calls.** Make a mutable object the default, and the result of the previous call leaks into the next.

```python
# Anti-pattern: an empty list as the default
def add_item(item, basket=[]):
    basket.append(item)
    return basket

add_item("apple")     # → ['apple']
add_item("banana")    # → ['apple', 'banana']  the previous 'apple' is still there!
```

The right answer is the **`None` sentinel.** Make "create a new list per call" explicit.

```python
def add_item(item, basket=None):
    if basket is None:
        basket = []
    basket.append(item)
    return basket
```

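Type hints don't catch this trap. Linters do (for example Pylint's `dangerous-default-value` and the flake8-bugbear / Ruff rule `B006`), but without one, only review stops it. That's exactly why having **"mutable default → immediately None sentinel"** writable by reflex is a professional minimum.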

### **5-2. Shared references and "shallow copies"**

Where Section 0's "assignment isn't a copy" bares its fangs is copying. `copy()` and slicing are **shallow copies**—they duplicate the first level, but **leave nested elements shared.**

```python
import copy

original = [[1, 2], [3, 4]]
shallow = original[:]              # shallow copy
shallow[0].append(99)
print(original)                   # → [[1, 2, 99], [3, 4]]  the inner lists are shared!

deep = copy.deepcopy(original)    # deep copy: duplicates recursively, fully independent
```

In scenes like "duplicate a config dict and change only part of it" or "reuse a test fixture," this silently contaminates data. **If there's nesting, `deepcopy`, or design with immutable types (`tuple` / `frozenset`) in the first place** is safe.
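
The config-dict scenario can be sketched as follows. `DEFAULTS` and its keys are illustrative values, not taken from a real project.

```python
import copy

DEFAULTS = {"db": {"host": "localhost", "port": 5432}, "debug": False}

shallow = dict(DEFAULTS)          # copies only the top level
shallow["db"]["port"] = 6543      # mutates the nested dict shared with DEFAULTS
print(DEFAULTS["db"]["port"])     # → 6543  (the defaults were contaminated)

DEFAULTS["db"]["port"] = 5432     # restore for the second half of the demo
override = copy.deepcopy(DEFAULTS)
override["db"]["port"] = 6543     # touches only the independent copy
print(DEFAULTS["db"]["port"])     # → 5432
```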

### **5-3. Trying to make an unhashable type a key**

A `dict` key or a `set` element must be **hashable.** `list` / `dict` / `set` are mutable so unhashable, and `int` / `str` / `tuple` (if all contents are hashable) are hashable.

```python
{[1, 2]: "x"}              # TypeError: unhashable type: 'list'
{(1, 2): "x"}              # OK: a tuple is hashable
{(1, [2]): "x"}            # TypeError: a tuple containing a list is not hashable
```

Turning this property around, **"declaring a 'value that should be immutable' via the type with hashability"** is an advanced design. Hold coordinates and keys as `tuple`, and a set of constants as `frozenset`—the type itself becomes documentation that "this must not be changed."
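
One place where this pays off is an unordered pair used as a key. The sketch below uses a hypothetical fare table; the station names and amounts are made up for illustration.

```python
# Hypothetical fare table keyed by an unordered pair of stations.
FARES = {
    frozenset({"A", "B"}): 200,
    frozenset({"B", "C"}): 350,
}


def fare_between(origin: str, destination: str) -> int:
    # frozenset ignores order, so (A, B) and (B, A) hit the same key.
    return FARES[frozenset({origin, destination})]


print(fare_between("A", "B"))        # → 200
print(fare_between("B", "A"))        # → 200
print(hasattr(frozenset(), "add"))   # → False  (no mutating methods exist)
```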

### **Bonus: `is` and `==`, and small-integer caching**

- **`==`** compares **value** (calls `__eq__`).
- **`is`** compares **identity** (whether it's the same object).

Use `is` only for `None` / `True` / `False` / sentinels, and **never for value comparison.** The reason is CPython's "small-integer caching." Because CPython reuses `int`s from `-5` to `256`, an **implementation-dependent trap** like the following arises.

```python
a = 256; b = 256
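a is b          # → True   (the same cached object)

a = 257
b = int("257")  # created at runtime, so it can't be merged with the literal above
a is b          # → False on CPython (a separate object; implementation-dependent)
a == b          # → True   value comparison is always right; use this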
```

Code relying on `a is b` being `True` breaks the moment the number exceeds 256. Mechanically keep **"equality is `==`, identity is `is`."**

---

## **6. Checking the type: `isinstance()` rather than `type()`, and duck typing**

There are two ways to check the type at runtime.

```python
type(42) is int            # strictly int? (subclasses are rejected)
isinstance(42, int)        # int or one of its subclasses?
isinstance(x, (int, float))  # any one of several candidates
```

As a principle, **use `isinstance()`.** It respects inheritance and abstract base classes, so it's more flexible and correct. But, as seen in Section 1-5, use `type(x) is int` only when you need strict judgment like "reject `bool` and let through only `int`."

Even more Pythonic is **duck typing**—the idea of judging "not what type it is, but whether it has the needed behavior." Using the abstract base classes of `collections.abc`, you can judge "is it iterable" or "is it a mapping" without binding to a concrete type.

```python
from collections.abc import Iterable, Mapping

def total(values):
    if not isinstance(values, Iterable):   # a list, a set, or a generator are all OK
        raise TypeError("an iterable is required")
    return sum(values)
```

This is "depend on the abstraction (protocol), not the concrete type" in practice: a design that stays easy to extend (the ETC, "Easier To Change," principle).
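
For static checking, the same idea can be expressed with `typing.Protocol` (Python 3.8+). A minimal sketch; `SupportsArea`, `Square`, and `Rect` are hypothetical names. Neither class inherits from the protocol; having a matching `area()` method is enough for a type checker.

```python
from typing import Protocol


class SupportsArea(Protocol):
    def area(self) -> float: ...


class Square:
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Rect:
    def __init__(self, width: float, height: float) -> None:
        self.width, self.height = width, height

    def area(self) -> float:
        return self.width * self.height


def total_area(shapes: list[SupportsArea]) -> float:
    # Depends only on the behavior (an area() method), not on a concrete class.
    return sum(shape.area() for shape in shapes)


print(total_area([Square(2.0), Rect(2.0, 3.0)]))  # → 10.0
```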

---

## **7. Giving "static safety" to a dynamic language: type hints**

Python is dynamically typed, but with **type hints (PEP 484)** you can annotate types, and static analyzers like `mypy` / `pyright` **detect bugs before runtime.** Type hints aren't enforced at runtime, but in modern production Python they're **effectively mandatory.**

```python
def greet(name: str, times: int = 1) -> str:
    return f"Hello, {name}! " * times

# From 3.9, built-in types are generic as-is (PEP 585); the `X | None` syntax needs 3.10+ (PEP 604)
def first(items: list[int]) -> int | None:
    return items[0] if items else None

from typing import Final, Literal
MAX_RETRIES: Final = 3                          # reassignment is rejected statically
def set_mode(mode: Literal["r", "w", "a"]) -> None: ...   # the allowed values are restricted by the type
```

On my projects, I'm thorough with the discipline of **"ban `Any` and its equivalents (giving up on types), fix types at the boundary, and make type checking mandatory in CI."** Even in a dynamic language, this gets you close to a "make invalid states unrepresentable" design. The practice of pushing the same philosophy to its limit in TypeScript is consolidated in [The discipline of TypeScript type safety (Zod, NeverError, no-any)](/blog/typescript-type-safety-discipline-zod-nevererror-no-any). The languages differ, but the principle **"validate at the boundary, defend the inside with types"** is completely common.

---

## **8. Beyond standard types, "design your own type"**

The difference between world-class code and ordinary code shows in **whether you "use built-in types as-is" or "design a type suited to the domain."** Stop expressing everything with `dict`, and let the intent speak through the type.

### **8-1. `dataclass`: the first choice for structured data**

`@dataclass` (3.7+) auto-generates `__init__` / `__repr__` / `__eq__`, eliminating boilerplate. **`frozen=True` makes it immutable, and `slots=True` (3.10+) improves memory efficiency and speed.**

```python
from dataclasses import dataclass, field

@dataclass(frozen=True, slots=True)
class Money:
    amount: int          # held in the smallest currency unit (yen for JPY, cents for USD)
    currency: str = "JPY"

@dataclass
class Order:
    id: int
    items: list[str] = field(default_factory=list)   # the correct way to write a mutable default

m = Money(1999)          # immutable, so it's hashable and safe to share
```

Note `field(default_factory=list)`. This is **the official practice by which dataclass correctly solves** Section 5-1's mutable-default problem.

### **8-2. `Enum`: stop "string constants"**

Hold states or kinds as raw strings, and a typo like `"acitve"` doesn't surface until runtime. With `Enum` (`StrEnum` is 3.11+), make **the possible values a closed set** and you can exclude invalid values via the type.

```python
from enum import StrEnum

class OrderStatus(StrEnum):
    PENDING = "pending"
    PAID = "paid"
    SHIPPED = "shipped"

status = OrderStatus.PAID
status == "paid"         # → True (a StrEnum member is also a str)
# OrderStatus("unknown") → ValueError (invalid values are rejected immediately)
```

### **8-3. `NamedTuple` / `TypedDict`: lightweight typing**

- **`NamedTuple`**: immutable, tuple-compatible, when you want to name the fields.
- **`TypedDict`**: when you want to declare "the shape of a `dict`" via the type (ideal for typing API responses).

```python
from typing import NamedTuple, TypedDict

class Point(NamedTuple):
    x: float
    y: float

class UserDict(TypedDict):
    id: int
    name: str
    email: str | None
```

Just by replacing "code that passes `dict` around" with `dataclass` / `TypedDict`, you get working IDE completion, fewer typos, and safer refactoring. This is **a direct investment in maintainability.**

---

## **9. At the system boundary, "validate" the type: Pydantic / marshmallow**

Finally, connect the knowledge so far to **production architecture.** Type hints are "a shield protecting internal code," but **they aren't enforced at runtime.** So **data coming from outside the system boundary—HTTP requests, external-API responses, environment variables, message queues—must always be validated at runtime.** "Don't trust data coming from outside"—this is the first principle of a secure backend.

What stands at this boundary is **Pydantic v2** and **marshmallow.**

```python
from pydantic import BaseModel, EmailStr, Field  # EmailStr needs the optional email-validator package

class CreateUser(BaseModel):
    name: str = Field(min_length=1, max_length=50)
    email: EmailStr
    age: int = Field(ge=0, le=150)

# Passing an invalid dict (external input) is stopped with a ValidationError
user = CreateUser.model_validate({"name": "友田", "email": "a@example.com", "age": 30})
```

By now you've understood it. **Pydantic / marshmallow are devices that turn the "types" learned in this article into runtime contracts.** The range of an `int`, the length of a `str`, `None` tolerance (Optional), nested structure—declaratively validate all of them, and **let through only trustworthy data to the inside.** Keeping a dynamic language's flexibility while acquiring static-language-grade robustness at the boundary is the endpoint of modern Python backend design.

- Type-first boundary validation → [Pydantic v2 practical guide](/blog/pydantic-v2-production-validation-type-safety)
- ORM / framework-independent serialization/validation → [marshmallow practical guide](/blog/marshmallow-python-serialization-validation-production-guide)
- Input validation at the web-framework layer → [FastAPI production-operations guide](/blog/fastapi-production-async-pydantic-observability-guide) and [FastAPI request validation](/blog/fastapi-request-validation-query-path-body-parameters-guide)
- Type safety at the persistence layer → [SQLAlchemy 2.0 practical guide](/blog/sqlalchemy-2-typed-orm-production-guide)
- Designing money and idempotency → [Idempotency design to prevent double charges in payments](/blog/payment-double-charge-prevention-idempotency-procurement-guide)

---

## **Summary: design data types as "constraints"**

Python's data types are not a list to memorize but **"a vocabulary for expressing correctness, speed, and safety through the structure of code."** Take this article's points home as decision axes.

1. **Everything is an object.** A variable is a reference. So mutability, `is`/`==`, and copying behavior all connect on a single line.
2. **Don't hold money in `float`.** `Decimal` (created from a string) or an integer smallest unit. This is the front-line iron rule.
3. **Choose collections by complexity.** For existence tests and deduplication, `set` / `dict` (O(1)), not `list`.
4. **Prevent mutability bugs by reflex.** Mutable defaults get the `None` sentinel, nesting gets `deepcopy` or immutable types.
5. **Design types.** Make invalid states unrepresentable with `dataclass` / `Enum` / `TypedDict`, and validate the boundary with Pydantic / marshmallow.

It's not "dynamic typing, so it's OK to be sloppy." **Precisely because it's dynamic typing, a deep understanding of and discipline toward types separate production quality.** I have practiced this principle on Python / Flask / SQLAlchemy backends and on a payment foundation that achieved zero double charges in production. The mindset of "designing data types as constraints" is what eliminates test-evading bugs in advance and builds a change-resistant codebase.

---

## **Frequently Asked Questions (FAQ)**

### Q. How many Python data types do I need to memorize in the end?

What's frequent in practice is about 10: **numbers (`int` / `float` / `Decimal`), strings (`str` / `bytes`), boolean (`bool`), None, and collections (`list` / `tuple` / `dict` / `set`).** First grasp their "mutability, order, hashability, complexity," and you can handle 90% of situations.

### Q. How do I use `list` and `tuple` appropriately?

**If you'll change it, `list`; for a set you don't (mustn't) change, `tuple`.** A `tuple` is immutable so hashable, and can be a `dict` key or a `set` element. Coordinates, multiple return values, and fixed records are natural as `tuple`; collections you add to/remove from are natural as `list`.

### Q. Why doesn't `0.1 + 0.2` become `0.3`? Is it a bug?

It's not a bug. `float` is binary floating-point (IEEE 754) and can't exactly represent `0.1` in finite digits. Compare with `math.isclose()`, and use `Decimal` or an integer smallest unit for money calculations.

### Q. Should I use `is` or `==`?

**Whether values are equal is `==`, whether it's the same object is `is`.** Use `is` only for judging `None` / `True` / `False` / sentinels, not for comparing numbers or strings (relying on CPython's small-integer / string caching breaks).

### Q. Do type hints take effect at runtime? Is there a point in writing them?

They aren't enforced at runtime (they're ignored), but **`mypy` / `pyright` detect bugs before runtime**, so they're effectively mandatory in production code. If you want to validate external input at runtime, combine with Pydantic or marshmallow.
