Introduction: the difference between someone who "uses" dict and someone who "designs" mappings
Anyone who writes Python uses dict every day. What sets engineers apart in the field, however, is not whether they can use dict but whether they understand the "mapping" abstraction that dict embodies and can design their own mapping type when needed.
Why does aggregation code become dramatically more readable the moment you use defaultdict or Counter? Why can ChainMap layer configuration priorities "without copying"? Why does returning an internal dict as-is from a public API cause accidents, and why is MappingProxyType needed? And why, when you inherit dict and override a method, does the override only half take effect? You can explain all of these as design decisions once you understand "the protocol (abstraction) called a mapping."
This article builds on the ground covered by Real Python's widely read "Python Mappings" (the definition of a mapping, the abstract base classes in collections.abc, the standard library's mapping family, and custom mappings) and goes a step further into the design knowledge that pays off in production. Concretely, it covers:
- The exact contract of the mapping protocol (Mapping / MutableMapping) and the methods you get for free by inheriting
- When to use defaultdict / Counter / OrderedDict / ChainMap / MappingProxyType, and their pitfalls
- Why you must not inherit dict directly, and how to build custom mappings correctly with UserDict / MutableMapping
- The __hash__ / __eq__ contract for using a custom object as a key
- Structural pattern matching, production pitfalls, and type validation at the system boundary
The author designed and implemented, in Python / Flask / SQLAlchemy / PostgreSQL, the backend of a B2B SaaS that won the METI Minister's Award, and ran it in production with strict layer separation (Router → UseCase → Repository). The principles in this article, such as "don't trust a dict that came from outside" and "expose internal state read-only," all come from that hands-on experience. The basics of dict (hashability, insertion-order preservation, complexity) were covered in the previous article, The Complete Guide to Python Data Types, so this article concentrates on the level above: designing mappings.
💡 Target versions: I assume Python 3.12 / 3.13 (most of the descriptions are valid from 3.10 on). For version-dependent features, I explicitly state the introduction version.
1. What is a mapping: dict is just "one implementation"
A mapping is a collection that represents "the correspondence from a key to a value." In the Python world, dict is the most famous and fastest implementation, but dict is not the only mapping. defaultdict, Counter, ChainMap, OrderedDict, MappingProxyType, and a class you make yourself are all "mappings."
What matters here is that "being a mapping" means "satisfying a specific behavior (protocol)," not "inheriting dict." This is the Python concrete example of the design principle "depend on the abstraction, not the concrete implementation" (ETC: Easy To Change).
What gives this "behavior contract" form is collections.abc's abstract base classes (ABCs).
from collections.abc import Mapping, MutableMapping
isinstance({}, Mapping) # → True
isinstance({}, MutableMapping) # → True
from types import MappingProxyType
isinstance(MappingProxyType({}), Mapping) # → True
isinstance(MappingProxyType({}), MutableMapping) # → False ← because it is read-only
The last example shows the power of the abstraction. MappingProxyType (read-only) is a Mapping but not a MutableMapping. So by requiring Mapping for a function argument you declare in the type that the function only reads, and by requiring MutableMapping you declare that it may write.
def render(config: Mapping[str, str]) -> str:
    # The type promises that config will not be modified (a read-only contract)
    return "\n".join(f"{k}={v}" for k, v in config.items())
2. The "contract" of the mapping protocol: what you implement, and what you get for free
This is the core of Real Python's article and the key to understanding custom mappings. Inherit collections.abc's ABC, and just by implementing a few "abstract methods," you automatically get a large number of "mixin methods."
| ABC to inherit | Abstract methods you implement yourself | Mixins you get for free |
|---|---|---|
| Mapping (read-only) | __getitem__ / __iter__ / __len__ | __contains__ / keys / items / values / get / __eq__ / __ne__ |
| MutableMapping (read/write) | The above + __setitem__ / __delitem__ | The above + pop / popitem / clear / update / setdefault |
That is, inherit MutableMapping, and just by implementing a mere 5 methods (__getitem__ / __setitem__ / __delitem__ / __iter__ / __len__), a complete mapping with nearly the same API as dict is finished. Moreover, because get, update, and __contains__ all work via the 5 methods you implemented, the behavior is consistent. This is the beauty of protocol design.
💡 Why this leads to "world-class" design: mixins are the pinnacle of DRY (Don't Repeat Yourself). Write the logic of get or update yourself, and you will almost certainly introduce a bug somewhere. Inherit an ABC, and you reuse a standard implementation that was written correctly once. You can concentrate on just "the essence specific to this mapping" (the 5 methods). This is also SRP (the single responsibility principle) in practice.
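As a small illustration of the table above, the following sketch implements only the three abstract methods of Mapping and then uses the mixins. The class name Squares and its behavior are hypothetical, chosen only for this example.

```python
from collections.abc import Iterator, Mapping


class Squares(Mapping):
    """Hypothetical read-only mapping: n -> n * n for 0 <= n < size."""

    def __init__(self, size: int) -> None:
        self._size = size

    def __getitem__(self, key: int) -> int:
        if not (isinstance(key, int) and 0 <= key < self._size):
            raise KeyError(key)  # the mixins rely on KeyError for missing keys
        return key * key

    def __iter__(self) -> Iterator[int]:
        return iter(range(self._size))

    def __len__(self) -> int:
        return self._size


sq = Squares(4)
print(sq.get(3))                       # 9   (mixin, built on __getitem__)
print(sq.get(10, -1))                  # -1
print(2 in sq)                         # True (mixin __contains__)
print(list(sq.items()))                # [(0, 0), (1, 1), (2, 4), (3, 9)]
print(sq == {0: 0, 1: 1, 2: 4, 3: 9})  # True (mixin __eq__)


class Incomplete(Mapping):
    """Implements only one of the three abstract methods."""

    def __getitem__(self, key):
        raise KeyError(key)


try:
    Incomplete()
except TypeError:
    print("cannot instantiate: abstract methods are missing")
```

The ABC also acts as a guard: a subclass that forgets an abstract method fails at instantiation time rather than at some later call.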
3. The standard library's powerful mapping family (productivity changes here)
You can write anything with dict, but choosing a mapping suited to the purpose makes the code declarative and conveys the intent at a glance. This is the practice of KISS (simplicity).
3-1. defaultdict: absorb a missing key with "automatic generation of an initial value"
The standard for aggregation and grouping. Access a non-existent key, and the default_factory creates an initial value and inserts it.
from collections import defaultdict
groups = defaultdict(list)
for name in ["apple", "avocado", "banana"]:
    groups[name[0]].append(name) # if the key is missing, [] is created automatically before append
# → defaultdict(<class 'list'>, {'a': ['apple', 'avocado'], 'b': ['banana']})
With a plain dict you would write groups.setdefault(name[0], []).append(name); defaultdict expresses the same thing declaratively. But there's a pitfall.
counts = defaultdict(int)
_ = counts["nonexistent"] # ← meant only to read, but the key gets created!
print(dict(counts)) # → {'nonexistent': 0}
The bug of "I accessed it meaning to check existence, but a key proliferated as a side effect" is frequent. When you want to touch it read-only, use .get() even with defaultdict. d[key] and d.get(key) differ in meaning.
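A minimal sketch of the read-only alternatives: membership tests and .get() do not call default_factory, while indexing does. Setting default_factory to None afterward is one possible way (an assumption about your workflow, not a requirement) to turn auto-creation off once aggregation is finished.

```python
from collections import defaultdict

counts = defaultdict(int)
counts["seen"] += 1

# Read-only checks: neither of these triggers default_factory.
print("missing" in counts)       # False
print(counts.get("missing"))     # None
print(counts.get("missing", 0))  # 0
print(len(counts))               # 1

# Indexing a missing key is what inserts it.
_ = counts["missing"]
print(len(counts))               # 2

# After aggregation, auto-creation can be switched off.
counts.default_factory = None
try:
    counts["other"]
except KeyError:
    print("KeyError: behaves like a plain dict now")
```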
3-2. Counter: a dictionary as a multiset
A dedicated mapping that counts the number of occurrences of elements. It even has most_common() and arithmetic operations.
from collections import Counter
votes = Counter(["a", "b", "a", "c", "a", "b"])
votes.most_common(2) # → [('a', 3), ('b', 2)] in descending order of frequency
votes["zzz"] # → 0 a missing key yields 0 (no KeyError, and nothing is inserted)
Counter("aab") + Counter("abc") # → Counter({'a': 3, 'b': 2, 'c': 1}) addition
Counter("aab") & Counter("abc") # → Counter({'a': 1, 'b': 1}) minimum (intersection)
Because Counter's __missing__ returns 0, votes["zzz"], unlike defaultdict, doesn't add a key. For rankings, occurrence frequencies, inventory diffs, etc., you can sweep away hand-written counting loops.
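For the inventory-diff use case mentioned above, here is a sketch with made-up stock data. It contrasts the - operator, which drops zero and negative results, with subtract(), which keeps them.

```python
from collections import Counter

# Hypothetical stock data for illustration only.
expected = Counter(apple=5, banana=2, cherry=1)
actual = Counter(apple=3, banana=2, durian=4)

# The - operator keeps only positive counts.
missing = expected - actual
surplus = actual - expected
print(dict(missing))  # {'apple': 2, 'cherry': 1}
print(dict(surplus))  # {'durian': 4}

# subtract() works in place and keeps zero and negative counts.
delta = Counter(actual)
delta.subtract(expected)
print(dict(delta))    # {'apple': -2, 'banana': 0, 'durian': 4, 'cherry': -1}
```

Which form fits depends on whether "nothing to report" entries should disappear (operator) or stay visible as signed differences (subtract).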
3-3. OrderedDict: uses that remain even now that dict preserves order
"dict preserves insertion order since 3.7, so isn't OrderedDict no longer needed?" — half correct. For normal uses, dict is enough. But there are features only OrderedDict has.
from collections import OrderedDict
od = OrderedDict.fromkeys("abcd")
od.move_to_end("a") # move 'a' to the end (dict has no such method)
od.popitem(last=False) # pop from the front → works as a FIFO queue (dict.popitem always pops the last item)
# Equality is order-sensitive
OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1) # → False order is compared too
dict(a=1, b=2) == dict(b=2, a=1) # → True order is ignored
move_to_end and popitem(last=False) are well suited to building an LRU cache (the same eviction idea that functools.lru_cache provides). In situations where "the order itself has meaning," OrderedDict is still the right answer.
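The LRU idea can be sketched in a few lines. The LRUCache class below is a hypothetical, single-threaded illustration built on OrderedDict; it is not how functools.lru_cache is implemented internally.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical minimal LRU cache (not thread-safe)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._data: OrderedDict[str, object] = OrderedDict()

    def get(self, key: str, default: object = None) -> object:
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key: str, value: object) -> None:
        self._data[key] = value
        self._data.move_to_end(key)  # a rewritten key also becomes the newest
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the least recently used

    def keys(self) -> list[str]:
        return list(self._data)  # oldest first


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # evicts "b"
print(cache.keys())  # ['a', 'c']
print(cache.get("b"))  # None
```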
3-4. ChainMap: "layer" dictionaries without copying
A ChainMap presents multiple mappings as one and searches them in order from the front. This lets you express configuration priority (CLI > environment variables > defaults) without merging (copying) the dictionaries.
from collections import ChainMap
defaults = {"theme": "light", "timeout": 30}
env = {"timeout": 60}
cli = {"theme": "dark"}
config = ChainMap(cli, env, defaults) # earlier maps take priority
config["theme"] # → 'dark' (cli wins)
config["timeout"] # → 60 (env wins; the 30 in defaults is shadowed by env)
# Writes and deletes act only on the first map
config["theme"] = "system"
cli # → {'theme': 'system'} ← defaults stays unchanged
Merging with {**defaults, **env, **cli} copies the data into a new dict, so later changes to the original layers are not reflected and you cannot swap a layer out afterward. Because ChainMap overlays the layers dynamically while keeping them separate, it suits configuration management and nested scopes (variable lookup). For full-fledged configuration and secret management, combining it with the approach in Configuration Management with Pydantic Settings is more robust.
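The "scopes" use can be sketched with new_child() and parents, both part of ChainMap's API. The variable names (request_scope and so on) are illustrative only.

```python
from collections import ChainMap

defaults = {"theme": "light", "timeout": 30}
config = ChainMap({}, defaults)  # an empty front layer absorbs writes

# new_child() puts a new layer in front without touching the others.
request_scope = config.new_child({"timeout": 5})
print(request_scope["timeout"])  # 5   (front layer wins)
print(request_scope["theme"])    # light (falls through to defaults)
print(config["timeout"])         # 30  (the outer scope is unaffected)

# Layers are referenced, not copied: later changes show through.
defaults["theme"] = "dark"
print(request_scope["theme"])    # dark

# parents is the same chain minus the front layer.
print(request_scope.parents["timeout"])  # 30

# Deleting a key that lives only in a lower layer fails.
try:
    del config["theme"]
except KeyError:
    print("KeyError: 'theme' is not in the first mapping")
```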
3-5. MappingProxyType: safely expose internal state with a read-only view
This is knowledge directly tied to security and API design that makes a difference in the field. Return a class's or module's internal dict straight outside, and the caller can rewrite it and break the internal state. types.MappingProxyType provides a read-only view of the original dict.
from types import MappingProxyType
_internal = {"version": "1.0", "debug": False}
PUBLIC = MappingProxyType(_internal) # a read-only "window"
PUBLIC["version"] # → '1.0'
PUBLIC["x"] = 1 # → TypeError: 'mappingproxy' object does not support item assignment
_internal["debug"] = True
PUBLIC["debug"] # → True ← a view, not a copy: changes to the original show through
You can achieve "internal mutable, public immutable" in 1 line. There's also the method of returning a copy with dict(_internal), but that's a snapshot — it costs a copy every time and can't track the original's updates. MappingProxyType is copy-free, always-latest, and un-rewritable — the ideal form of encapsulation. In fact, a Python class's __dict__ attribute is also exposed as this mappingproxy type.
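One way to apply this inside a class is to hand out the proxy from a property. The FeatureFlags class is a hypothetical sketch; it also shows a limit worth knowing, namely that the proxy is shallow and does not protect mutable values stored inside the mapping.

```python
from collections.abc import Mapping
from types import MappingProxyType


class FeatureFlags:
    """Hypothetical example: mutable inside, read-only outside."""

    def __init__(self) -> None:
        self._flags: dict[str, bool] = {}
        self._view = MappingProxyType(self._flags)  # created once, no copying

    def enable(self, name: str) -> None:
        self._flags[name] = True

    @property
    def flags(self) -> Mapping[str, bool]:
        return self._view


ff = FeatureFlags()
view = ff.flags
ff.enable("beta")
print(view["beta"])  # True (the view follows internal changes)

try:
    view["beta"] = False
except TypeError:
    print("TypeError: callers cannot write through the view")

# Caveat: the proxy is shallow. Mutable values can still be changed.
nested = MappingProxyType({"tags": ["a"]})
nested["tags"].append("b")
print(nested["tags"])  # ['a', 'b']

# A class's own namespace is exposed the same way.
print(isinstance(vars(FeatureFlags), MappingProxyType))  # True
```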
4. Designing a custom mapping: you must not inherit dict
When the requirements don't fit a standard type (ignore case, validate on write, normalize keys…), you make your own mapping type. Here, a very common mistake is to "inherit dict and override __getitem__."
4-1. Why direct inheritance of dict breaks
# Anti-pattern: inheriting dict and overriding __getitem__ ...
class UpperDict(dict):
    def __getitem__(self, key):
        return super().__getitem__(key.upper())
d = UpperDict()
d["ABC"] = 1
d["abc"] # → 1 (__getitem__ does take effect)
d.get("abc") # → None ← get() is implemented in C and does not call your __getitem__!
"abc" in d # → False ← __contains__ is bypassed too
dict's methods (get / update / __contains__ / ** expansion, etc.) are implemented in C, and internally don't go via your Python __getitem__. As a result, a half-broken state arises where "d["abc"] works but d.get("abc") doesn't," which is hard to debug. This is a famous Python pitfall.
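The same bypass affects the write side. In this sketch (UpperKeyDict is a hypothetical class), only plain item assignment goes through the overridden __setitem__; the constructor, update(), and setdefault() store keys untouched.

```python
class UpperKeyDict(dict):
    """Anti-pattern for illustration: tries to upper-case every key."""

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)


d = UpperKeyDict({"a": 1})  # constructor: does not call __setitem__
d["b"] = 2                  # item assignment: goes through the override
d.update({"c": 3})          # update(): does not call __setitem__
d.setdefault("d", 4)        # setdefault(): does not call __setitem__

print(d)  # {'a': 1, 'B': 2, 'c': 3, 'd': 4}
```

Only one of the four keys was normalized, which is the "half-broken" state described above seen from the write path.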
4-2. The correct answer ①: inherit collections.abc.MutableMapping
Implement the 5 abstract methods, and leave get / update / __contains__, etc. all to the ABC's mixins. Because the mixins go via your 5 methods, the behavior is completely consistent.
As a concrete example, let me make a case-insensitive mapping like HTTP headers (the same idea as the requests library's CaseInsensitiveDict).
from collections.abc import Iterator, MutableMapping
class CaseInsensitiveDict(MutableMapping):
    """A case-insensitive mapping that preserves the original key spelling."""
    def __init__(self, data: dict | None = None) -> None:
        # Internal dict: lower-cased key → (original key spelling, value)
        self._store: dict[str, tuple[str, object]] = {}
        if data:
            self.update(data) # MutableMapping.update calls __setitem__
    def __setitem__(self, key: str, value: object) -> None:
        self._store[key.lower()] = (key, value)
    def __getitem__(self, key: str) -> object:
        return self._store[key.lower()][1]
    def __delitem__(self, key: str) -> None:
        del self._store[key.lower()]
    def __iter__(self) -> Iterator[str]:
        return (original for original, _ in self._store.values())
    def __len__(self) -> int:
        return len(self._store)
headers = CaseInsensitiveDict({"Content-Type": "application/json"})
headers["content-type"] # → 'application/json' __getitem__
headers.get("CONTENT-TYPE") # → 'application/json' the mixin goes through __getitem__!
"content-Type" in headers # → True __contains__ is consistent too
list(headers) # → ['Content-Type'] the original spelling is preserved
get and in all ignore case exactly as expected. In contrast to dict inheritance's "half-broken," just implementing the abstract methods makes the whole API coherent — this is the biggest reason to use an ABC.
4-3. The correct answer ②: collections.UserDict (the dict-leaning convenient version)
If "mostly stay dict and change just part of the behavior," UserDict is handy. Because it has a real dict (self.data) inside and its methods are defined at the Python level, overrides compose correctly.
from collections import UserDict
class LoggingDict(UserDict):
    """An observable mapping that records writes (a minimal observability example)."""
    def __setitem__(self, key, value):
        print(f"[audit] set {key!r}") # in production, emit a structured log (e.g. with structlog)
        super().__setitem__(key, value)
⚠️ A caution on UserDict: UserDict defines __contains__ directly against self.data. So when you change the meaning of the key itself, as CaseInsensitiveDict does, you need to override not just __getitem__ but also __contains__. Choose accordingly: MutableMapping for complex customization that involves key transformation, UserDict for light customization.
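To make the caution concrete, here is a sketch of key normalization done with UserDict. LowerKeyDict is a hypothetical class; note that __contains__ has to be overridden alongside the item methods.

```python
from collections import UserDict


class LowerKeyDict(UserDict):
    """Hypothetical sketch: every key is normalized to lower case."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.lower())

    def __delitem__(self, key):
        super().__delitem__(key.lower())

    def __contains__(self, key):
        # UserDict checks self.data directly, so normalize here as well.
        return isinstance(key, str) and key.lower() in self.data


d = LowerKeyDict({"Content-Type": "text/html"})  # __init__ routes through update()
d.update(ACCEPT="*/*")

print("CONTENT-TYPE" in d)    # True
print(d.get("Content-Type"))  # text/html
print(list(d))                # ['content-type', 'accept']
print(d.pop("Accept"))        # */*
```

Four overrides are needed here versus five abstract methods for MutableMapping, so for key-transforming mappings the saving is small, which supports the rule of thumb in the caution above.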
4-4. A read-only custom mapping
If you just want an immutable mapping, MappingProxyType (3-5) is enough in many cases. Only when logic is needed, such as wanting to lazily compute values (dynamically compute them on access), inherit collections.abc.Mapping and implement the 3 of __getitem__ / __iter__ / __len__. Because you don't implement __setitem__, it becomes structurally un-rewritable.
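A lazily computed read-only mapping could look like the sketch below. LazyMapping, its loaders argument, and load_region are all hypothetical names. __contains__ is overridden on purpose: the mixin version calls __getitem__, which would run the loader just to answer a membership test.

```python
from collections.abc import Callable, Iterator, Mapping


class LazyMapping(Mapping):
    """Hypothetical sketch: values are computed on first access, then cached."""

    def __init__(self, loaders: Mapping[str, Callable[[], object]]) -> None:
        self._loaders = dict(loaders)
        self._cache: dict[str, object] = {}

    def __getitem__(self, key: str) -> object:
        if key not in self._cache:
            loader = self._loaders[key]  # raises KeyError for unknown keys
            self._cache[key] = loader()
        return self._cache[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._loaders)

    def __len__(self) -> int:
        return len(self._loaders)

    def __contains__(self, key: object) -> bool:
        return key in self._loaders  # answer without running the loader


calls: list[str] = []


def load_region() -> str:
    calls.append("region")  # record each real computation
    return "ap-northeast-1"


settings = LazyMapping({"region": load_region})
print("region" in settings)  # True, and nothing has been computed yet
print(calls)                 # []
print(settings["region"])    # ap-northeast-1
print(settings["region"])    # ap-northeast-1 (served from the cache)
print(calls)                 # ['region']

try:
    settings["region"] = "us-east-1"
except TypeError:
    print("TypeError: no __setitem__, so the mapping cannot be written")
```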
5. Applications that weave mappings into the data model
5-1. Make a custom object a key: the __hash__ / __eq__ contract
A mapping's key must be hashable. To make a custom class a key, you need to define __hash__ and __eq__ consistently. The iron rule is "equal objects have equal hash values." Break this, and you can't get the value out of the dictionary.
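The iron rule is easiest to see by breaking it. In this sketch (both class names are hypothetical), BrokenPoint compares by value but hashes by identity, so an equal object cannot find the stored entry. Point derives both from the same field and is assumed never to be mutated after creation.

```python
class BrokenPoint:
    """Anti-pattern: __eq__ compares by value, the hash stays identity-based."""

    def __init__(self, x: int) -> None:
        self.x = x

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, BrokenPoint):
            return NotImplemented
        return self.x == other.x

    __hash__ = object.__hash__  # violates "equal objects have equal hashes"


class Point:
    """Consistent: __eq__ and __hash__ use the same field."""

    def __init__(self, x: int) -> None:
        self.x = x  # treated as immutable by convention in this sketch

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, Point):
            return NotImplemented
        return self.x == other.x

    def __hash__(self) -> int:
        return hash(self.x)


first = BrokenPoint(1)
broken_table = {first: "v"}
print(BrokenPoint(1) == first)         # True  (equal by value)
print(BrokenPoint(1) in broken_table)  # False (different hash, lookup misses)
print(first in broken_table)           # True  (only the very same object works)

table = {Point(1): "v"}
print(Point(1) in table)               # True
```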
The safest standard play is @dataclass(frozen=True). It auto-generates __eq__ and __hash__ without contradiction.
from dataclasses import dataclass
@dataclass(frozen=True) # with frozen=True (and the default eq=True), __hash__ is generated automatically
class GeoPoint:
    lat: float
    lng: float
cities = {GeoPoint(35.68, 139.69): "Tokyo"}
cities[GeoPoint(35.68, 139.69)] # → 'Tokyo' equal values mean the same key
# Trap: with eq defined but frozen not set, __hash__ is set to None
@dataclass # eq=True (default), frozen=False (default)
class Mutable:
    x: int
{Mutable(1): "v"} # → TypeError: unhashable type: 'Mutable'
"Defining __eq__ invalidates __hash__" is Python's safety device: if an object whose contents can change is used as a key, its hash can change after insertion and the entry can no longer be found. This is exactly why the rule "if you use it as a key, make it immutable (frozen)" is correct.
5-2. Decompose a mapping with structural pattern matching (3.10+)
The match statement can be used against a mapping too, and branch on "does it have a specific key" and extract the value. You can write JSON event or command dispatch far more readably than nested if.
def handle(event: dict) -> str:
    match event:
        case {"type": "click", "x": int(x), "y": int(y)}:
            return f"click at ({x}, {y})"
        case {"type": "key", "code": str(code)}:
            return f"key {code}"
        case {"type": str(kind), **rest}: # capture the remaining keys in rest
            return f"unhandled {kind} with {rest}"
        case _:
            return "not an event"
handle({"type": "click", "x": 10, "y": 20}) # → 'click at (10, 20)'
A mapping pattern is a partial match (it matches even with extra keys). Because you can bind while guarding by type, like int(x), it can also be used for safe decomposition of external input.
5-3. The decision to stop "stringly-typed dict"
dict is powerful, but carrying everything around as dict[str, Any] is technical debt. user["emial"] (a typo) isn't revealed until runtime, and neither IDE completion nor type checking takes effect. For "data with a fixed shape," promote it from dict to a typed model.
# Anti-pattern: keeping a meaningful structure in a raw dict
def total_price(order: dict) -> float:
    return order["price"] * order["quantity"] # key names are easy to mistype, and the types are unknown
# Improvement: turn the "shape" into a type with a dataclass
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class Order:
    price: int # held in the smallest currency unit (cents)
    quantity: int
def total_price(order: Order) -> int:
    return order.price * order.quantity # completion works, and typos are caught statically
The discipline that pushes the same idea to its limit in TypeScript is summarized in TypeScript Type-Safety Discipline (Zod, NeverError, no-any). The languages differ, but the principle "give data a shape (type) and make illegitimate states inexpressible" is common.
6. Production pitfalls: accidents that actually happen with mappings
6-1. RuntimeError from modification during iteration
Change a dict's size while iterating it, and it crashes at runtime.
d = {"a": 1, "b": 2, "c": 3}
for k in d:
    if d[k] == 2:
        del d[k] # → RuntimeError: dictionary changed size during iteration
# Correct: iterate over a fixed snapshot
for k in list(d): # build the list of keys first
    if d[k] == 2:
        del d[k]
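Two related idioms, shown as a short sketch: building a filtered dict with a comprehension instead of deleting in place, and the fact that assigning to existing keys during iteration is fine because the size does not change.

```python
d = {"a": 1, "b": 2, "c": 3}

# Alternative to deleting in place: build a new, filtered dict.
d = {k: v for k, v in d.items() if v != 2}
print(d)  # {'a': 1, 'c': 3}

# Updating values of existing keys while iterating is safe:
# only adding or removing keys changes the size.
for k in d:
    d[k] *= 10
print(d)  # {'a': 10, 'c': 30}
```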
6-2. "Sharing" a mutable dict
Putting a dict in a function's default argument or in a class attribute means that one dict object is shared across calls or instances, so a change made in one place leaks everywhere. Accept the default as a None sentinel and create the dict inside the function, and use field(default_factory=dict) for a dataclass field (for details, see the mutable-default-argument section of the previous article, The Complete Guide to Python Data Types).
6-3. Don't rely on the GIL for thread safety
The folk theory "because CPython has the GIL, dict operations are thread-safe" is dangerous. A single operation like d[k] = v is atomic, but a compound operation like "check then assign" is not atomic.
# Non-atomic: another thread can run between the check and the assignment
if key not in counters:
    counters[key] = 0
counters[key] += 1 # read → add → write back is also non-atomic (updates can be lost)
Moreover, free-threaded CPython (PEP 703, introduced experimentally in 3.13) removes the GIL itself, so code that silently relied on it loses that protection. Protecting compound operations on a shared mapping with an explicit threading.Lock is the portable answer. Reliability design for concurrency goes hand in hand with resilience patterns such as Retry, Backoff, and Circuit Breaker.
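A minimal sketch of the lock-based approach: the whole read-modify-write sequence sits inside one critical section. SafeCounter and the worker function are hypothetical names, and the thread and iteration counts are arbitrary.

```python
import threading


class SafeCounter:
    """Hypothetical sketch: a shared counter mapping guarded by one lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: dict[str, int] = {}

    def increment(self, key: str) -> None:
        with self._lock:  # check + read + add + write happen as one unit
            self._counts[key] = self._counts.get(key, 0) + 1

    def snapshot(self) -> dict[str, int]:
        with self._lock:
            return dict(self._counts)  # hand out a copy, not the live dict


counter = SafeCounter()


def worker() -> None:
    for _ in range(10_000):
        counter.increment("requests")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.snapshot())  # {'requests': 40000}
```

Returning a copy from snapshot() keeps callers from iterating over the dict while another thread is changing it.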
7. Most important: don't trust a dict that comes from outside (type validation at the boundary)
Let me connect the knowledge so far to production architecture. A Web API's request body, an external API's response, a config file — the result of parsing these with json.loads() is a mere dict[str, Any] with no type guarantee at all. Pass this inward without validating, and it becomes a KeyError or TypeError, and in the worst case a security accident from malformed data.
import json
raw = json.loads('{"id": 1, "name": "友田"}') # ← the type is dict[str, Any]; the contents are not guaranteed
TypedDict can express "the shape of a dict" as a static type annotation, but doesn't validate at runtime (it's just an annotation). To guarantee at runtime "is it really this shape," use Pydantic / marshmallow.
from pydantic import BaseModel, EmailStr, Field
class CreateUser(BaseModel):
    id: int
    name: str = Field(min_length=1, max_length=50)
    email: EmailStr | None = None
# Validate the external dict and turn it into a trusted type; invalid input is stopped with a ValidationError
user = CreateUser.model_validate(raw)
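To see why TypedDict alone is not enough, the following stdlib-only sketch shows that it performs no runtime checks. UserPayload and is_user_payload are hypothetical names; the hand-written check is only there to show what a runtime validator has to do, and is not a replacement for Pydantic or marshmallow.

```python
import json
from typing import TypedDict


class UserPayload(TypedDict):
    id: int
    name: str


# The annotation is accepted at runtime even though the data is wrong.
raw: UserPayload = json.loads('{"id": "not-a-number"}')
print(type(raw))  # <class 'dict'>
print(raw)        # {'id': 'not-a-number'}

# Calling the TypedDict class also just builds a plain dict, unchecked.
payload = UserPayload(id="oops", name=123)  # a static checker flags this; runtime does not
print(payload)    # {'id': 'oops', 'name': 123}


def is_user_payload(value: object) -> bool:
    """Hand-written runtime check, for illustration only."""
    return (
        isinstance(value, dict)
        and isinstance(value.get("id"), int)
        and isinstance(value.get("name"), str)
    )


print(is_user_payload(raw))                       # False
print(is_user_payload({"id": 1, "name": "Ada"}))  # True
```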
A mapping (dict) is the "common currency" of the system boundary. That's exactly why, at the boundary, always validate and pass only "validated types" inward — this is the first principle of a secure, unbreakable backend.
- Type-first boundary validation → Pydantic v2 Practical Guide
- ORM/framework-independent serialization/validation → marshmallow Practical Guide
- Input validation at the web-framework layer → FastAPI Production-Operation Guide and FastAPI Request Validation
- Schema-validating LLM JSON output → Validating LLM Structured Output
Summary: hold mappings as a "design tool"
Python mappings are not the single type dict, but the totality of the abstraction "key → value correspondence," its rich family of implementations, and the ability to make your own. Take away the key points as decision axes.
- A mapping is a protocol. Require Mapping (read-only) / MutableMapping (read/write) in the type, and the contract becomes clear.
- Choose the implementation suited to the purpose. Aggregation is Counter, grouping is defaultdict, configuration layering is ChainMap, safe exposure is MappingProxyType.
- Don't directly inherit dict. For custom mappings, use MutableMapping (if you change the meaning of keys) or UserDict (light customization).
- A type you use as a key must keep the __hash__ / __eq__ contract. @dataclass(frozen=True) is the safe standard choice.
- Don't trust a dict at the boundary. TypedDict is just a static annotation; do runtime validation with Pydantic / marshmallow.
Anyone can just "use" dict. "Design" mappings, and the code becomes declarative, illegitimate states become inexpressible, and it becomes robust to change. The author practiced this idea in the backend of a B2B SaaS that won the METI Minister's Award, and designed the shape of data as a type at each layer of Router → UseCase → Repository. "Giving data a shape" is exactly the shortcut to eliminating bugs that slip through tests beforehand and maximizing maintainability and extensibility.
Frequently Asked Questions (FAQ)
Q. What's the difference between dict and a "mapping"?
dict is the fastest and most common implementation of the abstraction "mapping (key → value correspondence)." defaultdict / Counter / ChainMap / MappingProxyType and custom classes are mappings too. Require collections.abc.Mapping in a function argument, and you can receive "mapping-like things" in general, not limited to dict.
Q. Should I use defaultdict or dict.setdefault()?
If you repeatedly group/aggregate, defaultdict is declarative and readable. For a one-off "initialize if absent," dict.setdefault() is enough. But because defaultdict creates a key just by indexing a missing one, use .get() when you only want to read.
Q. Now that dict preserves order, isn't OrderedDict unneeded?
For normal uses, dict is enough. But if you need move_to_end() / popitem(last=False) (FIFO), or an order-distinguishing equality comparison, OrderedDict is still the correct answer. It's handy for implementing an LRU cache and the like.
Q. I want to make a custom dictionary. May I inherit dict?
Avoid it. dict's methods (get / update / in, etc.) are C implementations and don't call the __getitem__ you overrode, so the behavior only half changes and breaks. Inheriting collections.UserDict (light change) or collections.abc.MutableMapping (change the key meaning itself) is the correct answer.
Q. Can I use a dict parsed from JSON as-is?
No. The result of json.loads() is a dict[str, Any] with no type guarantee. At the boundary, runtime-validate with Pydantic / marshmallow, confirm whether id is an integer and whether the required keys exist, then pass it inward. Note that TypedDict is just a static type annotation and doesn't validate at runtime.