友田 陽大

DynamoDB Single-Table Design & Production Reliability Patterns — The Complete Guide (2026 Edition): Idempotency, Conditional Writes, and Transactions in Real Code

This guide covers DynamoDB single-table design, from access-pattern-driven key design (PK/SK, GSI overloading) through idempotency, conditional writes, atomic balance updates, TransactWriteItems, and consistency, using AWS SDK v3 TypeScript code that follows the AWS documentation.


DynamoDB is not "a fast, never-down KVS." It is a powerful set of primitives for guaranteeing correctness through the structure of your code. Combine conditional writes, atomic increment/decrement, and transactions correctly, and you can seal off double execution, broken balances, and data inconsistency "by design" — without relying on application-side locks or retry runbooks.

This article is a reference / pattern collection based on my experience designing and leading the reliability layer of a serverless multi-tenant payment platform (4 Python serverless backends + DynamoDB, owning about 60% of the repository = 403/694 commits) and maintaining 0 double charges in production. The concrete story of the payment platform is in a separate article, "Designing 'Zero Double Charges' on a Serverless Payment Platform." Complementary to that, this article systematizes only the reusable design patterns and real AWS SDK v3 TypeScript code.

All specs and limits are cross-checked against the official AWS documentation (as of June 2026).

Note: The code is an excerpt of the key points to convey design intent. Table names and attribute names are abstracted.

1. Why single-table design (breaking free from RDB thinking)

The starting point is the access pattern, not the table

In a relational DB, you first build a normalized data model and add queries later. In DynamoDB the order is reversed. The AWS documentation states this plainly:

You should not begin designing your DynamoDB schema until you know the questions it needs to answer. Understanding the business problems and the application use cases up front is essential.

DynamoDB is an engine that queries "extremely efficiently in a limited number of ways, and expensively and slowly in others." So the first step of design is not to carve out tables but to enumerate the query patterns the system must satisfy.

The three properties to grasp at design time (from the AWS documentation):

| Property | Meaning |
| --- | --- |
| Data size | The amount of data stored/retrieved at once → determines how partitioning works |
| Data shape | Don't shape it at query time; store it in the shape it's queried |
| Data velocity | The query load at peak → partition the data to make the best use of I/O capacity |

"Keep tables few" is the principle

The general principle in the AWS documentation is "you should keep as few tables as possible in a DynamoDB application." The reason is clear: fewer tables scale more easily, simplify permission management, and lower overhead and backup cost. Keeping related data in one place (locality of reference) is described as a key factor in NoSQL response times.

In other words, instead of splitting user, order, and payment into separate tables, single-table design houses different kinds of items together in one table. Example:

| PK | SK | Role (entity) |
| --- | --- | --- |
| USER#u_123 | PROFILE | User profile |
| USER#u_123 | ORDER#o_900 | That user's order |
| ORDER#o_900 | PAYMENT#p_555 | The payment tied to that order |
| ORDER#o_900 | IDEM#charge#k_abc | Idempotency marker (with TTL) |

The set of items sharing the same PK (an item collection) is stored physically adjacent, so a single Query (specifying the PK) retrieves "a given user's profile + all orders" together. This achieves the equivalent of an RDB JOIN without joining.

When you should not use single-table (YAGNI)

Single-table design is not a cure-all. Cases where it's better avoided:

  • An app at the stage where access patterns aren't fixed / you make heavy use of exploratory, ad-hoc queries. DynamoDB is bad at anything other than predetermined patterns, and each new query needs a GSI addition or data redesign. While requirements are fluid, an RDB (like PostgreSQL) iterates overwhelmingly faster.
  • A small-scale app where the operational burden of multiple tables isn't a problem. It's not worth single-table's key-design cost.
  • When complex aggregation/analytical queries (GROUP BY, window functions, etc.) are central. The AWS documentation also lists high-volume time-series data and datasets with completely different access patterns as exceptions.

I place the payment "core" that needs consistency on DynamoDB and use other options for areas that need exploratory queries. Single-table design is not a goal in itself but a means of satisfying fixed access patterns at minimum cost and maximum consistency.


2. Key design: PK/SK, GSI overloading, sparse indexes

Composite primary key (PK + SK)

The PK (partition key) decides the data's physical placement; the SK (sort key) decides the ordering within the same partition. Because you can use begins_with or range conditions on the SK, queries like "narrow to orders only" with the ORDER# prefix are cheap.
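
To make the item-collection layout and the prefix query concrete, here is a minimal sketch of key builders and query parameters. The helper names (`userPk`, `orderSk`, `userCollectionQuery`, `userOrdersQuery`) and the table name are illustrative assumptions, not part of the original code. The returned objects only mirror the shape that `QueryCommand` from `@aws-sdk/lib-dynamodb` accepts; nothing is sent to AWS here.

```typescript
// Hypothetical key builders for the example layout (names are assumptions).
const userPk = (userId: string): string => `USER#${userId}`;
const orderSk = (orderId: string): string => `ORDER#${orderId}`;

// Subset of the QueryCommand input used in this sketch.
interface QueryInput {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>;
}

// "Profile + all orders of one user": one Query over one item collection.
function userCollectionQuery(table: string, userId: string): QueryInput {
  return {
    TableName: table,
    KeyConditionExpression: "#pk = :pk",
    ExpressionAttributeNames: { "#pk": "PK" },
    ExpressionAttributeValues: { ":pk": userPk(userId) },
  };
}

// Same collection, narrowed to orders only via begins_with on the sort key.
function userOrdersQuery(table: string, userId: string): QueryInput {
  return {
    TableName: table,
    KeyConditionExpression: "#pk = :pk AND begins_with(#sk, :prefix)",
    ExpressionAttributeNames: { "#pk": "PK", "#sk": "SK" },
    ExpressionAttributeValues: { ":pk": userPk(userId), ":prefix": "ORDER#" },
  };
}
```

Centralizing key construction in small helpers like these keeps the prefix conventions (`USER#`, `ORDER#`) in one place, which matters because a typo in a key silently creates a different item collection.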

Design principles from the AWS documentation:

  • Keep related data together — gather related items under the same PK.
  • Use sort order — create sort order in the key design to make range queries efficient.
  • Distribute queries — design keys so access doesn't concentrate on one partition (a hot partition), distributing load evenly.

GSI overloading

Because items in a DynamoDB table do not have to share the same attributes or meaning, a single GSI can serve multiple access patterns. This is GSI overloading. Assign generically named attributes like GSI1PK / GSI1SK as the GSI's keys, and store values whose meaning changes per item type.

Example (making order search multi-purpose with one GSI):

| Entity | GSI1PK | GSI1SK |
| --- | --- | --- |
| Order (for status search) | STATUS#paid | 2026-06-24T10:00:00Z |
| Payment (for per-customer search) | CUSTOMER#u_123 | PAYMENT#2026-06-24 |

According to the AWS documentation, the default quota is 20 GSIs per table, but with overloading you can "effectively index far more than 20 data fields." Every additional GSI adds write cost (each write is propagated to every affected GSI) and consumes quota, so serving multiple uses with generic keys is the standard approach.
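
The overloading table above can be sketched as one function that derives the generic GSI keys from the item type. The entity shapes and field names (`updatedAt`, `customerId`, `paidOn`) and the function name `gsi1KeysFor` are assumptions made for this illustration.

```typescript
// Illustrative entity shapes; the field names are assumptions for this sketch.
type Entity =
  | { kind: "order"; orderId: string; status: string; updatedAt: string }
  | { kind: "payment"; paymentId: string; customerId: string; paidOn: string };

interface Gsi1Keys {
  GSI1PK: string;
  GSI1SK: string;
}

// One generic key pair whose meaning depends on the item type.
function gsi1KeysFor(entity: Entity): Gsi1Keys {
  switch (entity.kind) {
    case "order":
      // Access pattern: orders in a given status, ordered by time.
      return { GSI1PK: `STATUS#${entity.status}`, GSI1SK: entity.updatedAt };
    case "payment":
      // Access pattern: payments of a given customer, ordered by date.
      return {
        GSI1PK: `CUSTOMER#${entity.customerId}`,
        GSI1SK: `PAYMENT#${entity.paidOn}`,
      };
  }
}
```

Because both item types write into the same `GSI1PK` / `GSI1SK` attributes, adding a third access pattern is a new `case`, not a new index.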

Sparse indexes

A GSI indexes "only items that have that GSI's key attributes." A sparse index exploits this. For example, to pull "only unprocessed payments," attach a GSI_UNPROCESSED_PK attribute only to unprocessed items, and delete that attribute on completion. Then only unprocessed ones remain in the GSI, and you can fetch the targets instantly without a full scan.

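import { UpdateCommand } from "@aws-sdk/lib-dynamodb";

// The item appears in the GSI only while it carries the "unprocessed" marker (sparse index).
// REMOVE-ing the attribute on completion drops the item from the GSI automatically.
// TABLE, orderId, and paymentId are assumed to be in scope.
const markProcessed = new UpdateCommand({
  TableName: TABLE,
  Key: { PK: `ORDER#${orderId}`, SK: `PAYMENT#${paymentId}` },
  UpdateExpression: "REMOVE GSI_UNPROCESSED_PK SET #st = :done",
  ExpressionAttributeNames: { "#st": "status" },
  ExpressionAttributeValues: { ":done": "processed" },
});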

3. Idempotency (exactly-once): "create exactly once" with attribute_not_exists

Why it becomes a problem

Mobile network timeouts, API Gateway retries, Lambda re-executions: in a distributed system, resends of the same request always happen. By default PutItem unconditionally overwrites an item with the same primary key, so in a naive implementation a second payment creation goes through.

The solution: conditional PutItem

Use attribute_not_exists() to guarantee "create only when it doesn't yet exist." For a composite key, as the AWS documentation notes, attribute_not_exists(PK) is sufficient: the condition is evaluated against the item identified by the full primary key (PK and SK) of the write, so it is true only when that item itself is absent.

import {
  ConditionalCheckFailedException,
  DynamoDBClient,
} from "@aws-sdk/client-dynamodb";
import {
  DynamoDBDocumentClient,
  PutCommand,
} from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}), {
  marshallOptions: { removeUndefinedValues: true },
});
const TABLE = process.env.TABLE_NAME!;

/**
 * Idempotent "create charge". A second or later attempt for the same item
 * raises ConditionalCheckFailedException, so the charge converges to exactly one.
 * The condition is evaluated against the primary key (orderId + paymentId), so
 * paymentId must be stable across resends (for example, derived from the
 * idempotency key); otherwise use a dedicated IDEM#<operation>#<key> item.
 */
export async function createChargeOnce(input: {
  orderId: string;
  paymentId: string;
  idempotencyKey: string;
  amount: number;
}): Promise<{ created: boolean }> {
  try {
    await ddb.send(
      new PutCommand({
        TableName: TABLE,
        Item: {
          PK: `ORDER#${input.orderId}`,
          SK: `PAYMENT#${input.paymentId}`,
          idempotencyKey: input.idempotencyKey,
          amount: input.amount,
          status: "created",
          createdAt: new Date().toISOString(),
          // TTL: do not keep the marker forever (limits operational cost and storage).
          // TTL deletes the whole item, so set it only on items that may expire.
          expiresAt: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 30,
        },
        // Create only when PK does not exist, i.e. this item is not there yet
        ConditionExpression: "attribute_not_exists(PK)",
      }),
    );
    return { created: true };
  } catch (err) {
    if (err instanceof ConditionalCheckFailedException) {
      // Already created. This is a resend, so treat it as success (exactly-once convergence)
      return { created: false };
    }
    throw err;
  }
}

The point is to not protect idempotency with an app lock or a "dedup-check SELECT." If a concurrent request slips in between the SELECT and the PUT, it breaks. A condition expression is evaluated as the same atomic operation as the write, so it doesn't break even under contention.

The pattern of first standing up a dedicated idempotency-key item (SK = IDEM#<operation>#<key>) as a separate record is also effective. If you save the operation result (response) on that item, on resend you can "just return the saved response," completely avoiding re-execution of side effects. It's auto-deleted after a fixed period via TTL.
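
The dedicated idempotency-key item with a saved response can be sketched as follows. This is an in-memory simulation under stated assumptions: `FakeTable` stands in for the table, `putIfAbsent` mimics a PutItem with `attribute_not_exists(PK)`, and `runOnce` is a hypothetical helper name. Real DynamoDB calls are asynchronous; the sketch is synchronous only to stay self-contained.

```typescript
type MarkerState =
  | { status: "in_progress" }
  | { status: "done"; response: string };

// In-memory stand-in for the table, keyed by "PK|SK".
class FakeTable {
  private items = new Map<string, MarkerState>();

  // Mimics PutItem with ConditionExpression "attribute_not_exists(PK)".
  putIfAbsent(key: string, value: MarkerState): boolean {
    if (this.items.has(key)) return false; // ConditionalCheckFailed
    this.items.set(key, value);
    return true;
  }

  get(key: string): MarkerState | undefined {
    return this.items.get(key);
  }

  set(key: string, value: MarkerState): void {
    this.items.set(key, value);
  }
}

type Outcome =
  | { kind: "executed"; response: string }
  | { kind: "replayed"; response: string }
  | { kind: "in_progress" };

function runOnce(
  table: FakeTable,
  orderId: string,
  operation: string,
  idempotencyKey: string,
  sideEffect: () => string,
): Outcome {
  const key = `ORDER#${orderId}|IDEM#${operation}#${idempotencyKey}`;
  if (table.putIfAbsent(key, { status: "in_progress" })) {
    const response = sideEffect(); // runs only for the request that claimed the key
    table.set(key, { status: "done", response });
    return { kind: "executed", response };
  }
  const existing = table.get(key);
  if (existing !== undefined && existing.status === "done") {
    return { kind: "replayed", response: existing.response }; // return the saved response
  }
  return { kind: "in_progress" }; // the first attempt has not finished yet
}
```

A real implementation also has to decide what happens when the first attempt crashes and leaves the marker in the `in_progress` state (for example, a timeout after which the claim may be retaken); that policy is outside this sketch.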


4. Atomicity: updates that don't break the balance

Read-modify-write breaks under contention

"Read the balance → subtract in the app → write back" causes a lost update under concurrent execution. The official Alice/Bob example is exactly this: with last-write-wins, one side's update vanishes. For monetary value like a balance or inventory, this is unacceptable.

The trap of atomic counters (ADD)

An atomic counter via UpdateItem's ADD (or SET x = x + :v) fuses read and write into one operation, but as the AWS documentation explicitly states, it is not idempotent.

With atomic counters, the update is not idempotent. The number increases or decreases every time you call UpdateItem.

That is, a retry adds or subtracts twice. The AWS documentation also advises that where over- or under-counting is unacceptable, as in a banking application, a conditional update is safer than an atomic counter. An atomic counter is fine for a view counter, but you must not use it for a balance. That is the boundary line.
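
The two failure modes described in this section can be reproduced with a few lines of in-memory code. This is a simulation, not DynamoDB code: `BalanceItem` stands in for the stored item, and the interleaving is written out sequentially to make the order of operations explicit.

```typescript
interface BalanceItem {
  balance: number;
}

// 1) Read-modify-write: both callers read before either one writes.
function lostUpdate(): number {
  const item: BalanceItem = { balance: 100 };
  const seenByAlice = item.balance; // 100
  const seenByBob = item.balance; // 100, stale as soon as Alice writes
  item.balance = seenByAlice - 30; // Alice writes 70
  item.balance = seenByBob - 50; // Bob writes 50; Alice's debit is lost
  return item.balance; // 50, although 20 was intended
}

// 2) Atomic counter: no lost update, but every call is applied.
function atomicAdd(item: BalanceItem, delta: number): void {
  item.balance += delta; // stands in for UpdateExpression "ADD balance :delta"
}

function retriedCounter(): number {
  const item: BalanceItem = { balance: 100 };
  atomicAdd(item, -30); // first attempt succeeds, but the response is lost
  atomicAdd(item, -30); // the client retries the same request
  return item.balance; // 40, although a single debit of 30 was intended
}
```

The first function loses a write; the second applies one twice. Neither outcome is acceptable for a balance, which is why the sections that follow pair a condition expression with an idempotency marker.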

The solution: conditional UpdateItem (a debit that doesn't go negative)

Have the DB atomically judge "debit if the balance is enough." Add balance >= :amount to the ConditionExpression, and fail the write itself if it's insufficient.

import {
  UpdateCommand,
} from "@aws-sdk/lib-dynamodb";
import { ConditionalCheckFailedException } from "@aws-sdk/client-dynamodb";

/**
 * Safe debit from a balance.
 * - Decrements balance atomically, and only when balance >= amount holds
 * - Never goes negative, even under contention (no read-then-write)
 */
export async function debitBalance(input: {
  userId: string;
  amount: number; // > 0
}): Promise<{ ok: true; newBalance: number } | { ok: false; reason: "insufficient" }> {
  try {
    const res = await ddb.send(
      new UpdateCommand({
        TableName: TABLE,
        Key: { PK: `USER#${input.userId}`, SK: "BALANCE" },
        UpdateExpression: "SET balance = balance - :amt",
        // The DB decides "debit if there is enough" atomically
        ConditionExpression: "balance >= :amt",
        ExpressionAttributeValues: { ":amt": input.amount },
        ReturnValues: "UPDATED_NEW",
      }),
    );
    return { ok: true, newBalance: res.Attributes!.balance as number };
  } catch (err) {
    if (err instanceof ConditionalCheckFailedException) {
      // Insufficient balance. No write happened at all
      return { ok: false, reason: "insufficient" };
    }
    throw err;
  }
}

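This update is not idempotent: a retry debits again. The AWS documentation notes that a conditional write can be idempotent when the condition checks the same attribute that is being updated, but that holds only when the first write makes the condition false (for example, an equality check against the old value). Here the condition balance >= :amt usually still holds after the first debit, so a retry passes the check and subtracts a second time. Therefore, the production pattern is to combine it with the idempotency key (Section 3), judge "is this a resend of the same operation" via the idempotency marker, and then apply this debit exactly once. In my payment platform, this two-stage setup of "idempotency marker + conditional atomic update" keeps double charges and balance inconsistencies at 0.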
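
One possible way to wire the two stages together is to commit the idempotency marker and the conditional debit in a single transaction (TransactWriteItems, covered in Section 5), so that either both are applied or neither is. This is a sketch under assumptions, not necessarily how the platform described here does it: `idempotentDebitItems`, `operationId`, and the `IDEM#debit#` prefix are hypothetical names, and the function only builds plain objects in the shape of `TransactItems`.

```typescript
// Subset of the TransactItems entry shape used in this sketch.
interface TransactItem {
  Put?: {
    TableName: string;
    Item: Record<string, string | number>;
    ConditionExpression: string;
  };
  Update?: {
    TableName: string;
    Key: Record<string, string>;
    UpdateExpression: string;
    ConditionExpression: string;
    ExpressionAttributeValues: Record<string, number>;
  };
}

function idempotentDebitItems(input: {
  table: string;
  userId: string;
  operationId: string; // stable across retries of the same logical operation
  amount: number;
  nowEpochSec: number;
}): TransactItem[] {
  return [
    {
      // Stage 1: claim the operation. Fails if this operationId was already applied.
      Put: {
        TableName: input.table,
        Item: {
          PK: `USER#${input.userId}`,
          SK: `IDEM#debit#${input.operationId}`,
          amount: input.amount,
          expiresAt: input.nowEpochSec + 60 * 60 * 24 * 30, // TTL attribute (assumed name)
        },
        ConditionExpression: "attribute_not_exists(PK)",
      },
    },
    {
      // Stage 2: debit only if the balance is sufficient.
      Update: {
        TableName: input.table,
        Key: { PK: `USER#${input.userId}`, SK: "BALANCE" },
        UpdateExpression: "SET balance = balance - :amt",
        ConditionExpression: "balance >= :amt",
        ExpressionAttributeValues: { ":amt": input.amount },
      },
    },
  ];
}
```

With this shape, a retry of the same `operationId` fails on the marker's condition and the balance is left untouched, while an insufficient balance cancels the whole transaction and leaves no marker behind.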


5. Transactions: multiple items, all-or-nothing

TransactWriteItems limits (from the AWS documentation)

| Item | Value |
| --- | --- |
| Max actions | 100 (up to 100 distinct items) |
| Scope | One or more tables in the same AWS account and same region |
| Total size limit | 4 MB |
| Single-item size limit | 400 KB |
| Idempotency token | ClientRequestToken. For 10 minutes after completion, a resend with the same token succeeds with no change |
| Usable actions | Put / Update / Delete / ConditionCheck |
| Indexes | A transaction cannot be run against an index |
| Same item | The same item cannot be targeted by multiple actions within one transaction |

Key points on idempotency from the AWS documentation:

  • A ClientRequestToken is valid for 10 minutes after request completion. Past 10 minutes, the same token is treated as a new request.
  • Resending within the 10-minute idempotency window with the same token but different other parameters raises an IdempotentParameterMismatch exception.
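
The two rules above can be written down as a small decision function. This models the documented behavior for illustration only; it is not SDK code, and `classifyResend`, `paramsFingerprint`, and the outcome labels are names invented for the sketch. How a resend at exactly the 10-minute boundary is treated is an assumption here.

```typescript
const IDEMPOTENCY_WINDOW_MS = 10 * 60 * 1000;

interface CompletedRequest {
  token: string;
  paramsFingerprint: string; // any stable serialization of the request parameters
  completedAtMs: number;
}

type ResendOutcome =
  | "replayed_no_change"
  | "IdempotentParameterMismatch"
  | "treated_as_new";

function classifyResend(
  previous: CompletedRequest,
  resend: { token: string; paramsFingerprint: string; sentAtMs: number },
): ResendOutcome {
  if (resend.token !== previous.token) return "treated_as_new";
  // Past the window, the same token no longer identifies the old request.
  if (resend.sentAtMs - previous.completedAtMs > IDEMPOTENCY_WINDOW_MS) {
    return "treated_as_new";
  }
  // Same token inside the window but different parameters is an error.
  if (resend.paramsFingerprint !== previous.paramsFingerprint) {
    return "IdempotentParameterMismatch";
  }
  return "replayed_no_change";
}
```

The "treated_as_new" branch is the one to design for: after the window closes, the token alone no longer protects you, so each action in the transaction still needs its own condition expression.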

A note on cost

Enabling transactions costs nothing extra, but according to the AWS documentation DynamoDB performs two underlying reads or writes for every item in a transaction (one to prepare, one to commit). For example, writing three 500-byte items in one transaction per second takes 2 WCU per item, so 2 WCU × 3 = 6 WCU in total. Both operations also show up in CloudWatch metrics.
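
The arithmetic above generalizes to a small estimator. It assumes the documented sizing rule that one write capacity unit covers a write of up to 1 KB, rounded up per item; the function names are made up for this sketch.

```typescript
const WRITE_UNIT_BYTES = 1024; // one WCU covers a write of up to 1 KB

// Standard write: 1 WCU per started KB of item size.
function standardWriteWcu(itemBytes: number): number {
  return Math.max(1, Math.ceil(itemBytes / WRITE_UNIT_BYTES));
}

// Transactional write: prepare + commit, i.e. twice the standard cost.
function transactionalWriteWcu(itemBytes: number): number {
  return 2 * standardWriteWcu(itemBytes);
}

// Total WCU for one transaction, given the size of each written item.
function transactionWcu(itemSizesBytes: number[]): number {
  return itemSizesBytes.reduce(
    (sum, bytes) => sum + transactionalWriteWcu(bytes),
    0,
  );
}
```

For the example in the text, `transactionWcu([500, 500, 500])` evaluates to 6.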

Real code: order settlement (debit the balance + update order status, atomically)

import { randomUUID } from "node:crypto";
import {
  TransactWriteCommand,
} from "@aws-sdk/lib-dynamodb";
import { TransactionCanceledException } from "@aws-sdk/client-dynamodb";

/**
 * Commits "debit the balance" and "mark the order as paid" all-or-nothing.
 * A state where only one of the two is applied can never occur.
 */
export async function settleOrder(input: {
  userId: string;
  orderId: string;
  amount: number;
  clientRequestToken?: string; // pass the same value on retry for 10 minutes of idempotency
}): Promise<void> {
  try {
    await ddb.send(
      new TransactWriteCommand({
        // Resends are safe when the caller can pin the token
        ClientRequestToken: input.clientRequestToken ?? randomUUID(),
        TransactItems: [
          {
            Update: {
              TableName: TABLE,
              Key: { PK: `USER#${input.userId}`, SK: "BALANCE" },
              UpdateExpression: "SET balance = balance - :amt",
              ConditionExpression: "balance >= :amt",
              ExpressionAttributeValues: { ":amt": input.amount },
            },
          },
          {
            Update: {
              TableName: TABLE,
              Key: { PK: `ORDER#${input.orderId}`, SK: "META" },
              UpdateExpression: "SET #st = :paid",
              // Prevent double settlement: only "created" can become "paid"
              ConditionExpression: "#st = :created",
              ExpressionAttributeNames: { "#st": "status" },
              ExpressionAttributeValues: { ":paid": "paid", ":created": "created" },
            },
          },
        ],
      }),
    );
  } catch (err) {
    if (err instanceof TransactionCanceledException) {
      // Insufficient balance or already paid. CancellationReasons[i] corresponds
      // to TransactItems[i] and tells which condition failed.
      const codes = (err.CancellationReasons ?? []).map((r) => r.Code).join(",");
      throw new Error(`settle canceled [${codes}]: ${err.message}`);
    }
    throw err;
  }
}
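
To separate the causes of a cancellation, the reasons can be mapped back to the actions by position. The sketch below takes the `CancellationReasons` array (reduced to the fields it reads) and classifies it for the two-action transaction above. `classifySettleFailure` and the result labels are hypothetical names; the codes `ConditionalCheckFailed` and `TransactionConflict` are values DynamoDB reports in that array.

```typescript
// Minimal shape of one entry in TransactionCanceledException.CancellationReasons.
interface Reason {
  Code?: string;
  Message?: string;
}

type SettleFailure =
  | "insufficient_balance"
  | "order_not_in_created_state"
  | "conflict_retryable"
  | "unknown";

function classifySettleFailure(reasons: Reason[] | undefined): SettleFailure {
  if (!reasons) return "unknown";
  // reasons[0] belongs to the balance update, reasons[1] to the order update.
  const [balance, order] = reasons;
  if (balance?.Code === "ConditionalCheckFailed") return "insufficient_balance";
  if (order?.Code === "ConditionalCheckFailed") return "order_not_in_created_state";
  // Another request touched the same item concurrently; retrying may succeed.
  if (reasons.some((r) => r.Code === "TransactionConflict")) {
    return "conflict_retryable";
  }
  return "unknown";
}
```

Note that the balance condition also fails when the balance item or attribute does not exist, so "insufficient_balance" here includes a missing balance record.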

When to use a transaction (vs a single conditional write)

The best-practice guidance in the AWS documentation is: don't group operations into a transaction unless you need to.

  • If it completes with a single item, use one conditional write. A transaction consumes double the capacity and is easily canceled under contention.
  • Use TransactWriteItems only when you need an indivisible update of multiple items (balance + order + ledger, etc.).
  • Including multiple highly-contended items in the same transaction increases TransactionCanceledException, so gather frequently co-updated attributes into one item to narrow the transaction's scope.

6. Consistency: strong vs eventual

How the AWS documentation lays it out:

  • Eventually consistent reads are the default. The most recent write may not be reflected.
  • Strongly consistent reads can be requested with ConsistentRead: true on GetItem / Query / Scan, returning the latest value reflecting all prior successful writes.
  • GSIs and Streams are always eventually consistent. A strongly consistent read against a GSI/Stream is not supported.
  • Strongly consistent reads are supported only on tables and LSIs.
  • An eventually consistent read costs half of a strongly consistent read.

The practical pitfall is "reading from a GSI right after a write and expecting the latest value." Propagation to a GSI is asynchronous, so if you need the latest right after, use a strongly consistent read against the base table. Right after a transaction completes too, an eventually consistent read can temporarily return a stale value, so set ConsistentRead: true where a confirmed value is required.

import { GetCommand } from "@aws-sdk/lib-dynamodb";

// Show the confirmed, latest balance right after a debit → strongly consistent read
// (userId is assumed to be in scope)
const res = await ddb.send(
  new GetCommand({
    TableName: TABLE,
    Key: { PK: `USER#${userId}`, SK: "BALANCE" },
    ConsistentRead: true, // supported on the base table (and LSIs), not on GSIs
  }),
);

Consistency is a trade-off with cost. For reads where some delay is acceptable, like a list view, eventual consistency (half the price) is enough. Reserve strong consistency for places where a stale value causes an incident, such as the confirmation shown right after money has moved. Drawing the line there gives the best cost-effectiveness.
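
The price difference can be made concrete with a small estimator for a single-item read. It assumes the documented sizing rule that one read capacity unit covers a strongly consistent read of up to 4 KB, with eventually consistent reads at half and transactional reads at double that cost; `readCapacityUnits` is a name chosen for this sketch.

```typescript
const READ_UNIT_BYTES = 4 * 1024; // one RCU covers a strongly consistent read of up to 4 KB

type ReadMode = "eventual" | "strong" | "transactional";

// Capacity consumed by reading one item of the given size.
function readCapacityUnits(itemBytes: number, mode: ReadMode): number {
  const blocks = Math.max(1, Math.ceil(itemBytes / READ_UNIT_BYTES));
  const perBlock = mode === "eventual" ? 0.5 : mode === "strong" ? 1 : 2;
  return blocks * perBlock;
}
```

For an 8 KB item this gives 1 RCU for an eventually consistent read and 2 RCU for a strongly consistent one.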


7. Operations: cost, observability, DR, zero-downtime migration

Capacity mode (on-demand vs provisioned)

| Viewpoint | On-demand | Provisioned |
| --- | --- | --- |
| Billing | Pay per request for the read/write request units actually consumed | Charged for provisioned RCU/WCU (Auto Scaling possible) |
| Suited for | Hard-to-predict / spiky traffic / early launch | Stable, predictable load; production you want cost-optimized |
| Hot partition | Concentrating on a specific key degrades performance and causes throttling | Same as on-demand |

If unsure, start on on-demand and gather metrics, then move toward provisioned + Auto Scaling once a steady load is visible. This order is cost-efficient. In either mode, key design that avoids hot partitions (the "Distribute queries" principle from the AWS documentation) is a prerequisite.

Observability (CloudWatch)

Metrics to watch at minimum: throttling (ReadThrottleEvents / WriteThrottleEvents), TransactionConflict (the number of transaction conflicts), and consumed capacity. In my payment platform, I set CloudWatch alarms on these and notify Slack on threshold breaches. It's important to monitor while distinguishing conditional-write failures (a normal case, like insufficient balance) from throttling (an abnormal case of insufficient capacity).
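
The distinction between normal and abnormal failures can also be made in application code before anything is logged or counted. The sketch below classifies an error by its `name`; the exception names are ones DynamoDB returns, while `classifyDynamoError` and the signal labels are assumptions for illustration.

```typescript
type Signal =
  | "expected_business_outcome"
  | "capacity_problem"
  | "transaction_canceled"
  | "other";

function classifyDynamoError(err: unknown): Signal {
  const name =
    typeof err === "object" && err !== null && "name" in err
      ? String((err as { name: unknown }).name)
      : "";
  switch (name) {
    case "ConditionalCheckFailedException":
      return "expected_business_outcome"; // e.g. insufficient balance, duplicate request
    case "ProvisionedThroughputExceededException":
    case "ThrottlingException":
    case "RequestLimitExceeded":
      return "capacity_problem"; // should feed an alarm
    case "TransactionCanceledException":
      return "transaction_canceled"; // inspect CancellationReasons for the cause
    default:
      return "other";
  }
}
```

Emitting a separate log field or custom metric per signal keeps routine condition failures from drowning out, or being mistaken for, capacity problems.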

Backup, PITR, DR

  • Make recovery to any point in the last 35 days possible with PITR (point-in-time recovery).
  • Hold tamper-proof backups against accidental deletion and ransomware with AWS Backup + Vault Lock.
  • A caveat from the AWS documentation: transaction changes propagate gradually to GSIs, Streams, and backups, so a backup/export taken during propagation may include only part of a recent transaction. Assume this asynchrony in DR design.

On my platform as well, DR is built on AWS Backup, Vault Lock, and PITR, and every change passes type and test verification gates in CI (GitHub Actions, mypy strict) before it reaches production.

Zero-downtime migration of a single table

The standard play to evolve the schema (key design) without stopping production is the mirror write (dual write).

  1. Write simultaneously to both the old format and the new format (mirror-write). Reads stay on the old format.
  2. Backfill to convert existing data to the new format.
  3. Switch reads to the new format. Verify consistency.
  4. Remove writes to the old format and the old data.

Each step is rollback-able at any time and can proceed with zero downtime. With this mirror-write approach, I updated the payment platform's data model without stopping production for even one second.
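
The four steps can be sketched with two in-memory maps standing in for the old and new key formats. Everything here is illustrative: the `Phase` names, the key formats, and the helper functions are assumptions, and a real backfill would run as conditional writes against the table instead of a map.

```typescript
type Phase = "dual_write" | "read_new" | "new_only";

interface Order {
  orderId: string;
  userId: string;
  total: number;
}

const oldStore = new Map<string, Order>(); // items in the old key format
const newStore = new Map<string, Order>(); // items in the new key format

// Hypothetical key formats for the two schemas.
const oldKey = (o: { orderId: string }): string => `ORDER#${o.orderId}`;
const newKey = (o: { orderId: string; userId: string }): string =>
  `USER#${o.userId}|ORDER#${o.orderId}`;

// Steps 1 to 3 write both formats; step 4 ("new_only") drops the old write.
function writeOrder(phase: Phase, order: Order): void {
  newStore.set(newKey(order), order);
  if (phase !== "new_only") oldStore.set(oldKey(order), order);
}

// Reads stay on the old format until the switch in step 3.
function readOrder(
  phase: Phase,
  ref: { orderId: string; userId: string },
): Order | undefined {
  return phase === "dual_write"
    ? oldStore.get(oldKey(ref))
    : newStore.get(newKey(ref));
}

// Step 2: copy items that predate dual writing. Existing new-format items are
// skipped so the backfill never overwrites fresher dual-written data.
function backfill(): number {
  let copied = 0;
  oldStore.forEach((order) => {
    const key = newKey(order);
    if (!newStore.has(key)) {
      newStore.set(key, order);
      copied += 1;
    }
  });
  return copied;
}
```

The "skip if present" check in `backfill` plays the role that `attribute_not_exists` would play in a real backfill, and it also makes the backfill safe to rerun.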


FAQ

Q1. Is single-table design always correct? No. In the early stage where access patterns aren't fixed, or in areas centered on exploratory ad-hoc queries or complex aggregation, an RDB (PostgreSQL, etc.) wins on development speed and flexibility. The AWS documentation also lists time-series data and datasets with completely different access patterns as exceptions. Choose it as a means to satisfy fixed patterns at minimum cost and maximum consistency.

Q2. Can idempotency be achieved with an atomic counter (ADD)? No. As the AWS documentation explicitly states, an atomic counter is not idempotent and is applied twice on retry. Achieve idempotency with conditional creation via attribute_not_exists, or with an idempotency-key item + a saved response.

Q3. Is SET balance = balance - :amt alone enough for a balance update? No. Always add ConditionExpression: "balance >= :amt" so that the DB rejects an insufficient balance before the write. Without it, concurrent execution can drive the balance negative. In addition, combine it with an idempotency marker to prevent double debits on retry.

Q4. What are the limits of TransactWriteItems? Up to 100 actions (up to 100 items), 4 MB total, up to 400 KB per single item. Idempotency via ClientRequestToken is valid for 10 minutes after completion, and changing other parameters with the same token causes IdempotentParameterMismatch.

Q5. I read a stale value from a GSI right after a write. Is it a bug? It's by spec. GSIs (and Streams) are always eventually consistent, and propagation is asynchronous. If you need the latest right after, use a strongly consistent read (ConsistentRead: true) against the base table. Strongly consistent reads against a GSI are not supported.

Q6. On-demand or provisioned — which should I choose? On-demand for stages with unpredictable/spiky load or early launch; provisioned + Auto Scaling is cost-efficient for stable, predictable production load. Both presume key design that avoids hot partitions, and I recommend the order of measuring on on-demand first, then migrating.


Closing: guarantee correctness through structure

DynamoDB's reliability comes from correctly layering three primitives (attribute_not_exists, conditional atomic updates, and TransactWriteItems) on top of access-pattern-driven key design. Correctness protected by reviews or runbooks eventually breaks; correctness protected by condition expressions and transactions does not.

In a one-person × generative-AI (Claude Code) setup, I have thoroughly practiced an approach that passes design judgments through human verification gates, implementing and operating a serverless payment platform with 0 double charges in production. I can accompany you from design review through implementation on DynamoDB-centered serverless reliability design (idempotency, atomic balance updates, zero-downtime migration, DR, observability).

If you're struggling with serverless / DynamoDB reliability design, consult us via Contact. Let's start by organizing your current access patterns and risk areas together.

友田 陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
