友田 陽大

DynamoDB Global Tables × Multi-Region × Disaster Recovery (DR) Complete Guide (2026 Edition): MREC/MRSC Consistency, Conflict Resolution, RTO/RPO Design, PITR, Cost

This guide explains multi-active, multi-region replication with DynamoDB Global Tables, staying faithful to the official AWS documentation. It organizes DR design into one picture, with real Terraform and TypeScript code: the difference between multi-Region eventual consistency (MREC) and multi-Region strong consistency (MRSC) and how to choose between them, last-writer-wins conflict resolution, RTO/RPO and failover, PITR (35 days) and backups, and the cost of replicated write units.

Reading time
25 min read
Author
友田 陽大

"I want to make the DB redundant," "I don't want it to stop even on a region outage" — for these requirements, assembling a replication queue on the application side, or hand-implementing dual writes, is almost always wrong.

The core idea of DynamoDB Global Tables is that you buy availability and global performance through replication design rather than application effort. AWS provides fully managed multi-region replication, asynchronous or synchronous, so the app simply reads and writes against the local endpoint in its own region. There is one premise the designer must understand, though: the consistency model. Adopt Global Tables with a wrong picture of it and you take on serious, hard-to-reproduce incidents such as "updates silently vanish" or "a read I believed was strongly consistent returned a stale value."

This article covers only DR (disaster recovery) design with Global Tables and multi-region deployments. It is based on my experience designing and leading the reliability layer of a serverless (Lambda + DynamoDB) multi-tenant payment platform that has kept double charges at zero in production. Data modeling and idempotency design are covered in the Single-Table Design & Production Reliability Patterns Guide, and the basics of capacity and cost in the Capacity, Cost, and Performance Design Guide. This article complements those by narrowing in on how to run across multiple regions without stopping, and at what cost.

All specs and limits are cross-checked against the AWS official documentation (as of June 2026). Since pricing changes by region and time, always check amounts on the official pricing page.


1. How Global Tables works: multi-active automatic replication

The official definition reads as follows.

Amazon DynamoDB global tables is a fully managed, multi-Region, and multi-active database feature that provides easy to use data replication and fast local read and write performance for globally scaled applications.

Breaking that definition down gives the following properties.

  • A global table is a collection of replica tables spanning two or more regions. Each region can hold at most one replica. All replicas share the same table name, the same primary-key schema, and the same item data.
  • Multi-active (active-active). Every replica accepts both reads and writes. This is not "one primary plus read replicas"; all regions are peers.
  • Automatic replication. When you write to a replica in one region, DynamoDB automatically replicates the write to all other replicas. No app-side replication code is needed.
  • No app changes. Global Tables uses the existing DynamoDB API as-is. DynamoDB has no global endpoint, and every request is addressed to a regional endpoint. The app talks to the local endpoint in its own region.

This last point is extremely important for design. The official best practice is clear.

Calls to DynamoDB should not go across Regions.

In other words, the Tokyo app hits only the Tokyo DynamoDB replica. If a problem arises in a region, switch the end-user traffic, along with the app stack, to another region. Global Tables is the foundation that guarantees the state of "apps in all regions can access the same data," and the "decision and routing" of failover is the responsibility of the app/infra side (detailed in Chapter 4).

Availability: single-region 99.99% → multi-region 99.999%

The official figures are as follows.

Configuration | Design availability SLA
Single-region table | 99.99%
Multi-region (Global Tables) | 99.999%

Even in a single region, DynamoDB auto-replicates data to 3 availability zones (AZs) within 1 region and withstands AZ failures by default. What Global Tables buys on top is "resilience to region failure."
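To make the gap between the two availability figures concrete, the following arithmetic sketch converts an availability percentage into the downtime it permits per year. The function name is illustrative, and a 365-day year is assumed.

```typescript
// Converts an availability percentage into the downtime it permits per year.
// Assumption: a 365-day year (525,600 minutes).
const MINUTES_PER_YEAR = 365 * 24 * 60;

function allowedDowntimeMinutesPerYear(availabilityPercent: number): number {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError("availabilityPercent must be between 0 and 100");
  }
  return (MINUTES_PER_YEAR * (100 - availabilityPercent)) / 100;
}

// Roughly 52.6 minutes per year for a single-region table.
const singleRegionDowntime = allowedDowntimeMinutesPerYear(99.99);
// Roughly 5.3 minutes per year for a global table.
const multiRegionDowntime = allowedDowntimeMinutesPerYear(99.999);
```

Going from four nines to five nines shrinks the yearly downtime budget by a factor of ten, which is the part of the budget that region-level failures would otherwise consume.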


2. The consistency model: MREC vs MRSC (this is the heart of design)

If you use Global Tables, there's just one heavy decision to make first. The consistency mode.

Global tables support two consistency modes: multi-Region eventual consistency (MREC) and multi-Region strong consistency (MRSC).

First, the key constraints.

  • If unspecified, the default is MREC (eventual consistency).
  • Within one global table, replicas of different consistency modes cannot be mixed.
  • The mode cannot be changed after creation. You can't go "actually, strong consistency" later. The first design decision is fixed.

2.1 MREC (multi-Region eventual consistency) — default, asynchronous, last-writer-wins

MREC asynchronously replicates an item's changes to all other replicas. The official documentation says changes propagate "typically within a second or less." Because the write completes in the local region and returns immediately, write latency stays low.

The problem is conflicts. When the same item is updated nearly simultaneously in multiple regions, what happens?

If the same item is modified in multiple Regions simultaneously, DynamoDB will resolve the conflict by using the modification with the latest internal timestamp on a per-item basis, referred to as a "last writer wins" conflict resolution method.

Conflicts are resolved by last-writer-wins. Each item carries an internal timestamp, the write with the larger timestamp wins, and the losing update is silently discarded. The official wording itself reads as an incident warning.

The first operation "loses" to the second operation. These conflicts aren't recorded in CloudWatch or AWS CloudTrail.

That is, update loss from a conflict is recorded in neither CloudWatch nor CloudTrail. You can't notice it. This is MREC's most serious pitfall, and I describe the countermeasure in Chapter 7.
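The silent loss is easier to see in a toy model. The sketch below is not DynamoDB's implementation: the `ts` field is a made-up number standing in for the private internal timestamp, and all names are illustrative. It only shows why both replicas converge on one value and the other write disappears without any error.

```typescript
// A toy model of MREC last-writer-wins. Real DynamoDB uses a private internal
// timestamp; here `ts` is a made-up number standing in for it.
interface VersionedValue {
  value: string;
  ts: number; // stand-in for the internal per-item timestamp
}

// Merge rule applied when a replicated write arrives: the larger timestamp wins.
// Ties are not modeled in this sketch.
function resolveLww(local: VersionedValue, incoming: VersionedValue): VersionedValue {
  return incoming.ts > local.ts ? incoming : local;
}

// Both regions accept a write to the same item before replication catches up.
const tokyoWrite: VersionedValue = { value: "address=Tokyo", ts: 1000 };
const oregonWrite: VersionedValue = { value: "address=Portland", ts: 1001 };

// After replication, each replica merges the other region's write.
const tokyoAfter = resolveLww(tokyoWrite, oregonWrite);
const oregonAfter = resolveLww(oregonWrite, tokyoWrite);
// Both replicas now hold the Oregon write. The Tokyo write returned success
// to its caller, yet it is gone and nothing reported an error.
```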

MREC's strongly consistent reads have a caveat. Even if you add ConsistentRead: true,

Strongly consistent read operations return the latest version of an item if that item was last updated in the Region where the read occurred, but may return stale data if the item was last updated in a different Region.

"It returns the latest if it was last updated in the same region, but may return a stale value for one updated in a different region." MREC's "strong consistency" is, after all, a matter within the local region and is not globally strongly consistent.

2.2 MRSC (multi-Region strong consistency) — introduced June 2025, synchronous, RPO zero

MRSC is a relatively new mode (the official design guide explicitly states "introduced in June 2025").

Item changes in an MRSC global table replica are synchronously replicated to at least one other Region before the write operation returns a successful response.

A write is synchronously replicated to at least one other region before returning success. So a strongly consistent read on any replica always returns the latest, and a conditional write is always evaluated against the latest version. The cost is increased latency for writes and strongly consistent reads (because cross-region communication is involved).

MRSC has many inherent constraints, and choosing it without knowing them leads to a dead end.

  • Exactly a 3-region configuration. Either "3 replicas" or "2 replicas + 1 witness." Adding/removing replicas is not possible (you can't add a replica later to a 2-replica + witness setup).
  • A witness holds the latest data but cannot be read from or written to. It's an alternative to a full replica and supports the availability quorum. A witness incurs no storage or write cost.
  • Limited within a Region set. US (N. Virginia/Ohio/Oregon), EU (Ireland/London/Paris/Frankfurt), AP (Tokyo/Seoul/Osaka). You can't span sets (e.g., you can't create an MRSC mixing US and EU).
  • Can only be created from an empty table. You cannot convert a single-region table with existing data to MRSC.
  • No TTL support, no LSI support, no transaction (TransactWriteItems/TransactGetItems) support.
  • Conflicts are not last-writer-wins; simultaneous updates fail with ReplicatedWriteConflictException (retryable). Instead of implicit update loss via LWW, the app explicitly handles conflicts.
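Because MRSC surfaces conflicts as a retryable exception, the application needs a retry path. The following is a minimal sketch of one way to wrap a write, assuming the error object exposes the exception name in its `name` property. The helper names, attempt count, and delays are illustrative assumptions, not values from AWS.

```typescript
// Retry wrapper for MRSC write conflicts. The exception name comes from the
// text above; attempt counts and delays are illustrative assumptions.
interface ConflictRetryOptions {
  maxAttempts?: number;
  baseDelayMs?: number;
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

function isReplicatedWriteConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "ReplicatedWriteConflictException"
  );
}

async function withConflictRetry<T>(
  operation: () => Promise<T>,
  options: ConflictRetryOptions = {},
): Promise<T> {
  const maxAttempts = options.maxAttempts ?? 5;
  const baseDelayMs = options.baseDelayMs ?? 50;
  const sleep =
    options.sleep ??
    ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Only the conflict exception is retried; everything else propagates.
      if (!isReplicatedWriteConflict(err) || attempt >= maxAttempts) throw err;
      // Exponential backoff with full jitter to avoid synchronized retries.
      const cap = baseDelayMs * Math.pow(2, attempt - 1);
      await sleep(Math.random() * cap);
    }
  }
}
```

If the write depends on the item's current state, the `operation` callback should re-read the item on each attempt so that the retry is computed against the latest version rather than replaying a stale value.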

2.3 Comparison table and how to choose

Viewpoint | MREC (default) | MRSC (June 2025~)
Replication | Asynchronous (typically within 1 second) | Synchronous (replicated to 1+ other region before success)
Global strongly consistent read | Not possible (another-region update may be a stale value) | Always returns the latest
Conflict resolution | last-writer-wins (the loss silently vanishes) | ReplicatedWriteConflictException (explicit failure, retry)
Write latency | Low | High (the cross-region round-trip)
RPO | The replication lag (typically a few seconds, >0) | Zero
Region configuration | Any number, any region (within a partition) | Exactly 3, within a Region set, no add/remove
TTL / LSI / transactions | TTL OK, transactions atomic within a region only | All unsupported
Streams | Enabled by default (used for replication) | Disabled by default (not used for replication, optionally enabled)
Conversion from an existing table | Add a replica to an existing table, OK | Only from an empty table

The official selection criterion is clear.

The key criteria for choosing a multi-Region consistency mode is whether your application prioritizes lower latency writes and strongly consistent reads, or prioritizes global strong consistency.

  • Use MREC when: you can tolerate another-region update temporarily appearing stale in a strongly consistent read / you want to prioritize write and strongly-consistent-read latency / you can tolerate RPO > 0.
  • Use MRSC when: you need global strongly consistent reads spanning multiple regions / you prioritize global consistency over latency / RPO zero is mandatory.
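The criteria above, together with the MRSC constraints from section 2.2, can be written down as a small checklist function. This is only a sketch for design reviews: the type and function names are hypothetical, and it encodes nothing beyond what this article states.

```typescript
// Encodes the selection criteria and MRSC constraints listed in this article.
// All type and function names are illustrative, not part of any AWS API.
interface ReplicationRequirements {
  needsGlobalStrongReads: boolean; // strongly consistent reads across regions
  requiresZeroRpo: boolean;        // no data loss tolerated on region failure
  usesTtl: boolean;
  usesLsi: boolean;
  usesTransactions: boolean;
  tableIsEmpty: boolean;           // MRSC can only be created from an empty table
}

interface ModeRecommendation {
  mode: "MREC" | "MRSC";
  blockers: string[]; // MRSC constraints the workload would violate
}

function recommendConsistencyMode(req: ReplicationRequirements): ModeRecommendation {
  const wantsMrsc = req.needsGlobalStrongReads || req.requiresZeroRpo;
  if (!wantsMrsc) return { mode: "MREC", blockers: [] };

  const blockers: string[] = [];
  if (req.usesTtl) blockers.push("TTL is not supported");
  if (req.usesLsi) blockers.push("LSIs are not supported");
  if (req.usesTransactions) blockers.push("transactions are not supported");
  if (!req.tableIsEmpty) blockers.push("must be created from an empty table");
  return { mode: "MRSC", blockers };
}
```

A non-empty `blockers` list means the requirement points at MRSC but the workload as designed cannot use it, so either the design or the requirement has to change before provisioning.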

My frank opinion: for many workloads, MREC alone is enough to start with. MRSC's constraints (exactly 3 regions, limited to a Region set, no TTL/LSI/transactions, creation from an empty table only) are heavy, and unless your requirements truly call for what MRSC offers, it isn't worth the adoption cost. MRSC pays off only in the narrow case of a single item, such as a balance or an inventory count, that can be updated simultaneously from multiple regions and must be updated with strong consistency. Otherwise, MREC plus app-side region affinity (described later) is cheaper, faster, and more flexible.

And if "being able to read and write all data with strong consistency in one region is sufficient" — you don't need Global Tables in the first place, and you should always weigh the option that a single region + strongly consistent reads is the simplest and cheapest.


3. Use cases: low latency, geo-redundancy, DR

The official documentation lists three main benefits of Global Tables (MREC).

Use case | Content
Low-latency reads | Place copies of data near end users to lower the network latency of reads
Low-latency writes | Write to a nearby region to lower write latency (but route carefully so as not to conflict)
Seamless region migration | Add a new region → remove the old region, to relocate the data layer with zero downtime

And the value common to both MREC and MRSC is the subject of this article.

Increased resiliency and disaster recovery. If a Region has degraded performance or a full outage, you can evacuate it.

If a region has degraded or fully failed, you can "evacuate" that region. With MREC, RPO/RTO is on the order of seconds; with MRSC, RPO is zero. This is the essence of DR with Global Tables. Unlike "restore from backup" DR, fresh data is in all regions from the start, so recovery is not a "restore" but a "re-pointing of traffic."

Note: low-latency writes are a double-edged sword. Writing to the same item from multiple regions causes LWW update loss in MREC and a conflict exception in MRSC. Write-routing design (which region writes which item) is therefore inseparable from DR design, and Chapter 7 covers it in detail.


4. DR design: RTO/RPO and the use of the 3 recovery means

DR design is the work of deciding RTO (recovery time objective) and RPO (recovery point objective) first as requirements, and choosing the minimum-cost means that satisfies them. DynamoDB has 3 defenses at different tiers.

Defense tier | Failure scope | Mechanism | RPO | RTO
Multi-AZ (default, free) | AZ failure | Auto-replication to 3 AZs within 1 region. No config needed | 0 | Transparent (auto-failover)
Global Tables (MREC) | Region failure | Multi-active replication. Evacuate traffic to another region | A few seconds (replication lag) | Seconds~minutes (depends on routing switch)
Global Tables (MRSC) | Region failure | Synchronous replication, strong consistency | 0 | Seconds~minutes (same)
PITR / backup | Logical corruption, accidental deletion, ransomware | Point-in-time restore, snapshot. Restore to a new table | Rewinds to the restore point | Restore time (minutes~hours)

Here, let me clearly separate the 2 axes designers tend to confuse.

  • What Global Tables (active-active) protects against is infrastructure failure. Even if a region or AZ goes down, you switch to another live replica and continue business. Data is replicated continuously: asynchronously in MREC, synchronously in MRSC.
  • What PITR/backup protects is "logical failure." When the data itself is broken/deleted by a bug, an erroneous operation, or ransomware, rewind to a healthy point in time. Global Tables can't protect this — because it faithfully replicates a broken write and a deletion to all replicas too.

These two are not alternatives but both needed. Global Tables for region failure, PITR for data destruction. A DR design that has only one of them is defenseless against the other side's risk.

Failover and region routing

Global Tables guarantees "the availability of data," but it doesn't make the decision of "which region to connect to." As seen in Chapter 1, DynamoDB has no global endpoint, and the app hits its own region's local endpoint. So what decides DR's RTO is, in effect, the app/infra-side routing-switch speed.

A typical configuration puts Route 53 (health checks + failover routing) or AWS Global Accelerator in front of multi-region app stacks and, when a region failure is detected, steers traffic to a healthy region (the official recommended resources also list Route 53, Global Accelerator, and Application Recovery Controller). On the app side, create the client in a region-aware way so that it uses only the DynamoDB endpoint of the region it is currently running in.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient } from "@aws-sdk/lib-dynamodb";

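/**
 * Region-aware client.
 * Principle: the app calls only the local endpoint of the region it runs in
 *            (official guidance: calls to DynamoDB should not go across Regions).
 * Failover is achieved by running Lambda/ECS in another region;
 * the client never retries across regions (this avoids cross-region calls).
 */
const HOME_REGION = process.env.AWS_REGION ?? "ap-northeast-1";

// The SDK resolves the region from AWS_REGION anyway; passing it explicitly makes the intent clear
export const ddb = DynamoDBDocumentClient.from(
  new DynamoDBClient({ region: HOME_REGION }),
  { marshallOptions: { removeUndefinedValues: true } },
);

/**
 * Write region affinity (conflict avoidance in MREC):
 * the standard approach is to have an upstream routing layer (Route 53, etc.)
 * ensure that writes for a given entity always happen in the same "home Region".
 * This structurally prevents simultaneous multi-region updates to the same item,
 * and therefore LWW update loss.
 */
export function homeRegionFor(/* tenantId: string */): string {
  // Example: derive the home region from a hash of the tenant ID and pin writes to that one region.
  // Reads can come from any local region (low latency).
  return HOME_REGION;
}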

The point is to not implement failover with the SDK's retry. A cross-region call is a latency factor the official explicitly says to avoid, and a client that goes to hit another region's endpoint on a region failure worsens latency and cost in healthy times too. "Run the whole app stack in another region too, and switch at the entrance's routing" is the correct layering.
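The tenant-hash idea mentioned in the stub's comment can be sketched as a deterministic mapping from tenant ID to home region. The region list, the choice of FNV-1a as the hash, and the function names below are illustrative assumptions; any stable hash or an explicit lookup table would serve the same purpose.

```typescript
// Deterministically maps a tenant to one write ("home") region.
// The region list and the FNV-1a hash are illustrative choices, not from AWS.
const WRITE_REGIONS = ["ap-northeast-1", "ap-northeast-3", "us-west-2"] as const;
type WriteRegion = (typeof WRITE_REGIONS)[number];

// 32-bit FNV-1a computed over the string's UTF-16 code units.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// The same tenant always resolves to the same region, so its writes never
// originate from two regions at once while the region list is unchanged.
function homeRegionForTenant(tenantId: string): WriteRegion {
  return WRITE_REGIONS[fnv1a32(tenantId) % WRITE_REGIONS.length];
}
```

One design caveat: modulo hashing remaps many tenants whenever the region list changes, so a stored tenant-to-region table may be preferable when regions are added or evacuated. During a failover the routing layer still has to move an affected tenant's writes to one surviving region as a whole, not spread them across several.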


5. The use of PITR vs on-demand backup

The defenses against logical failure are PITR and on-demand backup. The two have different roles.

PITR (point-in-time recovery)

Point-in-time recovery (PITR) backups are fully managed by DynamoDB and provide up to 35 days of recovery points at a per second granularity.

  • Holds continuous backups of up to 35 days at a per-second granularity. You can shorten RecoveryPeriodInDays to any value between 1 and 35 days, and shortening it doesn't change the PITR price (the price is based on the table + LSI size).
  • After enabling, you can restore to any point in time from "5 minutes before now" to the configured retention period.
  • Restore is always to a new table. It doesn't overwrite the original table.
  • No impact on performance or API latency.
  • With Global Tables, you can enable PITR individually per replica. A restore becomes a standalone table independent of the global table (with the current version 2019.11.21, you can recreate a new global table from the restored table).
  • Cross-region restore is also possible (you can create the restore-destination table in another region).
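The restorable window described in the list above can be computed with simple date arithmetic. The sketch below is a planning aid with hypothetical names; the authoritative values for a real table are the EarliestRestorableDateTime and LatestRestorableDateTime reported by DescribeContinuousBackups.

```typescript
// Estimates the PITR restorable window from the rules stated above:
// latest = now minus 5 minutes, earliest = the later of (now minus retention)
// and the moment PITR was enabled. Function and type names are illustrative.
interface PitrWindow {
  earliest: Date;
  latest: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;
const FIVE_MINUTES_MS = 5 * 60 * 1000;

function restorableWindow(now: Date, pitrEnabledAt: Date, recoveryPeriodInDays = 35): PitrWindow {
  if (!Number.isInteger(recoveryPeriodInDays) || recoveryPeriodInDays < 1 || recoveryPeriodInDays > 35) {
    throw new RangeError("recoveryPeriodInDays must be an integer between 1 and 35");
  }
  const retentionStartMs = now.getTime() - recoveryPeriodInDays * DAY_MS;
  const earliestMs = Math.max(retentionStartMs, pitrEnabledAt.getTime());
  return {
    earliest: new Date(earliestMs),
    latest: new Date(now.getTime() - FIVE_MINUTES_MS),
  };
}

// True when the target time falls inside the window (false if the window is empty).
function isRestorable(window: PitrWindow, target: Date): boolean {
  const t = target.getTime();
  return t >= window.earliest.getTime() && t <= window.latest.getTime();
}
```

A check like this is useful in a runbook: before starting a restore, confirm that the moment just before the bad write still falls inside the window for the retention you configured.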

On-demand backup

You can use the DynamoDB on-demand backup capability to create full backups of your tables for long-term retention, and archiving for regulatory compliance needs.

  • Take a full backup at any timing. For long-term retention and regulatory-compliance archiving.
  • No impact on production-table performance or availability. Supports from several MB to several hundred TB.
  • All automatically encrypted and cataloged. Retained until explicitly deleted.
  • Can be integrated with AWS Backup, and via AWS Backup you can auto-replicate backups across regions to further increase resilience.

Use-case table

Viewpoint | PITR | On-demand backup
Type | Continuous backup (point-in-time restore) | Snapshot
Retention | Up to 35 days (configurable 1–35) | Indefinite until deleted
Granularity | Any point in time, per second | Only the moment taken
Main use | Rewinding from a recent erroneous operation/corruption | Long-term retention, regulatory archive, fixing a milestone
Restore destination | A new table (cross-region OK) | A new table

The official Global Tables best practice shows a realistic guideline.

Enabling automated backups and Point-in-Time Recovery (PITR) for one replica in a global table may be sufficient to meet your disaster recovery objectives.

Enabling PITR/backup on one replica of a global table is often sufficient to meet DR objectives. There's usually no need to take backups redundantly on every replica (PITR settings aren't synchronized between replicas and are managed individually, so decide explicitly which replica takes them). In my practice, I design this in two stages: PITR for recent logical corruption (with a narrower RecoveryPeriodInDays for tables where short retention is enough), and on-demand backup + AWS Backup for fixed points needed for regulation and audit.


6. Cost: the concept of replicated write units

Global Tables' cost becomes calculable once you understand how the way writes are counted changes compared with a single region. The basics of reads and storage are left to the Capacity, Cost, and Performance Design Guide; here I cover only the multi-region-specific differences.

The official billing model. When you add a replica to a single-region table to make it a global table, the billing unit for writes changes.

Replicated Write Request Units (rWRUs) for on-demand capacity mode, where one rWRU per replica table is charged for each write up to 1KB

  • Billed in replicated write units (rWRU = on-demand / rWCU = provisioned). For each write up to 1KB, 1 unit per replica table.
  • The unit price is the same as the single-region write unit (WRU/WCU). It neither becomes cheaper nor more expensive; the structure is that "you're billed for the number of regions."
  • rWRU/rWCU billing occurs in all regions holding a replica.
  • On top of this, a cross-region data transfer fee is incurred for the inter-region replication.
  • Storage is per region (because each replica holds all the data, effectively storage × number of replicas).
  • GSI updates are billed in normal WRU/WCU (even for a GSI on a replica table, it's the normal unit, not the r unit).
  • There's no difference in the rWRU/rWCU unit price by consistency mode (MREC and MRSC are the same). An MRSC witness is billed for none of rWRU/rWCU, storage, or replication data transfer.

The official billing example (writes only)

Reading the official example, the cost structure is clear at a glance.

  • Day 1 (single region): write 100 1KB items to us-west-2 → 100 WRU.
  • Day 2 (globalized by adding a replica): add a replica to us-east-2. Write 150 items to us-west-2 → 150 rWRU to us-west-2 + 150 rWRU to us-east-2 = 300 rWRU total.

The moment you make it a 2-region configuration, the same 150 writes become "300 write-unit equivalent." This is Global Tables' biggest cost factor. Write cost is roughly proportional to the number of replica regions, plus cross-region transfer and region × storage on top.
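The official example reduces to a one-line formula: writes × units per write × number of replica tables. The sketch below reproduces it; the function name is illustrative, and it counts write units only, leaving out cross-region transfer, storage, and GSI writes.

```typescript
// Write units billed for a table, following the rule quoted above:
// one unit per replica table for each write of up to 1KB.
// Counts full replicas only (an MRSC witness is not billed for writes).
function writeUnitsBilled(writeCount: number, itemSizeBytes: number, replicaCount: number): number {
  if (writeCount < 0 || itemSizeBytes < 0 || replicaCount < 1) {
    throw new RangeError("writeCount and itemSizeBytes must be >= 0, replicaCount >= 1");
  }
  // Each started 1KB of item size costs one unit, with a minimum of one.
  const unitsPerWrite = Math.max(1, Math.ceil(itemSizeBytes / 1024));
  return writeCount * unitsPerWrite * replicaCount;
}

// Day 1: single region, 100 writes of 1KB items.
const day1Units = writeUnitsBilled(100, 1024, 1); // 100
// Day 2: two replicas, 150 writes of 1KB items.
const day2Units = writeUnitsBilled(150, 1024, 2); // 300
```

Plugging in your own write volume and item size with `replicaCount` set to 3 gives a quick first estimate of what adding a third region would cost in write units.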

A rough estimation guideline: "making it a 3-region global table makes write cost about 3× a single region + cross-region transfer + storage × 3." Reads are local reads in each region, so low-latency, but the more write-heavy the workload, the more multi-region's cost increase takes effect.

That's exactly why "don't unconditionally globalize all tables" is the iron rule. Make only the tables you can explain as "can't stop on a region failure / need global low latency" into Global Tables, and for the rest, single region + (if needed) cross-region PITR restore — this line-drawing is superior in cost-efficiency.


7. Pitfalls: LWW update loss, idempotency, clocks, uniqueness, replication lag

Global Tables is powerful, but the difficulty of distribution doesn't disappear; it just moves to AWS's layer. Let me list the pitfalls the designer must keep taking on.

(1) MREC's last-writer-wins = update loss (most important)

As in Chapter 2, in MREC a simultaneous multi-region update to the same item silently erases one side with last-writer-wins, and moreover isn't recorded in CloudWatch/CloudTrail. Writing a read-then-increment/decrement value like a balance, inventory, or counter from multiple regions is dangerous in MREC.

The countermeasure is the write region affinity (home Region pinning) that the official design guide recommends. Use the routing layer to make sure a given entity's writes always happen in the same single home region, while reads come from any local region (low latency). The situation in which the same item is written simultaneously in two regions then never arises, so LWW never comes into play. Don't naively operate on the assumption that "it's multi-active, so I can write anything from any region"; this is the cornerstone of MREC operation. If you have no choice but to update the same item globally with strong consistency, that's where MRSC comes in.

(2) Idempotency is still the app's responsibility

During failover on a region failure, retries and resends increase. Global Tables does not provide idempotency. Patterns like create-exactly-once with attribute_not_exists, or an idempotency key + a saved response, are unchanged and still needed in multi-region too (details of idempotency patterns). Think of failover as the moment idempotency is tested.
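As a reminder of what the create-exactly-once pattern looks like, here is a minimal sketch that builds the request for an idempotency record. The object shape follows what PutCommand from @aws-sdk/lib-dynamodb accepts; the table name, key layout, and attribute names are assumptions for illustration.

```typescript
// Builds the input for a create-exactly-once write. The table name, key
// layout, and attribute names are assumptions for illustration.
interface IdempotentPutInput {
  TableName: string;
  Item: Record<string, unknown>;
  ConditionExpression: string;
}

function buildIdempotentPut(
  tableName: string,
  idempotencyKey: string,
  response: Record<string, unknown>,
): IdempotentPutInput {
  return {
    TableName: tableName,
    Item: {
      PK: `IDEMPOTENCY#${idempotencyKey}`,
      SK: "RESULT",
      response, // saved so a retry can return the original result
    },
    // The write fails with ConditionalCheckFailedException if the key was already used.
    ConditionExpression: "attribute_not_exists(PK)",
  };
}

// A failed condition means "already processed", not an error to surface.
function isDuplicateRequest(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { name?: unknown }).name === "ConditionalCheckFailedException"
  );
}
```

In MREC the condition is evaluated against the local replica only, so this guard is reliable only when all attempts for a given idempotency key reach the same region. That is one more reason to pin an entity's writes to its home region.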

(3) Transactions are atomic only within a region (MREC)

This is an easy-to-overlook critical spec. In MREC,

Transactional writes are not replicated as a unit across Regions, meaning only some of the writes in a transaction may be returned by read operations in other replicas at a given point in time.

TransactWriteItems is atomic only within the region that executed it, and is not replicated as a unit across regions. Multiple items transacted in one region may be temporarily observed as partially applied in another region. Misunderstand it as "transactions are atomic in all regions," and you pick up a half-baked state on another region's read. As for MRSC, transactions themselves are unsupported.

(4) Clocks and timestamps

MREC's LWW is based on the internal timestamp. This is a private system property managed by DynamoDB, not the app's updatedAt. That said, combining a conditional write (optimistic lock) with an app-side version number or updatedAt lets you detect an unexpected overwrite early. In multi-region, not writing "distributed logic that trusts the local clock" is the safe side.
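The optimistic lock mentioned above can be sketched as a conditional update on an app-managed version number. The object shape follows what UpdateCommand from @aws-sdk/lib-dynamodb accepts; the attribute names `version` and `payload` and the function name are hypothetical.

```typescript
// Builds a conditional update that succeeds only if the stored version still
// equals the version the caller read. Attribute names are illustrative.
interface VersionedUpdateInput {
  TableName: string;
  Key: Record<string, string>;
  UpdateExpression: string;
  ConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, unknown>;
}

function buildVersionedUpdate(
  tableName: string,
  key: Record<string, string>,
  expectedVersion: number,
  payload: unknown,
): VersionedUpdateInput {
  return {
    TableName: tableName,
    Key: key,
    UpdateExpression: "SET #payload = :payload, #version = :next",
    // Rejects the write if someone else bumped the version in the meantime.
    ConditionExpression: "#version = :expected",
    ExpressionAttributeNames: { "#payload": "payload", "#version": "version" },
    ExpressionAttributeValues: {
      ":payload": payload,
      ":expected": expectedVersion,
      ":next": expectedVersion + 1,
    },
  };
}
```

The version number is a logical counter, so it doesn't depend on any machine's clock. In MREC the condition is checked against the local replica, which means it catches overwrites within a region and late-arriving stale writers, but it does not replace home-region pinning for truly simultaneous multi-region writes.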

(5) Global uniqueness

DynamoDB's uniqueness guarantee rests on an item-level conditional write (attribute_not_exists), but in MREC the condition is evaluated region-locally. If two regions try to create the same new key simultaneously, both succeed locally, and replication then converges them to one side via LWW (one creation is effectively lost). If you need strict global uniqueness, pin creation of that key to a single home region, or use MRSC. "Two creations that each appeared to succeed collapsed into a single item after replication" is a typical accident of MREC with multi-region writes.

(6) Design on the premise of replication lag

MREC typically propagates within 1 second, but there's no SLA, and the lag varies with inter-region distance and workload. Monitor the ReplicationLatency metric (per region pair), and operate so that traffic is steered to another region when it rises. In the app, avoid designs that write in one region and immediately read in another expecting the latest value; align the read and write regions so that the app reads in the region it wrote to (satisfying read-your-writes locally).
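The "steer traffic away when lag rises" rule needs a concrete trigger. One simple option is to require several consecutive datapoints above a threshold before acting, so that a single spike doesn't cause a region switch. The sketch below assumes the ReplicationLatency datapoints have already been fetched; the type, the function name, and any threshold you pass in are illustrative, not AWS recommendations.

```typescript
// Decides whether replication lag has stayed high long enough to act on.
// Input is assumed to be already-fetched ReplicationLatency datapoints for
// one region pair; thresholds are the caller's choice.
interface LagDatapoint {
  timestampMs: number;      // epoch milliseconds of the datapoint
  averageLatencyMs: number; // average replication latency in that period
}

function lagExceededConsecutively(
  datapoints: LagDatapoint[],
  thresholdMs: number,
  consecutiveRequired: number,
): boolean {
  if (consecutiveRequired < 1 || datapoints.length < consecutiveRequired) return false;
  // Look only at the most recent N datapoints, ordered by time.
  const recent = datapoints
    .slice()
    .sort((a, b) => a.timestampMs - b.timestampMs)
    .slice(-consecutiveRequired);
  return recent.every((p) => p.averageLatencyMs > thresholdMs);
}
```

In practice the same rule can be expressed as a CloudWatch alarm with multiple evaluation periods; the function form is mainly useful for testing the policy before wiring it to routing changes.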


8. Terraform: a multi-region global table (MREC)

Let me drop the design decisions so far into IaC. Adding a region with aws_dynamodb_table's replica block makes that table a global table (current version 2019.11.21). PITR is per replica, and I make billing on-demand to keep "only what you use + the number of regions" straightforward.

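# Use a separate provider alias per region (cross-region management)
provider "aws" {
  alias  = "tokyo"
  region = "ap-northeast-1"
}

# The "defining" region of the global table. Other regions are added with replica blocks.
# Official best practice: in CloudFormation/Terraform, keep all replica definitions
# in one base region and do not spread them across per-region stacks (avoids drift).
resource "aws_dynamodb_table" "app" {
  provider = aws.tokyo

  name         = "AppTable"
  billing_mode = "PAY_PER_REQUEST" # On-demand: billed in rWRU for "what you use × number of regions"
  hash_key     = "PK"
  range_key    = "SK"

  # Multi-active replication needs Streams (NEW_AND_OLD_IMAGES is the standard choice)
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"

  attribute {
    name = "PK"
    type = "S"
  }
  attribute {
    name = "SK"
    type = "S"
  }

  # Add replicas in Osaka and Oregon, which makes this a global table (MREC)
  # PITR and deletion protection are not synchronized between replicas, so set them explicitly on each replica
  replica {
    region_name            = "ap-northeast-3" # Osaka: geo-redundancy within Japan
    point_in_time_recovery = true
  }
  replica {
    region_name            = "us-west-2" # Oregon: geo-redundancy on another continent
    point_in_time_recovery = true
  }

  # PITR for the defining region
  point_in_time_recovery {
    enabled = true
  }

  # Prevent accidental deletion of the production table (must be enabled individually on each replica)
  deletion_protection_enabled = true

  ttl {
    attribute_name = "expiresAt"
    enabled        = true # Note: replicated TTL deletes consume rWRU/write capacity on the replica side
  }

  tags = {
    Environment = "production"
    DR          = "multi-region-mrec"
  }

  # Replica capacity settings are synchronized globally, so note that
  # changes to billing_mode or the key definition propagate to all replicas.
  lifecycle {
    # Only if you want to avoid conflicts with replicas added or removed manually.
    # Terraform then ignores later edits to the replica blocks above.
    ignore_changes = [replica]
  }
}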

Design caveat: MRSC currently has constraints that can't be expressed by a naive replica-block specification (exactly 3 regions, limited to Region sets, empty table, witness), and its handling in Terraform/console differs from MREC. If you adopt MRSC, always check the constraints of the consistency mode and the 3-region configuration before provisioning (this article's Terraform is an MREC configuration). The point that TTL's replicated deletion consumes write units on the replica side should also be included in capacity estimation if provisioned.


FAQ

Q1. Is Global Tables' consistency strong or weak?
It depends on the mode. The default MREC is eventually consistent, and changes propagate to other regions asynchronously, typically within 1 second (not globally strongly consistent). MRSC (introduced June 2025) is strongly consistent, and a write is synchronously replicated to another region before success, returning the latest on any replica. In exchange, MRSC's write latency rises, and it has constraints like being 3-regions-fixed.

Q2. How are conflicts (simultaneous updates) resolved?
MREC is last-writer-wins: the one with a newer internal timestamp wins, and the losing update is silently discarded and isn't recorded in CloudWatch/CloudTrail either. To avoid update loss from this, drawing writes to a single home region (region affinity) is the standard play. MRSC is not last-writer-wins; it makes a simultaneous update explicitly fail with ReplicatedWriteConflictException and resolves it with a retry.

Q3. What are DR's RPO/RTO?
MREC has RPO of the replication lag (typically a few seconds, not zero), and RTO of seconds~minutes (actually decided by the app's region-routing switch speed). MRSC has RPO zero. Furthermore, even a single region transparently withstands AZ failure with 3-AZ auto-replication. The RPO/RTO against logical destruction (accidental deletion, bugs) is decided not by Global Tables but by PITR/backup restore.

Q4. How many days can PITR retain? What's the difference from on-demand backup?
PITR can restore to any point in time at per-second granularity for up to 35 days, with retention configurable 1–35 days. You can go back from 5 minutes before now to within the retention period. On-demand backup is a snapshot, retained indefinitely until deleted, for long-term retention and regulatory archives. For both, restore is to a new table, with no impact on production performance. Use PITR for recent accidents, backups for milestone fixed points.

Q5. How many times the single-region cost does it become?
Writes are billed in replicated write units (rWRU/rWCU), for the number of replica regions. The unit price is the same as a single region, so for 3 regions, write cost is roughly 3× + cross-region transfer fee + storage × number of regions. Reads are local reads in each region. The r unit price doesn't change by consistency mode, and an MRSC witness is billed for none of rWRU, storage, or transfer. The more write-heavy the table, the larger the increment, so narrow globalization to only the necessary tables.

Q6. Is region-failure failover automatic?
DynamoDB's data layer is automatically synchronized to all regions, but the switch of "which region to connect to" is not automatic. DynamoDB has no global endpoint, and the app hits its own region's local endpoint. Failover is achieved by running the app stack in another region too and switching the entrance with Route 53 / Global Accelerator, etc. The iron rule is to not implement it with the SDK's cross-region retry.

Q7. Should all tables be Global Tables?
No. Because writes are billed for the number of regions and storage also costs for the number of regions, cost structurally increases. Globalize only the tables you can explain as "can't stop on a region failure" or "need global low latency," and for the rest, use single region + cross-region PITR restore as needed. This line-drawing is superior in cost-efficiency.


Closing: buy availability with "design," protect consistency with "understanding"

DynamoDB's multi-region/DR is achieved not by writing replication in the app but by the design choice of Global Tables. Let me condense the key points.

  • Buy availability and global performance with replication design: on top of a single region's 3-AZ auto-replication (SLA 99.99%), add region resilience (99.999%) with Global Tables.
  • Decide the consistency model first: the default MREC (eventual, low latency, RPO a few seconds, LWW), or MRSC (strong consistency, RPO zero, 3-regions-fixed, many constraints). Unchangeable after creation.
  • Infrastructure failure → Global Tables, logical failure → PITR/backup: both needed. Global Tables faithfully replicates a broken write too.
  • Failover is the app layer's responsibility: the routing switch decides RTO. Don't make cross-region calls.
  • Seal MREC's update loss with region affinity: draw the same item's writes to a single home region. MRSC only where global strongly consistent updates are needed.
  • Cost is proportional to the number of regions: writes are for the number of regions in replicated units, and storage is for the number of regions too. Narrow globalization to the necessary tables.

In a one-person × generative-AI (Claude Code) setup, I have thoroughly practiced an approach that passes design judgments through human verification gates, implementing and operating a serverless payment platform with 0 double charges in production. From the selection of the consistency model, region routing and failover, RTO/RPO-based PITR/backup design, to estimating replication cost — I'll work out DynamoDB's multi-region/DR design with you, matched to your requirements.

If you're struggling with multi-region / DR design, consult us via Contact. Let's start by sorting out the RTO/RPO you must protect, and the selection of which tables should truly be globalized.

友田 陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
