# PostgreSQL streaming replication and high availability (HA, sync/async, read replicas, failover, v18)

> A practical guide to achieving PostgreSQL high availability (HA) with streaming replication. Faithful to the official docs, it explains building physical replication, the durability-vs-latency trade-off of sync/async (synchronous_commit), read replicas and replication conflicts on a hot standby, the fact that the PostgreSQL core doesn't provide automatic failover and STONITH, WAL retention via replication slots, and PostgreSQL 18 improvements.

- Published: 2026-06-23
- Author: 友田 陽大
- Tags: PostgreSQL, アーキテクチャ設計
- URL: https://tomodahinata.com/en/blog/postgresql-streaming-replication-high-availability-failover-guide
- Category: PostgreSQL operations & reliability
- Pillar guide: https://tomodahinata.com/en/blog/postgresql-production-operations-guide

## Key points

- Physical streaming replication = continuously stream the primary's WAL to a standby and keep a nearly-latest replica waiting. The premise is the whole cluster and the same major version.
- Sync/async is a durability-vs-latency trade-off. The default is async (synchronous_standby_names empty). synchronous_commit gets stronger and slower in the order off→local→remote_write→on→remote_apply.
- A hot standby can offload reads, but the standby's long-running queries conflict with the primary's VACUUM. Adjust with max_standby_streaming_delay or hot_standby_feedback (the latter has a primary-bloat side effect).
- Most important: the PostgreSQL core doesn't provide automatic failover (officially stated). Build detection, promotion, IP switching, and split-brain prevention (STONITH) with an external tool like Patroni, or managed.
- Replication slots are the safe way to retain WAL, but risk pg_wal growing unbounded while the standby is down. Cap it with max_slot_wal_keep_size. PG18 adds idle_replication_slot_timeout, etc.

---

Servers always break. Disks die, AZs go down, kernels panic. **Betting everything on a single PostgreSQL** is the same as signing a contract to "pay in downtime" for the failure that will surely come.

The basis of high availability (HA) is **replication** — always have a nearly-latest replica on another server and switch over when it counts. This article explains how to build HA with PostgreSQL physical streaming replication and **the reality of the most-misunderstood "automatic failover,"** faithful to the official docs. It's a deep dive of [production-operations guide §3](/blog/postgresql-production-operations-guide).

> **Rules for this article**: replication specs, parameter defaults, and PostgreSQL 18 changes are based on the **PostgreSQL 18 official documentation (as of June 2026).** Failover-automation tools (Patroni, etc.) are within the scope the official docs collectively call "external tools," with specific names noted as community knowledge.

---

## 1. HA options: physical vs logical vs shared disk

PostgreSQL HA has multiple approaches (the official comparison).

| Method | Mechanism | Suited for | Constraint |
| --- | --- | --- | --- |
| **Physical streaming** | WAL to the standby byte-by-byte | Whole-cluster HA/DR, read replicas | Same major version, whole-replication only |
| **Logical replication** | Row changes via publish/subscribe | Table-level, **cross-version**, bidirectional | Needs conflict resolution, DDL not replicated ([separate article](/blog/postgresql-logical-replication-cdc-zero-downtime-upgrade-guide)) |
| **Shared disk** | Share a single storage | — | The storage itself is a SPOF, no replication |

This article's main subject is **physical streaming replication.** The official explanation:

> Streaming replication allows a standby server to stay more up-to-date than is possible with file-based log shipping. The standby connects to the primary, which **streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled.**

It's "nearly real-time, byte-level replication." Being physical, the standby is **block-for-block identical** to the primary — the same major version is the premise.

---

## 2. Building physical streaming replication

The flow of the minimal setup.

### 2.1 Primary-side configuration

```ini
# primary postgresql.conf
wal_level = replica            # replica 以上（既定）
max_wal_senders = 10           # 同時WAL送信プロセス数（既定 10）
max_replication_slots = 10     # レプリケーションスロット数（既定 10）
```

```sql
-- 専用のレプリケーションロール（公式推奨：REPLICATION + LOGIN の最小権限）
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD '***';
-- スロットを作っておくと、スタンバイが遅れてもWALを保持してくれる（§5）
SELECT pg_create_physical_replication_slot('standby1');
```

```text
# primary pg_hba.conf（replication は専用キーワード。TLS強制＋SCRAM）
hostssl replication replicator 10.0.0.0/24 scram-sha-256
```

### 2.2 Building the standby

```bash
# プライマリからベースバックアップを取得し、復旧設定まで書き出す（-R）
pg_basebackup -h primary.internal -U replicator -D /var/lib/postgresql/18/main \
  -Ft -z -X stream -c fast -R -S standby1
#   -R: standby.signal と primary_conninfo を自動生成 / -S: 上で作ったスロットを使う
```

The `postgresql.auto.conf` that `-R` generates is written like this.

```ini
primary_conninfo = 'host=primary.internal user=replicator passfile=... sslmode=verify-full'
primary_slot_name = 'standby1'
```

And because there's a `standby.signal` file, on startup it comes up in standby mode and keeps receiving the primary's WAL. Thanks to `hot_standby` (default `on`), **you can run read queries on the standby** (§4).

```bash
pg_ctl -D /var/lib/postgresql/18/main start
```

---

## 3. Sync vs async: a durability-vs-latency trade-off

"Whether to wait for the standby's replication before returning a commit" is **sync/async.** The default is **async** — officially, "if no synchronous standby name is specified in `synchronous_standby_names`, synchronous replication is disabled and commits don't wait for replication. **This is the default.**"

Choose the durability level with the `synchronous_commit` value (default `on`). The lower you go, the **stronger and slower.**

| Value | What the commit waits for | What's protected |
| --- | --- | --- |
| `off` | Waits for nothing (not even local flush) | Fastest. Can lose the latest few on a crash (no corruption) |
| `local` | Only local disk flush | Single-node durability |
| `remote_write` | The standby **receives and OS-writes** the WAL | Survives a standby PostgreSQL crash (an OS crash is separate) |
| `on` (default) | The standby **disk-flushes** the WAL | Not lost unless the primary + all sync standbys are corrupted simultaneously |
| `remote_apply` | The standby **applies (makes visible)** the WAL | Immediately reflected in the standby's queries. Slowest |

```ini
# 「1件も失えない」要件：同期レプリケーション
synchronous_standby_names = 'standby1'   # このスタンバイの確認を待つ
synchronous_commit = on                   # 受信＋flush を待つ
```

> **Design decision**: if "commit = absolutely never lost" is a requirement, like finance/payments, sync (`on` or higher). General web apps are often fine with **async** (latency-first, accepting a slight data-loss risk). `remote_apply` is only when you want immediate reflection on the read replica — the commit delay grows large.

---

## 4. Read replicas: offload reads (and its price)

A hot standby can take **read-only queries.** Official: "when `hot_standby` is true, it accepts connections once recovery reaches a consistent state. **All connections are strictly read-only** (not even writing to a temp table)." The standard configuration of offloading heavy aggregations/reports to the replica and letting the primary concentrate on writes.

But there's the inherent problem of **replication conflicts.** Official: "the most common cause of conflict between standby queries and WAL replay is 'early cleanup.' … The primary's cleanup (VACUUM) **may delete row versions that a standby's transaction can still see.**"

The standby is forced to either "delay WAL replay" or "**cancel the conflicting query.**" Adjust with two knobs.

```ini
# ノブ1：競合時、どれだけWAL適用を遅らせるか（超えたらクエリをキャンセル）
max_standby_streaming_delay = 30s

# ノブ2：スタンバイの実行中クエリをプライマリにフィードバックし、VACUUMの早期クリーンを抑止
hot_standby_feedback = on    # 既定 off
```

The official warning: `hot_standby_feedback = on` can eliminate conflicts but "**delays cleanup of dead rows on the primary, potentially causing table bloat.**" It's a trade-off — **`hot_standby_feedback` if you run long analytics queries on the replica, `max_standby_streaming_delay` if you want to avoid primary bloat** — design the acceptable delay.

---

## 5. The WAL-retention safety device: slots vs wal_keep_size

If the primary deletes WAL the standby needs while the standby is behind/down, **replication breaks.** Two mechanisms prevent this.

- **`wal_keep_size`** (default 0): the **minimum size** to retain in `pg_wal`. A standby that falls behind beyond this loses the needed WAL and disconnects. Just a fixed buffer.
- **Replication slot**: **reliably retains** WAL until the standby consumes it. Safer.

The slot is safer but has a **fatal side effect.** Official (`max_slot_wal_keep_size`, default -1): "if -1, a replication slot **may retain unlimited WAL.**" That is, **leaving a standby down causes the primary's `pg_wal` to swell unboundedly and eat the disk.**

```ini
# スロットは使うが、保持上限を設けて pg_wal の暴走を防ぐ
max_slot_wal_keep_size = 64GB   # これを超えたら、遅れたスロットは無効化（複製は切れるがDBは守る）
```

> **The iron rule of operations**: a slot is "convenient but dangerous." **Always set a cap with `max_slot_wal_keep_size`** and monitor the lag in `pg_replication_slots`. PostgreSQL 18 added **`idle_replication_slot_timeout`**, letting you auto-invalidate slots idle for a long time.

---

## 6. Failover: PostgreSQL doesn't do it automatically

This is **the most misunderstood point.** The operation to "promote" a standby itself is easy.

```sql
-- スタンバイをプライマリに昇格（手動フェイルオーバー）
SELECT pg_promote();
-- または: pg_ctl promote -D <datadir>
```

But **who decides "when to promote"?** The official docs state clearly.

> PostgreSQL does **not provide the system software required** to detect a primary failure and notify the standby.

That is, **automatic failover isn't in the core.** Plus the danger of split-brain (both think they're the primary):

> A mechanism is needed to tell the old primary "you're no longer the primary." This is known as **STONITH (Shoot The Other Node In The Head)**, and is needed to avoid a situation where both systems think they're the primary, leading to **data loss.**

The official docs only state that "many such tools exist and are well-integrated with OS features (like IP-address migration)." In practice, you build it with the following external tools (community standard):

- **Patroni** (integrates with etcd/Consul, the de facto)
- **repmgr**, **pg_auto_failover**

> **This is managed's biggest value** ([operations guide §8](/blog/postgresql-production-operations-guide)). RDS/Aurora Multi-AZ does this "**failure detection + automatic promotion + endpoint switching + STONITH-equivalent**" for you. For self-operation, the real cost is **building and operating this automatic-failover foundation**, more than building replication.

---

## 7. Monitoring: watch the lag and the slots

```sql
-- プライマリ側：各スタンバイの遅延（write/flush/replay lag）
SELECT application_name, state, sync_state,
       write_lag, flush_lag, replay_lag
FROM pg_stat_replication;

-- スロットの滞留（restart_lsn が進まない＝WALが溜まる）
SELECT slot_name, active, wal_status,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```

`replay_lag` continuously increasing = the standby can't keep up (insufficient I/O or replication conflicts). `retained_wal` keeps increasing = the slot is stuck and `pg_wal` is swelling — always set a cap and alerts.

---

## 8. Summary

- **Build HA with physical streaming replication**: `wal_level=replica`, a dedicated role, `pg_basebackup -R`, a slot.
- **Sync/async is a durability-vs-latency trade-off.** The default is async. Choose `synchronous_commit` by requirement.
- **Read replicas** can offload reads, but watch for **replication conflicts** (`max_standby_streaming_delay` / `hot_standby_feedback`).
- **The core has no automatic failover** (officially stated). Do detection, promotion, and STONITH with **an external tool like Patroni, or managed.**
- **Slots are safe but can run away** — set a cap with `max_slot_wal_keep_size` and monitor the lag.

Once "don't stop on failure" is settled, next is "**don't stop on change**" — non-stop schema changes, and cross-version [logical replication & zero-downtime upgrade](/blog/postgresql-logical-replication-cdc-zero-downtime-upgrade-guide).

---

### References (PostgreSQL 18 official documentation)

- [26.1. Comparison of Different Solutions](https://www.postgresql.org/docs/18/different-replication-solutions.html)
- [26.2. Log-Shipping Standby Servers (building streaming)](https://www.postgresql.org/docs/18/warm-standby.html)
- [26.3. Failover](https://www.postgresql.org/docs/18/warm-standby-failover.html)
- [26.4. Hot Standby](https://www.postgresql.org/docs/18/hot-standby.html)
- [19.6. Replication (synchronous_commit / slots)](https://www.postgresql.org/docs/18/runtime-config-replication.html)
