# Practical PostgreSQL index design (B-tree / GIN / GiST / BRIN, composite column order, covering, partial, expression indexes, CONCURRENTLY)

> A practical guide so you don't get lost in 'which type, in which order, how far to index' for PostgreSQL indexes. Faithful to the official docs, with real code it explains the proper use of B-tree/Hash/GiST/SP-GiST/GIN/BRIN, the iron rule of composite-index column order, covering indexes via INCLUDE, partial and expression indexes, the non-stopping CREATE INDEX CONCURRENTLY, and PostgreSQL 18's B-tree skip scan.

- Published: 2026-06-25
- Author: 友田 陽大
- Tags: PostgreSQL, パフォーマンス, アーキテクチャ設計
- URL: https://tomodahinata.com/en/blog/postgresql-index-design-btree-gin-gist-brin-covering-guide
- Category: PostgreSQL internals & performance
- Pillar guide: https://tomodahinata.com/en/blog/postgresql-performance-tuning-production-guide

## Key points

- Type selection: B-tree for equality/range/sorting, GIN for arrays/JSONB/full-text search, GiST for geometry/ranges/nearest-neighbor, BRIN for huge tables with aligned physical order. Hash is equality-only with limited use.
- The iron rule of composite B-tree: the equality-narrowing column first, the range-narrowing column next. 4+ columns is mostly waste. PostgreSQL 18's skip scan increased cases where it works even without an equality on the leading column.
- A covering index (INCLUDE) + index-only scan 'answers with the index alone,' but if the visibility map isn't all-visible it goes to look at the heap. It works on tables where VACUUM is effective.
- Partial indexes work for 'don't index common values' and 'unique constraint on only a subset.' Expression indexes index the result of a computation like lower(col) (a trade-off with the cost of recomputing on each INSERT/update).
- Adding an index in production is CREATE INDEX CONCURRENTLY (doesn't stop writes). On failure an INVALID index remains, so DROP and rerun or REINDEX CONCURRENTLY.

---

An index is not "magic that makes things fast." It's a trade-off that **speeds up reads in exchange for slowing writes, eating capacity, and inviting bloat.** So "index every column" is the worst move. The correct answer is "**index only the predicates the top `pg_stat_statements` queries use, with the appropriate type.**"

This article organizes that "appropriate type," "appropriate column order," and "appropriate scope," faithful to the PostgreSQL official documentation and with practical decision criteria. It's the first layer of the "order of effectiveness" shown in §1 of the [performance overview](/blog/postgresql-performance-tuning-production-guide) — the area with the highest cost-effectiveness.

> **Rules for this article**: index types' characteristics, supported operators, SQL syntax, and PostgreSQL 18's new features are all based on the **PostgreSQL 18 official documentation (as of June 2026).** Because which index the planner actually uses is decided by cost and statistics, **always confirm the final judgment with `EXPLAIN (ANALYZE, BUFFERS)`.**

---

## 1. First choose the type: the proper use of 6 kinds

PostgreSQL has six index access methods. 90% of cases are B-tree, but for "problems B-tree can't solve," the other types work.

| Type | Good query | Supported operators (excerpt) | When to use |
| --- | --- | --- | --- |
| **B-tree** | equality, range, sorting | `< <= = >= >`, `BETWEEN`, `IN`, `IS NULL`, `LIKE 'foo%'` | default. most columns are this |
| **Hash** | equality only | `=` | huge values with equality only. Limited use |
| **GiST** | geometry, ranges, nearest-neighbor (KNN), exclusion constraints | `<<`, `&&`, `@>`, `<->`, etc. | location info, `tsrange` overlap, `EXCLUDE` |
| **SP-GiST** | unbalanced tree structures, points | varies by class | quadtrees, k-d trees, radix trees (trie) |
| **GIN** | searching the contents of composite values | `@>`, `?`, `&&`, etc. | arrays, **JSONB**, full-text search `tsvector` |
| **BRIN** | huge tables with aligned physical order | `< <= = >= >` | time series, append-only logs. The index is tiny |

### B-tree (default) — the all-rounder for equality/range/sorting

The supported operators the official lists are `< <= = >= >`. In addition, `BETWEEN`, `IN`, `IS NULL`, `IS NOT NULL` can be treated as equivalent constructs. Two more important points.

- **Prefix-match `LIKE`/regex**: if **the prefix is fixed by a constant** like `col LIKE 'foo%'` or `col ~ '^foo'`, the index works. `LIKE '%bar'` (suffix match) doesn't work.
- **Sorting**: B-tree preserves sort order, so it can satisfy `ORDER BY` with the index (the sort processing can be omitted).

> **Note (locale)**: outside the C locale, making a prefix-match `LIKE` use an index requires the dedicated operator class **`text_pattern_ops`.**

```sql
-- 等値・範囲は素直に B-tree
CREATE INDEX idx_orders_created_at ON orders (created_at);

-- 前方一致検索を効かせる（非Cロケールでは text_pattern_ops が必要）
CREATE INDEX idx_users_email_prefix ON users (email text_pattern_ops);
-- これで WHERE email LIKE 'tanaka%' が索引を使える
```

### Hash — equality-only, B-tree is mostly enough

Hash stores a 32-bit hash of the value and does **equality comparison only.** No range, no sorting. WAL support (crash-safe) came in PG10, but except in the limited situation of "wanting to make it smaller than B-tree on a huge column used only for equality," B-tree is enough in practice.

### GIN — search "multiple values in one row" (arrays, JSONB, full-text)

GIN is an **inverted index**, for data where "one value contains multiple components." It speeds up array containment `@>`, JSONB key existence `?` or containment `@>`, and full-text search `tsvector @@ tsquery`. JSONB index design is deep, so it's detailed in the [JSONB article](/blog/postgresql-jsonb-operators-gin-index-design-guide).

```sql
-- タグ配列の「このタグを含む行」を高速化
CREATE INDEX idx_articles_tags ON articles USING gin (tags);
-- WHERE tags @> ARRAY['postgresql'] が GIN を使う

-- 全文検索（日本語は専用設定が必要だが構造は同じ）
CREATE INDEX idx_docs_fts ON docs USING gin (to_tsvector('english', body));
```

> GIN has **higher build and update cost** (updates are buffered in an internal pending list). In PostgreSQL 18, **parallel build of GIN indexes** came in, making creation on huge tables faster.

### BRIN — the "almost-free index" for time series / append logs

BRIN holds only **a summary of the min/max value** per range of consecutive physical blocks. The index size is hundreds to thousands of times smaller than a B-tree. But it works only when **the column's value is strongly correlated with the physical row order** — the typical case is "a time-series table appended in ascending `created_at` order."

```sql
-- 数億行の時系列ログ。created_at が物理順とほぼ一致するなら BRIN が劇的に小さく速い
CREATE INDEX idx_events_ts_brin ON events USING brin (created_at);
```

---

## 2. Composite-index column order: 90% get this wrong

A multicolumn B-tree (up to 32 columns, including `INCLUDE`) lives and dies by **column order.** Citing the official iron rule.

> A multicolumn index is **most efficient when there are constraints on the leading (left-side) columns.** The exact rule is that **an equality constraint on leading columns** plus **the inequality (range) constraint on the first column that doesn't have an equality constraint** narrows the index range scanned.

In plain terms — **"the equality (`=`)-narrowing columns first, the range (`<`, `>`, `BETWEEN`)-narrowing column last."** Columns after the range column don't contribute to narrowing (limiting the scan range) and can only be used for row filtering.

```sql
-- クエリ: WHERE tenant_id = $1 AND status = $2 AND created_at >= $3 ORDER BY created_at
-- 正しい列順：等値2つ → 範囲1つ（範囲かつ ORDER BY 対象を最後に）
CREATE INDEX idx_orders_lookup
  ON orders (tenant_id, status, created_at);
```

This index narrows the range with `tenant_id = ? AND status = ?` and handles a range scan with `created_at` plus providing the sort order. Conversely, in the order `(created_at, tenant_id, status)`, the equality on `tenant_id`/`status` can't narrow the scan range.

> **The official admonition**: "Use composite indexes sparingly. **An index of more than 3 columns is rarely useful.**" Rather than adding column after column, holding a few 2–3 column indexes matched to the workload works better.

### PostgreSQL 18: skip scan eases the column-order constraint

In PG18, **B-tree skip scan** came in, increasing the cases where a composite index can be used even "without an equality constraint on the leading column." The official explanation:

> When a B-tree index scan can apply the skip-scan optimization, it applies the constraints on all columns while iteratively probing the index, **even without an equality constraint on the leading column.** This can shrink the index range to read.

For example, with an index on `(x, y)`, even `WHERE y = 7700` (no condition on `x`) can leverage the index with dynamic equality probes for each value of `x`. **But it works more the lower `x`'s cardinality (number of distinct values).** A composite index with a low-cardinality column at the front rose in value in PG18.

### Match the sort order (ASC/DESC, NULLS) too

B-tree's default is **ascending, NULLS LAST** (`ASC` is `NULLS LAST`, `DESC` is `NULLS FIRST`). The issue is an **`ORDER BY` mixing ascending and descending per column.**

```sql
-- クエリ: ORDER BY priority ASC, created_at DESC
-- 既定の (priority, created_at) では DESC 混在を索引で満たせない。混在順を明示する
CREATE INDEX idx_tasks_board
  ON tasks (priority ASC, created_at DESC);
```

If a mixed-order `ORDER BY` is frequent, build the index in that order too. Otherwise the default is enough.

---

## 3. Covering indexes (INCLUDE) and index-only scan

A normal index scan is the two-stage "locate the row's position with the index → read the heap (the body) to get the column values." An **index-only scan** can skip reading the heap if all needed columns are in the index, greatly reducing I/O.

With the `INCLUDE` clause, you can bundle into the index "columns not used in search conditions but that you want to fetch" (**B-tree, GiST, SP-GiST only**).

```sql
-- WHERE customer_id = $1 で引いて id, amount を返すクエリを索引だけで完結させる
CREATE INDEX idx_orders_cust_covering
  ON orders (customer_id) INCLUDE (id, amount);
```

But there's **an important condition.** Per the official, even if an index-only scan holds, "**you actually gain only when a significant fraction of the table's heap pages have the 'all-visible' bit set.**"

Why. For MVCC, PostgreSQL manages "whether that row is visible to all transactions" with the **visibility map.** If the visibility map bit is set, visibility can be judged with the index alone, but if not, it goes to look at the heap after all (= no different from a normal index scan). What freshens the visibility map is **VACUUM.**

```sql
-- index-only scan が効いているかは EXPLAIN の "Heap Fetches" で分かる
EXPLAIN (ANALYZE) SELECT id, amount FROM orders WHERE customer_id = 42;
--  Index Only Scan ... Heap Fetches: 0   ← 0 が理想（ヒープを一度も見ていない）
```

> **The design implication**: a covering index shines on "**a table updated gently and well VACUUMed.**" Stuff payload columns into a heavily-updated table and the visibility map can't keep up, `Heap Fetches` increases, and the meaning thins out ([MVCC/VACUUM article](/blog/postgresql-mvcc-transaction-isolation-vacuum-autovacuum-guide)).

---

## 4. Partial indexes: narrow the index to "only the needed rows"

A partial index (with a `WHERE`) indexes **only the rows that meet the condition.** The index becomes smaller and the update cost also drops. Three standards the official lists:

**① Don't index common values**: a value that appears in several percent or more of rows won't have the index used anyway. Exclude it and the index becomes smaller.

```sql
-- 大半が status='done' のタスク表。未完了だけを索引（active な行は少数）
CREATE INDEX idx_tasks_open ON tasks (assignee_id)
  WHERE status <> 'done';
```

**② Exclude rows you don't care about**: exclude from the start rows that typical queries don't look at (logically deleted, etc.).

```sql
CREATE INDEX idx_users_active ON users (email)
  WHERE deleted_at IS NULL;
```

**③ Unique constraint on only a subset**: you can express "no duplicates only among the active rows."

```sql
-- 「成功した試行」の (subject, target) だけ一意にしたい
CREATE UNIQUE INDEX uq_tests_success
  ON tests (subject, target) WHERE success;

-- 実務例：1ユーザーにつき「有効なサブスクは1つだけ」を DB で保証する
CREATE UNIQUE INDEX uq_one_active_subscription
  ON subscriptions (user_id) WHERE status = 'active';
```

> ③ is a powerful pattern to **protect a business invariant with a DB constraint.** Without relying on an app if-statement, you can make "an illegal state unrepresentable" at the data layer.

---

## 5. Expression indexes: index the result of a computation

When you **process a column with a function to search** like `WHERE lower(email) = $1`, a plain `email` index can't be used. An expression index indexes the result of the computation.

```sql
-- 大小無視のメール検索を効かせる（式は二重括弧が必要なケースに注意）
CREATE INDEX idx_users_email_lower ON users (lower(email));
-- WHERE lower(email) = 'a@example.com' が索引を使う

-- 連結など関数呼び出し以外の式は二重括弧
CREATE INDEX idx_people_fullname
  ON people ((first_name || ' ' || last_name));
```

The trade-off is clear. Per the official, an expression index has "**high maintenance cost because the expression is recomputed on each INSERT and non-HOT update.**" On the other hand, **no recomputation at search time** (already stored). In other words, it's a tool to use when "**read speed > write speed.**"

---

## 6. Adding an index in production: CREATE INDEX CONCURRENTLY

A normal `CREATE INDEX` **locks writes to the target table.** Do this on a live production table and all INSERT/UPDATE/DELETE stop until the index is built — an accident.

`CONCURRENTLY` builds the index **without stopping writes.** It's an essential manner in production. But there are **three cautions** the official states.

1. **It scans twice so it's slow**: the total work is more than a normal build, taking far more time to complete.
2. **It can't be run inside a transaction block**: if the migration tool wraps each DDL in an implicit transaction, configuration is needed (below).
3. **On failure an INVALID index remains**: if it fails on a deadlock or unique-constraint violation, an "invalid index" remains. **`DROP INDEX` and rebuild**, or rebuild with `REINDEX INDEX CONCURRENTLY`.

```sql
-- 本番の稼働中テーブルに無停止で索引を追加
CREATE INDEX CONCURRENTLY idx_orders_status ON orders (status);

-- 失敗して INVALID が残っていないか確認
SELECT indexrelid::regclass AS index, indisvalid
FROM pg_index WHERE NOT indisvalid;

-- 無効な索引は消して再実行
DROP INDEX CONCURRENTLY IF EXISTS idx_orders_status;
```

An implementation example in a migration (each tool needs "run outside a transaction" specified).

```ts
// 例：node-pg-migrate / Drizzle などで CONCURRENTLY を使うときの考え方
// マイグレーションランナーの「トランザクションを張らない」オプションを有効にしたうえで実行する。
// （CONCURRENTLY はトランザクションブロック内では失敗するため）
export async function up(sql: (q: string) => Promise<unknown>) {
  await sql(`CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_status ON orders (status)`);
}
```

---

## 7. Measure "is it working" and "is there waste"

An index isn't done once made. Periodically inventory **whether it's used** and **whether it's duplicated.**

```sql
-- 使われていない索引（idx_scan が極端に少ない＝削除候補）
SELECT
  relname        AS table,
  indexrelname   AS index,
  idx_scan       AS scans,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC, pg_relation_size(indexrelid) DESC
LIMIT 20;
```

An unused index is **pure debt** that only eats capacity, slows writes, and bloats. An index whose `idx_scan` is nearly 0 is a deletion candidate (excluding primary keys and unique constraints).

> **B-tree's fillfactor**: B-tree's default fillfactor is **90.** On a heavily-updated table there's the option of lowering it with `WITH (fillfactor = 70)`, etc., to ease page splits, but first "delete unneeded indexes" and "fix the column order" work better.

---

## 8. Index-design checklist

Before adding a new index, answer these questions.

1. **Is that query in the top of `pg_stat_statements`?** If not, you don't need an index yet.
2. **Is the predicate SARGable?** `WHERE lower(col) = …` or `WHERE col + 1 = …` won't use a plain index → an expression index or query fix.
3. **If composite, is the column order "equality → range"?**
4. **Can you aim for an index-only scan including the fetched columns?** (`INCLUDE` if the table is gently updated)
5. **If you can narrow rows, can it be a partial index?**
6. **If production, did you build it with `CONCURRENTLY`?**
7. **After making it, did you confirm it was actually used with `EXPLAIN (ANALYZE, BUFFERS)`?**

---

## 9. Conclusion

- **Choose the type correctly**: B-tree for equality/range/sorting, GIN for arrays/JSONB/full-text, BRIN for huge time series. 90% is enough with B-tree.
- **Composite is the "equality → range" column order.** 4+ columns is mostly waste. PG18's skip scan partially rescues the no-leading-column-constraint case.
- **Covering indexes (INCLUDE) + index-only scan** shine on a well-VACUUMed table (aim for `Heap Fetches: 0`).
- **With partial and expression indexes**, index only "the needed rows, the needed shape." Have the DB protect business invariants with a unique partial index.
- **Production addition is CONCURRENTLY.** Procedure-ize even the cleanup of the INVALID index on failure.
- **Periodic inventory** to delete unused indexes (pure debt).

An index is a design act of allocating the trade-off of "speed" vs. "writes, capacity, bloat" to fit the workload. Always confirm with [EXPLAIN](/blog/postgresql-explain-analyze-slow-query-optimization-guide) whether the index you made is genuinely working.

---

### References (PostgreSQL 18 official documentation)

- [11.2. Index Types](https://www.postgresql.org/docs/18/indexes-types.html)
- [11.3. Multicolumn Indexes (column order, skip scan)](https://www.postgresql.org/docs/18/indexes-multicolumn.html)
- [11.9. Index-Only Scans and Covering Indexes](https://www.postgresql.org/docs/18/indexes-index-only-scans.html)
- [11.8. Partial Indexes](https://www.postgresql.org/docs/18/indexes-partial.html)
- [11.7. Indexes on Expressions](https://www.postgresql.org/docs/18/indexes-expressional.html)
- [CREATE INDEX (CONCURRENTLY, fillfactor)](https://www.postgresql.org/docs/18/sql-createindex.html)
