# pgvector vs dedicated vector DBs (Pinecone / Qdrant / Weaviate / Milvus): an in-depth comparison and tech-selection guide

> Which vector-search foundation should you pick? This guide compares pgvector (a PostgreSQL extension) against dedicated vector DBs (Pinecone, Qdrant, Weaviate, Milvus, Chroma) across seven axes: operational load, transactional consistency, scale ceiling, latency, metadata filtering, cost, and lock-in. It also covers a scaling strategy with pgvectorscale (StreamingDiskANN), to help buyers and architects make the decision.

- Published: 2026-06-26
- Author: 友田 陽大
- Tags: PostgreSQL, RAG, Pinecone, Architecture Design, Cost Optimization
- URL: https://tomodahinata.com/en/blog/pgvector-vs-pinecone-qdrant-weaviate-milvus-vector-database-comparison-guide
- Category: Generative AI, LLMs & RAG
- Pillar guide: https://tomodahinata.com/en/blog/vercel-ai-sdk-production-llm-apps-streaming-tools-rag

## Key points

- The question isn't 'which is the most powerful' but 'which fits my constraints.' There are seven decision axes: operational load, consistency with business data, scale ceiling, latency SLA, filtering, cost, and lock-in.
- If you already run Postgres, your vectors are in the millions to tens of millions of rows, and consistency between embeddings and business data matters, pgvector is the first candidate: it adds no new infrastructure.
- For hundreds of millions to billions of vectors, strict latency SLAs at high concurrency, or a need for GPU/specialized indexes, a dedicated DB (Milvus/Pinecone) has the edge. Qdrant is the OSS middle ground, strong at filtered search.
- pgvector's scale ceiling can be pushed up with pgvectorscale (StreamingDiskANN, disk-resident). Treat every performance number as a vendor claim and measure on your own data.
- Pinecone is zero-ops but proprietary, with the largest lock-in. OSS options (Qdrant/Milvus/Weaviate/Chroma/pgvector) have low exit costs. Evaluate lock-in on day one.

---

When you start building RAG or semantic search, the first big tech-selection question is **"which vector-search foundation should I pick?"** Do you get by with `pgvector` (a PostgreSQL extension), or do you bring in a **dedicated vector DB** like Pinecone, Qdrant, Weaviate, or Milvus?

This article is a tech-selection guide for making that call **with axes, not gut feel**. The bottom line up front: **"which is the most powerful" is the wrong question.** The only real question is "**which fits your constraints (scale, consistency requirements, ops capacity, budget)**." This piece organizes each product's characteristics and the decision axes, grounded in official sources, so buyers and architects can make this decision without regret.

> **Rules for this article**: pgvector's specs are based on its **official documentation**. Each dedicated DB's characteristics are based on **each vendor's official sources**, but **all performance numbers (QPS, latency, cost) are treated as "vendor claims"** and are not presented as neutral benchmarks (performance swings wildly with data, dimensionality, recall target, and hardware). Product specs change fast, so **always confirm against the latest official sources before adopting**.

---

## 1. The bottom line first: a decision flowchart

Before the fine-grained comparison: as a rule of thumb, **most projects (roughly 80%) are decided by these few questions**.

```text
Q1. Do you already run PostgreSQL?
   ├─ No  → Adding Postgres "just for vectors" is backwards.
   │        For a prototype consider Chroma; for production, Qdrant / Pinecone.
   └─ Yes ↓

Q2. Do embeddings need to stay consistent with business data (users, orders, documents)?
   │   (= when you delete a document, its vector should be deleted at the same time, etc.)
   ├─ Yes → pgvector is extremely favorable (same transaction, same backup).
   └─ Either way ↓

Q3. How large will the vectors be for the foreseeable future?
   ├─ Millions to tens of millions of rows  → pgvector can compete fine. First candidate.
   ├─ ~50 million to 100M+                   → consider pgvector + pgvectorscale (StreamingDiskANN).
   └─ Hundreds of millions to billions, high concurrency → a dedicated DB
       (Milvus distributed / Pinecone serverless) has the edge.

Q4. Is there a strict latency SLA at high-concurrency QPS, "independent of the app DB"?
   ├─ Yes → lean to a dedicated DB (in-memory HNSW / Pinecone DRN / Qdrant).
   └─ No  → unlikely to be a problem with pgvector.
```

**The shortest conclusion**: if "**you already have Postgres, you're up to tens of millions of rows, and consistency matters**" (the majority case for B2B SaaS and internal tools), **pgvector is the first candidate**, because you can start without adding a single new piece of infrastructure (KISS / YAGNI).
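
The flowchart can also be read as a small function. The following is a minimal sketch, not a definitive rule: the `Project` fields, the `recommend` name, and especially the two numeric cutoffs are assumptions introduced for illustration, since the flowchart gives ranges rather than hard thresholds.

```python
from dataclasses import dataclass

# Hypothetical cutoffs: the flowchart gives ranges, not hard thresholds.
PGVECTORSCALE_FROM = 50_000_000   # "~50 million to 100M+"
DEDICATED_FROM = 200_000_000      # "hundreds of millions to billions"


@dataclass
class Project:
    runs_postgres: bool            # Q1
    needs_consistency: bool        # Q2: embeddings must track business rows
    expected_vectors: int          # Q3: foreseeable vector count
    strict_independent_sla: bool   # Q4: tight latency SLA isolated from the app DB


def recommend(p: Project) -> tuple[str, list[str]]:
    """Return (first candidate, notes) by walking Q1 to Q4."""
    notes: list[str] = []
    # Q1: without Postgres, adding it just for vectors is backwards.
    if not p.runs_postgres:
        return "Qdrant / Pinecone (Chroma for a prototype)", notes
    # Q2: consistency does not end the walk; it only tilts the decision.
    if p.needs_consistency:
        notes.append("same-transaction consistency favors pgvector")
    # Q3 (top band): the largest scale goes to a dedicated DB.
    if p.expected_vectors >= DEDICATED_FROM:
        return "dedicated DB (Milvus distributed / Pinecone serverless)", notes
    # Q4: a strict SLA independent of the app DB also leans dedicated.
    if p.strict_independent_sla:
        return "dedicated DB (latency isolated from the app DB)", notes
    # Q3 (middle band): stay in Postgres, add a disk-resident index.
    if p.expected_vectors >= PGVECTORSCALE_FROM:
        return "pgvector + pgvectorscale", notes
    return "pgvector", notes


choice, notes = recommend(Project(True, True, 5_000_000, False))
```

For the "majority case" described above (Postgres already in place, a few million rows, consistency required), `choice` comes out as `"pgvector"`. Treat the output as a starting candidate to validate, not a verdict.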

---

## 2. The seven decision axes for tech selection

"Which fits" comes into view once you evaluate the following axes with **your own project's weighting**.

### ① Operational simplicity
If you already run Postgres, pgvector adds **zero new infrastructure, monitoring, backups, or on-call surface**. That's the biggest lever. A dedicated DB (self-hosted) means committing to operating "one more system." Managed offerings (Pinecone / Zilliz / Qdrant Cloud, etc.) carry the ops for you, but **add a vendor and a monthly bill**.

### ② Transactional consistency with business data
This is pgvector's decisive strength. You can update **embeddings and business rows in the same ACID transaction**, JOIN in SQL, and keep backups consistent. A dedicated DB is a separate system, so you need a **sync pipeline between business DB ↔ vector DB**, and that's the breeding ground for staleness incidents where "information you thought you deleted still shows up in search." If consistency is a requirement, pgvector has an overwhelming edge.

### ③ Scale ceiling
*This is a direction, not a hard threshold (it's workload-dependent).*
- **pgvector / + pgvectorscale**: millions to tens of millions with room to spare. With pgvectorscale's disk-resident index and tuning, some cases push to **50 million to over 100 million**.
- **Dedicated DBs (Milvus distributed, Pinecone serverless, Qdrant clusters)**: handling **hundreds of millions to billions** via horizontal sharding is their core job. Here, dedicated DBs are clearly ahead.
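
A quick way to place your workload on this spectrum is back-of-the-envelope storage arithmetic. The sketch below relies on the figure in pgvector's documentation that a `vector` value takes `4 * dimensions + 8` bytes; the 1536-dimension example is an assumption, and index size, row overhead, and replicas are deliberately not modeled.

```python
def raw_vector_bytes(num_vectors: int, dimensions: int) -> int:
    """Bytes for the vector values alone (4 bytes per dimension + 8 per vector)."""
    return num_vectors * (4 * dimensions + 8)


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


# Hypothetical workload: 1536-dimensional embeddings at three scales.
for n in (10_000_000, 100_000_000, 1_000_000_000):
    print(f"{n:>13,} vectors: {to_gib(raw_vector_bytes(n, 1536)):,.1f} GiB raw")
```

Each 10× step in row count is a 10× step in raw bytes before any index is built, which is why a fully in-memory index stops being comfortable on a single node long before "billions."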

### ④ Latency SLA
If you must hold **tight p95/p99** at high-concurrency QPS, and do so **independently of the app DB's load**, a dedicated engine (in-memory HNSW, Pinecone DRN, Qdrant, etc.) has the edge. pgvector is plenty competitive at mid scale, but keep in mind that **the same instance also handles OLTP**.

### ⑤ Metadata filtering capability
This is where products differ. How fast can you run "semantic search over only this tenant's published documents"? **Qdrant (filterable HNSW)** and **pgvectorscale (Filtered/StreamingDiskANN)** apply filters *during* the ANN search (a pre-filter or streaming filter) rather than after it, which avoids the "not enough results once filtered" problem. Plain pgvector has **the most expressive power** through arbitrary SQL `WHERE` and JOINs, but how well highly selective filters perform depends on index design. If complex, high-cardinality filters are central, weigh Qdrant / pgvectorscale heavily.
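
The "not enough results once filtered" problem is easy to reproduce. The sketch below is an illustration only: a brute-force sort stands in for the ANN index, the data is made up, and neither function is Qdrant's or pgvectorscale's actual implementation.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def post_filter_search(rows, query, k, predicate):
    """Take the k nearest rows first, then drop the ones that fail the filter."""
    nearest = sorted(rows, key=lambda r: squared_distance(r["vec"], query))[:k]
    return [r["id"] for r in nearest if predicate(r)]


def streaming_filter_search(rows, query, k, predicate):
    """Walk candidates in distance order and keep going until k pass the filter."""
    results = []
    for r in sorted(rows, key=lambda r: squared_distance(r["vec"], query)):
        if predicate(r):
            results.append(r["id"])
            if len(results) == k:
                break
    return results


# Made-up data: the vectors closest to the query belong to another tenant.
rows = [
    {"id": 1, "tenant": "b", "vec": (0.0, 0.1)},
    {"id": 2, "tenant": "b", "vec": (0.1, 0.0)},
    {"id": 3, "tenant": "a", "vec": (0.2, 0.2)},
    {"id": 4, "tenant": "b", "vec": (0.3, 0.3)},
    {"id": 5, "tenant": "a", "vec": (0.9, 0.9)},
    {"id": 6, "tenant": "a", "vec": (1.0, 1.0)},
]
only_tenant_a = lambda r: r["tenant"] == "a"

post = post_filter_search(rows, (0.0, 0.0), 3, only_tenant_a)
streamed = streaming_filter_search(rows, (0.0, 0.0), 3, only_tenant_a)
```

Asked for 3 results, the post-filter variant returns only `[3]` because two of its top-3 belonged to tenant `b`, while the streaming variant keeps fetching and returns `[3, 5, 6]`. The more selective the filter, the wider this gap gets.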

### ⑥ Cost model
- **pgvector**: the **marginal cost** inside your existing Postgres (no new invoice; you pay in RAM/CPU/disk and DBA effort).
- **Managed dedicated DBs**: a **predictable monthly/usage fee**, but it gets expensive at scale, and egress fees and lock-in add to the real cost.
- **Self-hosted OSS (Qdrant/Milvus/Weaviate/Chroma)**: zero license fee; you pay in infrastructure + ops labor.

### ⑦ Migration cost / lock-in
**Pinecone is proprietary (closed-source)**, and moving hundreds of millions of vectors is non-trivial, so it carries **the largest exit cost**. By contrast, **pgvector (PostgreSQL License), Qdrant/Milvus/Chroma (Apache 2.0), and Weaviate (BSD-3)** are OSS, so self-hosting and migration remain possible. Lock-in is an axis you should evaluate **on day one**.
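
One way to apply "your own project's weighting" to the seven axes is a plain weighted score. This is a minimal sketch: the axis keys, the 1-to-5 scale, and every number below are placeholders I made up for illustration, not measurements or ratings of any real product.

```python
AXES = ["ops", "consistency", "scale", "latency", "filtering", "cost", "lock_in"]


def weighted_score(weights: dict, scores: dict) -> float:
    """Weighted average over the seven axes (higher is better)."""
    total_weight = sum(weights[a] for a in AXES)
    return sum(weights[a] * scores[a] for a in AXES) / total_weight


def rank(weights: dict, candidates: dict) -> list:
    """Candidate names, best fit first."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(weights, candidates[name]),
        reverse=True,
    )


# Placeholder project: consistency matters three times as much as anything else.
weights = {a: 1 for a in AXES}
weights["consistency"] = 3

# Placeholder scores on a 1-5 scale; fill these in from your own evaluation.
candidates = {
    "candidate_a": {**{a: 3 for a in AXES}, "consistency": 5},
    "candidate_b": {**{a: 3 for a in AXES}, "scale": 5},
}

ranking = rank(weights, candidates)
```

The value of the exercise is less the final number than being forced to write the weights down: two teams looking at the same products will legitimately rank them differently.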

---

## 3. Comparison table: pgvector and the major dedicated vector DBs

Each product's **profile** on one page. Read it as a difference in "design philosophy and arena," not in performance superiority.

| Product | Deployment form | Main indexes | License | Standout strength | Main trade-off | Sweet-spot scale / target |
| --- | --- | --- | --- | --- | --- | --- |
| **pgvector** | Postgres extension (anywhere) | HNSW / IVFFlat | PostgreSQL License | Consolidates into existing Postgres; **same transaction as business data**; plain SQL | Inferior to dedicated DBs at very large scale / high-concurrency SLA | Millions to tens of millions (around 100M with pgvectorscale). Teams already operating Postgres |
| **Pinecone** | Managed dedicated (cloud only) | Proprietary serverless (storage/compute separation) | Proprietary (non-OSS) | **Zero-ops**; scales to billions serverlessly | **Largest lock-in**; no self-hosting; internals undisclosed | Want zero ops; large scale; SaaS acceptable |
| **Qdrant** | OSS self-host + Cloud | Custom HNSW implementation (**filterable**, written in Rust) | Apache 2.0 | **Best-in-class filtered ANN**; high efficiency; no license traps | Horizontal scaling at large scale is ops work | Filter-heavy RAG/search/recommendation. OSS-minded |
| **Weaviate** | OSS self-host + Cloud | HNSW / flat / dynamic | BSD-3-Clause | **Integrated object + vector**; built-in embedding/hybrid-search modules | Heavier and more opinionated than pure ANN | Teams who want an integrated "AI-native DB" all in one |
| **Milvus** | OSS self-host (Lite/standalone/distributed) + Zilliz Cloud | **Widest selection** (HNSW/IVF/DiskANN/SCANN/GPU) | Apache 2.0 | **Largest scale**; widest index choices; GPU support | Distributed mode is heavy to operate (K8s, many components) | Hundreds of millions to billions, true large scale. Ops-mature teams |
| **Chroma** | Embedded/standalone/distributed + Cloud | HNSW-based | Apache 2.0 | **Fastest to prototype** (`import chromadb` and go) | Shorter track record at very large scale | PoC, small-to-mid RAG, local development |

> The "deployment form," "license," and "indexes" columns are **facts** sourced from each vendor's official material. "Strength/trade-off" is a summary of design philosophy; **confirm the superiority on your workload by measuring**.

---

## 4. pgvector's "scale ceiling" is not fixed

The assumption that "pgvector is for small scale only" is outdated in 2026. The ceiling is pushed up by **the Postgres ecosystem's extensions**.

### pgvectorscale (Timescale / TigerData)
It adds the following to plain pgvector (**OSS under the PostgreSQL License**).

- **StreamingDiskANN index**: derived from Microsoft's DiskANN. Because it puts **part of the index on disk**, it **scales more cost-effectively as the number of vectors grows** than HNSW's fully in-memory approach.
- **Label-based / streaming filtered search**: filtering that keeps fetching until enough results are gathered, avoiding the "not enough results once filtered" problem.
- **Statistical Binary Quantization**: an improved version of standard BQ.
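
For readers new to binary quantization, here is a minimal sketch of *standard* BQ with a fixed zero threshold. It is an assumption-level illustration of the baseline idea only; pgvectorscale's statistical variant is not reproduced here, and the function names are hypothetical.

```python
def binary_quantize(vector) -> int:
    """Keep only the sign of each dimension: bit i is 1 when vector[i] > 0."""
    bits = 0
    for i, x in enumerate(vector):
        if x > 0:
            bits |= 1 << i
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits, used as a cheap proxy for vector distance."""
    return bin(a ^ b).count("1")


v1 = [0.3, -0.1, 0.8, -0.5]
v2 = [0.2, 0.4, 0.9, -0.7]    # points in a similar direction to v1
v3 = [-0.3, 0.1, -0.8, 0.5]   # points in the opposite direction

q1, q2, q3 = (binary_quantize(v) for v in (v1, v2, v3))
near = hamming_distance(q1, q2)
far = hamming_distance(q1, q3)
```

Each float32 dimension (32 bits) shrinks to 1 bit, a 32× reduction, at the cost of precision. That lost precision is why quantized search is typically paired with a recall check on your own data.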

> **Read performance as a vendor claim**: Timescale publishes figures like "50 million Cohere embeddings, 99% recall, with 28× lower p95 latency, 16× higher throughput, and 75% lower cost vs Pinecone's storage-optimized (s1) (self-hosted on EC2)," but this is a **first-party comparison on their own conditions and a single dataset**. Take it only in the form **"Timescale reports this,"** and always verify on your own data.

### VectorChord and others
**VectorChord** (TensorChord, the successor to pgvecto.rs) is a Postgres extension that aims for "disk-friendly, low-cost large scale" with **IVF + RaBitQ quantization**. It makes claims like "index builds 100× faster than pgvector," but treat this too as a **vendor claim**.

**The point**: choosing pgvector does not mean being bound to small scale forever. You can **start with plain pgvector, then move to pgvectorscale / VectorChord when scale arrives, all while staying inside Postgres**. That is the reassurance on the scaling side.
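
Since every performance figure in this section is a vendor claim, verifying on your own data comes down to two numbers per candidate: recall against exact search, and tail latency. The helpers below are a minimal sketch; the function names are hypothetical, the sample values are made up, and the percentile uses the simple nearest-rank definition.

```python
import math


def recall_at_k(approx_ids, exact_ids) -> float:
    """Share of the exact (brute-force) top-k that the ANN index also returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile, e.g. pct=95 for p95."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up example: one query's results, and ten latency samples in ms.
recall = recall_at_k(approx_ids=[1, 2, 3, 9], exact_ids=[1, 2, 3, 4])
latencies_ms = [10, 12, 11, 50, 13, 12, 11, 10, 14, 12]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

Always compare candidates at the same recall target: a latency figure quoted at 90% recall and another quoted at 99% recall are not comparable. Note also how a single slow query dominates p95 in the sample above, which is exactly what a median would hide.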

---

## 5. Honestly: cases where pgvector is "not a fit"

I recommend consolidating into pgvector for many projects, but **it's not a silver bullet**. In the following situations, I seriously consider a dedicated DB.

- **Hundreds of millions to billions of vectors × sub-millisecond SLA × high concurrency**: the arena of **Milvus (distributed) / Pinecone (serverless)**, where horizontal sharding is a first-class feature.
- **GPU inference or specialized indexes (PQ/SCANN/DiskANN tuning) are essential**: **Milvus**'s breadth of indexes pays off.
- **You don't run Postgres in the first place**: the consolidation benefit (unified ops) doesn't materialize. **Adding Postgres just for vectors is backwards**, and in that case **Qdrant** (OSS, strong filtering) or **Pinecone** (zero-ops) is more natural.
- **High-cardinality, complex filters are at the heart of performance**: weigh **Qdrant** or **pgvectorscale** heavily.

The most honest thing in tech selection is to **look squarely at "where your product is heading."** Splitting data across two stores today for billions of vectors that never arrive is a failure; so is ignoring large scale that is clearly coming.

---

## 6. Summary: a selection cheat sheet

- **How to frame the question**: not "which is the most powerful" but "**which fits my constraints**." The axes are operational load, consistency, scale ceiling, latency, filtering, cost, and lock-in.
- **Already on Postgres + up to tens of millions of rows + consistency matters** → **pgvector** (zero added infra, same transaction). Most B2B SaaS is here.
- **Hundreds of millions to billions + high-concurrency SLA + GPU/special index** → **Milvus / Pinecone**.
- **Filtered search is the star, OSS-minded** → **Qdrant**. **Integrated AI-native DB** → **Weaviate**. **Fastest PoC** → **Chroma**.
- **pgvector's scale** can be pushed up with **pgvectorscale (StreamingDiskANN)**. **Performance numbers are vendor claims, so measure on your own data**.
- **Lock-in**: Pinecone (proprietary) is the largest; the various OSS options have low exit costs. **Evaluate on day one**.

In [the generative-AI voice chatbot](/case-studies/ai-voice-chatbot), I made the call to **consolidate business data and embeddings into PostgreSQL + pgvector rather than add a dedicated vector DB**. That's because I prioritized the operational simplicity of handling semantic search over product documents (PDF/Excel/image/video) in the **same DB and same transaction** as the business data. On the other hand, had the requirement been billions of vectors or an independent low-latency SLA, I'd propose a dedicated DB without hesitation — because **selection depends on requirements**.

**"Is pgvector enough, or should you bring in a dedicated vector DB?" — let's determine that first move together, from your scale, consistency requirements, budget, and team setup.** Feel free to reach out even at the requirements-gathering stage. When you move into implementation, start with [getting started with pgvector](/blog/pgvector-getting-started-installation-docker-supabase-rds-neon-guide); for serious RAG, [production RAG design](/blog/pgvector-postgres-production-rag-hybrid-search); for speed/cost optimization, the [complete tuning guide](/blog/pgvector-index-tuning-hnsw-ivfflat-quantization-iterative-scan-guide).

---

### References (official / primary sources)

- [pgvector (GitHub)](https://github.com/pgvector/pgvector) (PostgreSQL License, HNSW/IVFFlat) / [pgvectorscale (Timescale)](https://github.com/timescale/pgvectorscale) (StreamingDiskANN, label filtering; performance is a vendor claim)
- [Pinecone](https://www.pinecone.io/) / [Qdrant (GitHub, Apache 2.0)](https://github.com/qdrant/qdrant) / [Weaviate (GitHub, BSD-3)](https://github.com/weaviate/weaviate) / [Milvus (GitHub, Apache 2.0)](https://github.com/milvus-io/milvus) / [Chroma (GitHub, Apache 2.0)](https://github.com/chroma-core/chroma)
- Comparative performance/cost claims are based on each vendor's published figures (e.g. [Timescale's pgvector vs Pinecone article](https://www.tigerdata.com/blog/pgvector-vs-pinecone)); note these are **not neutral benchmarks**.
