# Vercel observability guide: trace production with Observability, Speed Insights, Web Analytics, Log Drains, and OTel

> An observability guide faithful to Vercel's official docs. It explains Observability (Insights for functions/edge/middleware/external APIs/ISR, etc.), Speed Insights (real-user CWV), Web Analytics (privacy-conscious), Runtime Logs, Log Drains, Monitoring/Notebooks, and OpenTelemetry integration, with @vercel/speed-insights and @vercel/analytics implementations and the 'measure → optimize' workflow.

- Published: 2026-06-28
- Author: 友田 陽大
- Tags: Vercel, Observability, Next.js, Performance, Cost optimization, TypeScript, SRE
- URL: https://tomodahinata.com/en/blog/vercel-observability-monitoring-speed-insights-log-drains-guide
- Category: Vercel in production
- Pillar guide: https://tomodahinata.com/en/blog/vercel-production-platform-guide

## Key points

- Vercel's observability is multi-layered: Observability (infra/app Insights), Speed Insights (real-user CWV), Web Analytics (cookieless access analytics), Runtime Logs/Log Drains (logs), Monitoring/Notebooks (dashboards/queries), and OpenTelemetry (external-APM integration).
- Observability is free on all plans, visualizing Functions, External APIs, Edge Requests, Middleware, ISR, Image Optimization, AI Gateway, etc. by route. Observability Plus (Pro/Ent) adds latency, per-path breakdowns, and long-term retention.
- For Speed Insights and Web Analytics, install `@vercel/speed-insights` and `@vercel/analytics` and place their components. This measures real users' LCP/INP/CLS and gives you privacy-conscious access analytics without relying on cookies. List both packages under `dependencies` in package.json.
- Investigate failures in order: identify the heavy route in Observability, then check the stack trace in Runtime Logs. Don't guess; trace from primary sources to the cause. Cost investigation works the same way: identify heavy routes by Invocation/Active CPU in Insights.
- Integrate with external APM (Datadog/Grafana, etc.) via Log Drains and OpenTelemetry. Propagate the deployment ID and you can run Rolling Releases canary comparisons against your own metrics.

---

"Production is slow / throwing errors. But I can't tell which route is the cause" — without observability, both failure investigation and cost optimization become **guesswork.** Vercel comes with multi-layered observability as standard, so you can **mechanically trace "from symptom to cause."**

This article summarizes each layer of observability and the "measure → optimize" workflow, faithful to the official specs of [Vercel Observability](https://vercel.com/docs/observability). For the full picture, see the [Vercel production-operations guide](/blog/vercel-production-platform-guide).

---

## The map of observability: five layers

| Layer | What it tells you | Enablement |
|---|---|---|
| **Observability** (Insights) | Per-route metrics for functions/edge/middleware/external APIs/ISR/image optimization, etc. | Standard (free on all plans) |
| **Speed Insights** | Real users' CWV (LCP/INP/CLS) | `@vercel/speed-insights` |
| **Web Analytics** | Access analytics (cookieless, privacy-conscious) | `@vercel/analytics` |
| **Runtime Logs / Log Drains** | Function logs / external forwarding | Standard / Log Drains |
| **Monitoring / Notebooks / OTel** | Dashboards, alerts, external-APM integration | Observability Plus / OpenTelemetry |

---

## Observability: what's happening per route

The Observability tab visualizes requests **along the app's structure.** It is free on all plans; **Observability Plus** (Pro/Ent) adds latency, per-path breakdowns, and long-term retention.

Insights visualized (a subset):

- **Vercel Functions**: invocation count, error rate, execution time (per route)
- **External APIs**: latency and failures of external API calls
- **Edge Requests / Middleware**: edge and middleware behavior
- **ISR / Image Optimization / Fast Data Transfer / AI Gateway / Queues / Blob**: usage of each feature

Events Vercel tracks: Edge Requests, Function Invocations, External API Requests, Routing Middleware Invocations, AI Gateway Requests. **One request can become multiple events** (e.g. 1 Edge Request + 1 Middleware + 1 Function + 2 External API = 5 events) — this is also the cost breakdown ([cost optimization](/blog/vercel-cost-active-cpu-pricing-optimization-guide)).
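
To make the event arithmetic concrete, here is a minimal sketch that totals the events produced by a single request. The `RequestEvents` shape and the `countEvents` helper are hypothetical names chosen for illustration; they are not part of any Vercel API.

```ts
// Hypothetical shape: how many events of each tracked type one request produced.
interface RequestEvents {
  edgeRequests: number;
  middlewareInvocations: number;
  functionInvocations: number;
  externalApiRequests: number;
  aiGatewayRequests: number;
}

// Sum every event type to get the total number of events for the request.
function countEvents(e: RequestEvents): number {
  return (
    e.edgeRequests +
    e.middlewareInvocations +
    e.functionInvocations +
    e.externalApiRequests +
    e.aiGatewayRequests
  );
}

// The example from the text: 1 Edge Request + 1 Middleware + 1 Function + 2 External API.
const example: RequestEvents = {
  edgeRequests: 1,
  middlewareInvocations: 1,
  functionInvocations: 1,
  externalApiRequests: 2,
  aiGatewayRequests: 0,
};

console.log(countEvents(example)); // 5
```

Counting this way per route gives a rough sense of how many events a single user action fans out into.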

### The failure-investigation workflow (don't guess)

```
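① In Observability, select the target feature (e.g. Functions) and the time range
  ↓
② Find the spike in the Error Rate / Duration graph and zoom in
  ↓
③ Sort the route list by error rate or latency to pinpoint the culprit route
  ↓
④ Click the route → go straight to Runtime Logs → check the stack trace
  ↓
⑤ Fix the cause (see [Troubleshooting](/blog/vercel-troubleshooting-build-function-errors-timeout-guide))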
```

Tracing from primary sources, in the order "symptom (error rate, latency) → route → logs → cause", is the standard SRE approach ([principles of observability/SRE](/blog/opentelemetry-observability-production-tracing-metrics-logs)).
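
Step ③ of the workflow happens in the dashboard, but the same ranking logic can be sketched in code if you export per-route numbers. The `RouteStats` shape, the sample values, and the `rankSuspects` helper below are assumptions made for illustration, not a Vercel export format.

```ts
// Hypothetical per-route stats, e.g. noted down from the Observability tab.
interface RouteStats {
  route: string;
  invocations: number;
  errors: number;
  p75DurationMs: number;
}

// Error rate as a fraction in [0, 1]; routes with no traffic count as 0.
function errorRate(r: RouteStats): number {
  return r.invocations === 0 ? 0 : r.errors / r.invocations;
}

// Return a copy sorted worst-first by error rate, breaking ties by duration.
function rankSuspects(routes: RouteStats[]): RouteStats[] {
  return [...routes].sort(
    (a, b) => errorRate(b) - errorRate(a) || b.p75DurationMs - a.p75DurationMs
  );
}

const routes: RouteStats[] = [
  { route: "/api/orders", invocations: 1000, errors: 80, p75DurationMs: 420 },
  { route: "/api/health", invocations: 5000, errors: 0, p75DurationMs: 15 },
  { route: "/api/search", invocations: 2000, errors: 20, p75DurationMs: 900 },
];

console.log(rankSuspects(routes)[0].route); // "/api/orders"
```

Ranking by rate rather than by raw error count keeps a high-traffic but healthy route from hiding a low-traffic route that fails often.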

---

## Speed Insights: real users' Core Web Vitals

Speed Insights measures **real users' perceived experience** (LCP/INP/CLS) rather than relying on synthetic measurement (Lighthouse). Just install `@vercel/speed-insights` and place the component.

```tsx
// app/layout.tsx (Next.js App Router)
import { SpeedInsights } from "@vercel/speed-insights/next";

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="ja">
      <body>
        {children}
        <SpeedInsights />
      </body>
    </html>
  );
}
```

CWV are both a score and an **SEO ranking factor.** Look at the real-user values and improve slow pages with [CWV optimization](/blog/core-web-vitals-nextjs-inp-lcp-cls-optimization-guide) (INP/LCP/CLS).
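
When reading real-user values, it helps to know which bucket a number falls into. The sketch below classifies a value using the thresholds Google publishes for Core Web Vitals (LCP 2.5 s / 4.0 s, INP 200 ms / 500 ms, CLS 0.1 / 0.25). These thresholds are brought in from outside this article, and the `rate` helper is a hypothetical name; how the Speed Insights dashboard itself scores pages is not covered here.

```ts
type Metric = "LCP" | "INP" | "CLS";
type Rating = "good" | "needs-improvement" | "poor";

// [upper bound for "good", upper bound for "needs-improvement"].
// LCP and INP are in milliseconds; CLS is unitless.
const THRESHOLDS: Record<Metric, [number, number]> = {
  LCP: [2500, 4000],
  INP: [200, 500],
  CLS: [0.1, 0.25],
};

// Classify a single metric value against the thresholds above.
function rate(metric: Metric, value: number): Rating {
  const [good, needsImprovement] = THRESHOLDS[metric];
  if (value <= good) return "good";
  if (value <= needsImprovement) return "needs-improvement";
  return "poor";
}

console.log(rate("LCP", 2100)); // "good"
console.log(rate("INP", 350)); // "needs-improvement"
console.log(rate("CLS", 0.31)); // "poor"
```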

---

## Web Analytics: cookieless access analytics

With `@vercel/analytics`, you get **privacy-conscious access analytics without relying on cookies.** Suited to GDPR-conscious sites.

```tsx
// app/layout.tsx
import { Analytics } from "@vercel/analytics/next";

export default function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html lang="ja">
      <body>
        {children}
        <Analytics />
      </body>
    </html>
  );
}
```

> **A note on both packages**: always put `@vercel/speed-insights` and `@vercel/analytics` in **package.json's dependencies.** Installing them globally and referencing them causes build errors, especially in a monorepo ([troubleshooting](/blog/vercel-troubleshooting-build-function-errors-timeout-guide)).
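
One way to guard against this is a small check over the parsed package.json, for example in CI. The following is a minimal sketch: `PackageJson` is a trimmed-down type, `missingFromDependencies` is a hypothetical helper, and `"*"` stands in for whatever version range you actually use.

```ts
// Minimal view of a parsed package.json; only the fields this check reads.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

const REQUIRED = ["@vercel/speed-insights", "@vercel/analytics"];

// Return the required packages that are not listed under "dependencies".
function missingFromDependencies(pkg: PackageJson): string[] {
  const deps = pkg.dependencies ?? {};
  return REQUIRED.filter((name) => !(name in deps));
}

// Hypothetical manifest: one package is listed only under devDependencies.
const pkg: PackageJson = {
  dependencies: { "@vercel/analytics": "*" },
  devDependencies: { "@vercel/speed-insights": "*" },
};

console.log(missingFromDependencies(pkg).join(", ")); // @vercel/speed-insights
```

In a monorepo, run a check like this against the package.json of the app that imports the components, not only the workspace root.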

---

## Runtime Logs and Log Drains

- **Runtime Logs**: check functions' and middleware's `console.*` output in the dashboard. The primary source for failure investigation.
- **Log Drains**: forward logs to **Datadog, Grafana, your own collection platform**, etc. For long-term retention, cross-cutting analysis, and integration with existing APM.

> **Structured logs**: rather than `console.log("text")`, emit **JSON structured logs** that include a correlation ID, route, and user (PII-masked). This makes searching and aggregating at the Log Drains destination far easier. Design the logger so that raw PII is never emitted.

```ts
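// Minimal structured log (traceable by correlation ID)
function log(level: string, msg: string, ctx: Record<string, unknown>) {
  console.log(JSON.stringify({ level, msg, ts: Date.now(), ...ctx }));
}
// Illustrative values; in a real handler these come from the request.
const correlationId = "req-123";
const orderId = "order-456";
log("info", "order.created", { correlationId, orderId, region: process.env.VERCEL_REGION });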
```
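
The note on PII masking can be sketched in the same style. This is one possible approach under stated assumptions, not a prescribed format: `maskEmail` and `buildLogLine` are hypothetical helpers, and the masking rule (keep the first character and the domain) is only an example policy.

```ts
// Hypothetical masking rule: keep the first character and the domain, hide the rest.
function maskEmail(email: string): string {
  const at = email.indexOf("@");
  if (at <= 0) return "***";
  return `${email.charAt(0)}***${email.slice(at)}`;
}

// Build one JSON log line; the raw email never reaches the output.
function buildLogLine(
  level: "info" | "warn" | "error",
  msg: string,
  ctx: { correlationId: string; route: string; userEmail?: string }
): string {
  const { userEmail, ...rest } = ctx;
  return JSON.stringify({
    level,
    msg,
    ts: Date.now(),
    ...rest,
    ...(userEmail ? { user: maskEmail(userEmail) } : {}),
  });
}

const line = buildLogLine("info", "order.created", {
  correlationId: "req-123",
  route: "/api/orders",
  userEmail: "alice@example.com",
});
console.log(line);
```

Masking inside the logger, rather than at each call site, means a forgotten call site cannot leak the raw value to the Log Drains destination.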

---

## Monitoring, Notebooks, OpenTelemetry

- **Monitoring**: build dashboards and alerts on top of metrics.
- **Notebooks**: save and organize Observability queries.
- **OpenTelemetry**: with OTel-collector integration, send traces and metrics to external APM. Introduce **distributed tracing** and you can correlate the chain of Vercel function → external API → DB.

> **Rolling Releases × external metrics**: when you want to make canary decisions in your own APM, **propagate the deployment ID to the external observability system.** This lets you distinguish canary and base metrics, so you can make the promotion decision for [Rolling Releases](/blog/vercel-deployments-cicd-rollback-rolling-releases-guide) with your own numbers.
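
A minimal sketch of that idea follows. It assumes the deployment ID is available through the `VERCEL_DEPLOYMENT_ID` system environment variable; the `Sample` shape, both helpers, and the `dpl_canary` / `dpl_base` values are hypothetical and only illustrate the grouping.

```ts
// Hypothetical metric sample as it might arrive in your own APM.
interface Sample {
  deploymentId: string;
  isError: boolean;
}

// Attach the deployment ID to a tag set. `env` defaults to an empty object so the
// sketch runs anywhere; on Vercel you would pass process.env.
function withDeployment(
  tags: Record<string, string>,
  env: Record<string, string | undefined> = {}
): Record<string, string> {
  return { ...tags, deploymentId: env.VERCEL_DEPLOYMENT_ID ?? "unknown" };
}

// Error rate per deployment ID, so canary and base can be compared side by side.
function errorRateByDeployment(samples: Sample[]): Map<string, number> {
  const totals = new Map<string, { n: number; errors: number }>();
  for (const s of samples) {
    const t = totals.get(s.deploymentId) ?? { n: 0, errors: 0 };
    t.n += 1;
    if (s.isError) t.errors += 1;
    totals.set(s.deploymentId, t);
  }
  const rates = new Map<string, number>();
  totals.forEach((t, id) => rates.set(id, t.errors / t.n));
  return rates;
}

const samples: Sample[] = [
  { deploymentId: "dpl_canary", isError: true },
  { deploymentId: "dpl_canary", isError: false },
  { deploymentId: "dpl_base", isError: true },
  { deploymentId: "dpl_base", isError: false },
  { deploymentId: "dpl_base", isError: false },
  { deploymentId: "dpl_base", isError: false },
];

const rates = errorRateByDeployment(samples);
console.log(rates.get("dpl_canary"), rates.get("dpl_base")); // 0.5 0.25
```

With the ID on every log line and metric, the promotion decision becomes a comparison between two groups instead of a reading of one blended graph.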

---

## Run "measure → optimize"

Observability isn't "install and done"; it's **the starting point of an improvement loop.**

| Purpose | Where to look | Next action |
|---|---|---|
| **Slow** | Speed Insights (CWV) / Functions Duration | [CWV optimization](/blog/core-web-vitals-nextjs-inp-lcp-cls-optimization-guide), [caching](/blog/vercel-caching-isr-cache-components-ppr-guide) |
| **Errors** | Error Rate → Runtime Logs | [Troubleshooting](/blog/vercel-troubleshooting-build-function-errors-timeout-guide) |
| **Expensive** | Functions Insights (Invocation/Active CPU) | [Cost optimization](/blog/vercel-cost-active-cpu-pricing-optimization-guide) |
| **Bots/attacks** | Firewall tab | [WAF/BotID](/blog/vercel-firewall-waf-botid-ddos-security-guide) |

---

## Production checklist (observability)

- [ ] Monitor functions, error rate, and latency per route with **Observability**
- [ ] Measure real-user CWV with **Speed Insights** (directly tied to SEO)
- [ ] Cookieless access analytics with **Web Analytics**
- [ ] Put `@vercel/speed-insights` / `@vercel/analytics` in **package.json dependencies**
- [ ] Emit **structured logs (JSON, correlation ID, PII masking)**
- [ ] Integrate with external APM via **Log Drains / OpenTelemetry**
- [ ] Detect budget overruns and failures with **Spend Management** and alerts
- [ ] Run "measure → optimize" as a **loop**

---

## Summary

Observability isn't "emitting logs"; it's being able to **trace stalled processing, slow pages, and expensive routes at a glance.**

1. Identify symptoms (errors/latency/cost) per route with **Observability**
2. Measure real users' experience and behavior with **Speed Insights / Web Analytics**
3. Check primary sources and integrate externally with **Runtime Logs / Log Drains**
4. Add distributed tracing with **OpenTelemetry**, and compare canaries by deployment ID
5. Put it all into a "**measure → optimize**" loop

I take on observability design projects (structured logs, SLOs, alerts, external-APM integration, cost monitoring).

> This article is based on the official [Vercel Observability](https://vercel.com/docs/observability) documentation (as of June 2026). Features and plan conditions change, so check the official docs for the latest details before adopting them in production.
