# Echo observability: implementing distributed tracing, metrics, and slog correlation with custom middleware using OpenTelemetry

> A guide to implementing Go Echo (v5) observability at production quality with OpenTelemetry. Given that otelecho is deprecated and assumes v4, it explains with real code: a version-independent custom trace middleware, trace propagation via context (DB and outbound HTTP), metrics such as the request-duration histogram, trace_id correlation into slog logs, and OTLP export.

- Published: 2026-06-28
- Author: 友田 陽大
- Tags: Go, Echo, Observability, Architecture design, Type safety, SRE
- URL: https://tomodahinata.com/en/blog/go-echo-opentelemetry-distributed-tracing-metrics-observability-guide
- Category: Go & Echo in production
- Pillar guide: https://tomodahinata.com/en/blog/go-echo-framework-production-guide

## Key points

- The purpose of observability is to 'trace a stalled process at a glance.' Correlate traces, metrics, and logs by trace_id so one request can be followed end-to-end.
- otelecho is deprecated and assumes Echo v4. On v5, custom middleware that uses the OpenTelemetry SDK directly is more robust because it is version-independent (ETC, ease of change).
- For traces, propagation is everything. Extract context from the incoming header → start a span → c.SetRequest(c.Request().WithContext(ctx)) so it threads through downstream (DB / external API).
- Metrics follow RED (Rate/Errors/Duration). Keep a minimal setup: a Histogram of request duration and an UpDownCounter of in-flight requests.
- Putting trace_id/span_id on slog lets you jump from one log line to the trace. The key is correlating RequestLogger(slog) with OTel.

---

When someone says "the API is slow in production," can you instantly answer **which process, by how much, and why** it is slow? Operating by `grep`-ing logs and guessing wastes time on every incident. Observability means reaching a state where you can **trace a stalled or slow process at a glance, with data rather than guesses.**

This article is the observability chapter of the [Go Echo production-operations guide](/blog/go-echo-framework-production-guide). We collect traces and metrics with OpenTelemetry (OTel) and correlate them with [slog structured logs](/blog/go-echo-middleware-cors-csrf-jwt-rate-limit-security-guide#4-requestloggerslogで構造化アクセスログ). Platform-wide observability design is left to the [OpenTelemetry practical guide](/blog/opentelemetry-observability-production-tracing-metrics-logs); here we focus on an implementation that **works right now on Echo v5.**

> **Rules for this article**: Echo's API is based on the **official documentation (v5, as of June 2026).** **Important**: the once-standard `otelecho` (`go.opentelemetry.io/contrib/.../labstack/echo/otelecho`) is **deprecated** and **assumes Echo v4.** This article adopts **custom middleware that uses the OTel SDK directly**, independent of it ([the reasoning is in chapter 1](#1-why-custom-middleware-instead-of-otelecho)). The OTel SDK changes frequently, so confirm the latest API in the official docs.

---

## 0. The three pillars: "correlate" traces, metrics, and logs

Observability is built from three signals. The value is in **correlating them, not collecting them separately.**

- **Traces**: a breakdown of "which process, in what order, and how long" one request took. Effective for pinpointing the culprit of latency.
- **Metrics**: aggregate values (request count, error rate, duration distribution). Effective for trends and alerts.
- **Logs**: the detail of individual events. Effective for the context of the cause.

When you **tie these together by `trace_id`**, you get a **single investigative line**: "notice the error rate rising in metrics → identify the slow span in traces for that time window → jump to the logs by that span's `trace_id` and read the cause." This is the goal of the article.

---

## 1. Why custom middleware instead of otelecho

The conventional choice is the `otelecho` middleware, but as of June 2026 it has **two problems.**

1. **Deprecated**: the package itself has been deprecated.
2. **Assumes Echo v4**: in v5 the handler signature changed to `func(c *echo.Context) error`, so instrumentation written for v4 does not work as-is.

OpenTelemetry's **core SDK (`go.opentelemetry.io/otel`) is framework-independent.** Thin instrumentation that you write yourself, **calling the OTel SDK directly from Echo middleware**, is **unaffected by Echo versions, not dragged along by deprecations, and transparent in how it works**, which makes it a robust choice in terms of ETC (ease of change). The code is only a few dozen lines.

---

## 2. Trace-instrumentation middleware: propagation is everything

The crux of distributed tracing is **context propagation.** You **Extract** the parent trace context from the incoming request's headers, start a span, and **always carry that `ctx` downstream (DB / external API).** If that chain is cut anywhere, the trace breaks into disconnected fragments.

```go
import (
	"net/http"

	"github.com/labstack/echo/v5"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/codes"
	"go.opentelemetry.io/otel/propagation"
	semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
	"go.opentelemetry.io/otel/trace"
)

func OTelTracing(service string) echo.MiddlewareFunc {
	tracer := otel.Tracer(service)
	propagator := otel.GetTextMapPropagator()

	return func(next echo.HandlerFunc) echo.HandlerFunc {
		return func(c *echo.Context) error {
			req := c.Request()
			// ① Extract the parent trace context from the incoming headers (W3C traceparent, etc.)
			ctx := propagator.Extract(req.Context(), propagation.HeaderCarrier(req.Header))

			// ② Start a span, named after the route pattern to keep cardinality low
			route := c.Path() // "/users/:id" (the pattern, not the actual value)
			ctx, span := tracer.Start(ctx, req.Method+" "+route,
				trace.WithSpanKind(trace.SpanKindServer),
				trace.WithAttributes(
					semconv.HTTPRequestMethodKey.String(req.Method),
					semconv.HTTPRouteKey.String(route),
				),
			)
			defer span.End()

			// ③ Thread ctx through to everything downstream (handler → DB → external API); the most important step
			c.SetRequest(req.WithContext(ctx))

			err := next(c)

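			// ④ Record the outcome on the span
			status := c.Response().Status
			span.SetAttributes(semconv.HTTPResponseStatusCodeKey.Int(status))
			if err != nil {
				// The central error handler may not have written the status yet,
				// so describe the failure with the error itself.
				span.RecordError(err)
				span.SetStatus(codes.Error, err.Error())
			} else if status >= 500 {
				span.SetStatus(codes.Error, http.StatusText(status))
			}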
			return err
		}
	}
}
```

**Key design points**:

- **The span name is `c.Path()` (the route pattern).** If you name it with an actual value like `/users/42`, every ID produces a distinct span name and **cardinality explodes.** Normalize to `/users/:id`.
- **`c.SetRequest(req.WithContext(ctx))`** is the heart of propagation. Forget it and the span is not attached to `c.Request().Context()` inside the handler, so the child spans of the [DB query](/blog/go-echo-database-postgresql-pgx-sqlc-gorm-transaction-guide#1-最重要contextをハンドラからdbまで貫通させる) and the external API **won't connect to the parent.**
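- Register it so that **`Recover` runs inside it** (that is, before `Recover` in the chain). As written, the error status is set only after `next(c)` returns, so a panic that unwinds straight through this middleware would end the span without being recorded. With `Recover` on the inside, the panic comes back as an error or a 500, is recorded on the span, and still reaches the [centralized error handler](/blog/go-echo-request-binding-validation-error-handling-guide).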
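
Why the `c.SetRequest(...)` step cannot be skipped comes down to how `net/http` works: `Request.WithContext` returns a copy and leaves the original request untouched. The following standard-library-only sketch illustrates this; `ctxKey`, `seenByHandler`, and the `"trace-abc"` value are hypothetical names used for illustration, and the reassignment of `req` merely stands in for Echo's `c.SetRequest`.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ctxKey is a hypothetical context key standing in for the span stored by OTel.
type ctxKey struct{}

// seenByHandler reports the value a downstream handler would find in its
// request context, depending on whether the derived request was stored.
func seenByHandler(storeCopy bool) string {
	req := httptest.NewRequest(http.MethodGet, "/users/42", nil)
	ctx := context.WithValue(req.Context(), ctxKey{}, "trace-abc")

	derived := req.WithContext(ctx) // returns a shallow copy; req itself is unchanged
	if storeCopy {
		req = derived // plays the role of c.SetRequest(derived)
	}

	v, _ := req.Context().Value(ctxKey{}).(string)
	return v
}

func main() {
	fmt.Printf("copy stored:  %q\n", seenByHandler(true))
	fmt.Printf("copy dropped: %q\n", seenByHandler(false))
}
```

Running it prints `"trace-abc"` when the copy is stored and `""` when it is dropped, which is the same silent failure you get when the middleware forgets to hand the new request back to Echo.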

---

## 3. Child spans: thread DB and external HTTP into one line

If you pass down the `ctx` created in the middleware, lower-level operations attach to the parent as **child spans.** This makes a situation like "the API is fast but the DB is slow" **visible.**

```go
import (
	"context"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel"
)

// DB: with pgx, instrument automatically via otelpgx, or start a child span by hand.
func (r *UserRepo) FindByID(ctx context.Context, id string) (*User, error) {
	ctx, span := otel.Tracer("repo").Start(ctx, "UserRepo.FindByID") // child span of the span in ctx
	defer span.End()
	// ... r.pool.Query(ctx, ...) ← passing ctx links the DB span to the parent
}

// Outbound HTTP: otelhttp is framework-independent, so it works unchanged.
var client = &http.Client{Transport: otelhttp.NewTransport(http.DefaultTransport)}
// client.Do(req.WithContext(ctx)) ← traceparent is propagated to the destination automatically
```

> `otelhttp` (outbound HTTP instrumentation) just wraps the `net/http` transport, so it **does not depend on Echo's version.** pgx also has instrumentation libraries like `otelpgx`. "**Use framework-independent instrumentation as-is, and replace tightly framework-coupled instrumentation (otelecho) with your own**" is the robust policy for the transition period.
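
To make "propagates `traceparent` automatically" concrete, here is a rough standard-library-only sketch of what a propagating transport does: it reads IDs from the request context and writes a W3C `traceparent` header (`version-traceid-parentid-flags`) onto the outgoing request. This is not `otelhttp`'s implementation; `spanIDs`, `propagatingTransport`, `captureTransport`, and the ID values are hypothetical and exist only for illustration.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// spanIDs is a hypothetical stand-in for an OpenTelemetry span context.
type spanIDs struct{ TraceID, SpanID string }

type ctxKey struct{}

// propagatingTransport injects a W3C traceparent header built from the request context.
type propagatingTransport struct{ base http.RoundTripper }

func (t propagatingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if s, ok := req.Context().Value(ctxKey{}).(spanIDs); ok {
		req = req.Clone(req.Context()) // a RoundTripper must not mutate the caller's request
		req.Header.Set("traceparent", fmt.Sprintf("00-%s-%s-01", s.TraceID, s.SpanID))
	}
	return t.base.RoundTrip(req)
}

// captureTransport records the outgoing header instead of touching the network.
type captureTransport struct{ got string }

func (c *captureTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	c.got = req.Header.Get("traceparent")
	return &http.Response{StatusCode: http.StatusOK, Body: http.NoBody, Request: req}, nil
}

// demoTraceparent sends one request and returns the header the destination would see.
func demoTraceparent() string {
	ctx := context.WithValue(context.Background(), ctxKey{},
		spanIDs{TraceID: "4bf92f3577b34da6a3ce929d0e0e4736", SpanID: "00f067aa0ba902b7"})

	capture := &captureTransport{}
	client := &http.Client{Transport: propagatingTransport{base: capture}}

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://example.invalid/users", nil)
	if err != nil {
		return ""
	}
	resp, err := client.Do(req)
	if err != nil {
		return ""
	}
	resp.Body.Close()
	return capture.got
}

func main() {
	fmt.Println(demoTraceparent())
}
```

The point of the sketch is the dependency on `req.Context()`: if the outgoing request is built without the handler's `ctx`, there is nothing to inject and the downstream service starts a new, unrelated trace.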

---

## 4. Metrics: RED in a minimal setup

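Build metrics around **RED (Rate, Errors, Duration).** The minimal setup is a **Histogram of request duration** and an **UpDownCounter of in-flight requests.** The histogram alone covers all three letters: its count gives the rate, its status-code attribute separates out the errors, and its buckets give the duration distribution. Implement them as a sibling middleware next to the tracing one.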

```go
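import (
	"time"

	"github.com/labstack/echo/v5"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)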

func OTelMetrics(service string) echo.MiddlewareFunc {
	meter := otel.Meter(service)
	duration, _ := meter.Float64Histogram("http.server.request.duration",
		metric.WithUnit("s"), metric.WithDescription("HTTP request duration"))
	inflight, _ := meter.Int64UpDownCounter("http.server.active_requests")

	return func(next echo.HandlerFunc) echo.HandlerFunc {
		return func(c *echo.Context) error {
			ctx := c.Request().Context()
			start := time.Now()
			inflight.Add(ctx, 1)
			defer inflight.Add(ctx, -1)

			err := next(c)

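			// Limit attributes to route pattern, method, and status code (cardinality control).
			// Attribute names follow the same HTTP semantic conventions as the metric name.
			attrs := metric.WithAttributes(
				attribute.String("http.route", c.Path()),
				attribute.String("http.request.method", c.Request().Method),
				attribute.Int("http.response.status_code", c.Response().Status),
			)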
			duration.Record(ctx, time.Since(start).Seconds(), attrs)
			return err
		}
	}
}
```

> **The cardinality trap**: putting a **user ID or raw URL** in a metric's attributes multiplies the number of time series and drives up storage and cost. Strictly limit attributes to **low-cardinality** values such as the route pattern, method, and status. For SLO / error-budget design, see the [observability / SRE practice](/blog/opentelemetry-observability-production-tracing-metrics-logs).
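
A small standard-library-only sketch can put numbers on the trap. It assumes the worst case, where every attribute combination actually occurs, and the counts (20 routes, 5 methods, 100,000 users) are made-up illustrative figures. `statusClass` and `seriesCount` are hypothetical helpers, not OTel APIs; `statusClass` shows one way to shrink the status dimension further if the exact code is not needed.

```go
package main

import "fmt"

// statusClass collapses an HTTP status code into its class ("2xx", "4xx", ...).
func statusClass(code int) string {
	if code < 100 || code > 599 {
		return "unknown"
	}
	return fmt.Sprintf("%dxx", code/100)
}

// seriesCount is the worst-case number of time series for one metric:
// the product of the distinct values of each attribute.
func seriesCount(distinctValues ...int) int {
	n := 1
	for _, v := range distinctValues {
		n *= v
	}
	return n
}

func main() {
	fmt.Println(statusClass(200), statusClass(404), statusClass(503))

	// 20 routes × 5 methods × 5 status classes
	fmt.Println(seriesCount(20, 5, 5)) // 500

	// Same metric with a user ID attribute (100,000 users) added
	fmt.Println(seriesCount(20, 5, 5, 100000)) // 50000000
}
```

For a histogram each of those series is multiplied again by the number of buckets, so a single high-cardinality attribute is enough to turn a few hundred series into tens of millions.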

---

## 5. Log correlation: put trace_id on slog

The last piece is **correlating logs and traces.** Put **`trace_id`/`span_id`** on the [v5-standard slog](/blog/go-echo-middleware-cors-csrf-jwt-rate-limit-security-guide#4-requestloggerslogで構造化アクセスログ) and you can **jump from one log line to the trace.** Wire in a helper that pulls the span context out of `ctx`.

```go
// Convert the span context carried by ctx into slog attributes
func traceAttrs(ctx context.Context) []slog.Attr {
	sc := trace.SpanContextFromContext(ctx)
	if !sc.IsValid() {
		return nil
	}
	return []slog.Attr{
		slog.String("trace_id", sc.TraceID().String()),
		slog.String("span_id", sc.SpanID().String()),
	}
}

// Emit correlated logs from RequestLogger's LogValuesFunc
e.Use(middleware.RequestLoggerWithConfig(middleware.RequestLoggerConfig{
	LogStatus: true, LogURI: true, LogError: true, LogLatency: true, HandleError: true,
	LogValuesFunc: func(c *echo.Context, v middleware.RequestLoggerValues) error {
		ctx := c.Request().Context()
		attrs := append(traceAttrs(ctx),
			slog.String("uri", v.URI),
			slog.Int("status", v.Status),
			slog.Duration("latency", v.Latency),
		)
		level := slog.LevelInfo
		if v.Error != nil {
			level = slog.LevelError
			attrs = append(attrs, slog.String("err", v.Error.Error()))
		}
		logger.LogAttrs(ctx, level, "REQUEST", attrs...)
		return nil
	},
}))
```

With this, the **investigation that threads through the three pillars** holds: notice a spike in the error rate via metrics → find the slow span in the traces for that time window → pull the logs by that `trace_id`. Place `RequestLogger` **inside** the OTel middleware in the [middleware ordering](/blog/go-echo-middleware-cors-csrf-jwt-rate-limit-security-guide#2-並び順本番の推奨スタック) so logs are emitted with the span context already on them.
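
To see what such a correlated line could look like once it reaches the log backend, here is a standard-library-only sketch that emits one record through `slog`'s JSON handler and reads the fields back. The IDs, URI, status, and latency are placeholder values, and `correlatedLine` is a hypothetical helper; in the real middleware the IDs would come from the span context rather than from string arguments.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"log/slog"
	"time"
)

// correlatedLine renders one access-log record as JSON and returns its fields.
func correlatedLine(traceID, spanID string) (map[string]any, error) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))

	logger.LogAttrs(context.Background(), slog.LevelError, "REQUEST",
		slog.String("trace_id", traceID),
		slog.String("span_id", spanID),
		slog.String("uri", "/users/42"),
		slog.Int("status", 500),
		slog.Duration("latency", 120*time.Millisecond),
	)

	fields := map[string]any{}
	err := json.Unmarshal(buf.Bytes(), &fields)
	return fields, err
}

func main() {
	fields, err := correlatedLine("4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7")
	if err != nil {
		panic(err)
	}
	fmt.Println(fields["level"], fields["msg"], fields["trace_id"], fields["span_id"])
}
```

Because `trace_id` is a plain top-level field, a log query for that value returns every line written during the request, and the same value can be pasted into the tracing UI to open the matching trace.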

---

## 6. Export: send to the collection backend with OTLP

The instrumented signals are sent to the collection backend (OpenTelemetry Collector → Grafana Tempo / Jaeger / Datadog / each cloud) via **OTLP (OpenTelemetry Protocol).** Configure the TracerProvider/MeterProvider at app startup and **flush** on [graceful shutdown](/blog/go-echo-deployment-docker-distroless-ecs-cloud-run-graceful-shutdown-guide#4-グレースフルシャットダウンsigtermを取りこぼさない).

```go
import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc"
	"go.opentelemetry.io/otel/propagation"
	"go.opentelemetry.io/otel/sdk/resource"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
	semconv "go.opentelemetry.io/otel/semconv/v1.26.0"
)

func initTracing(ctx context.Context, service string) (func(context.Context) error, error) {
	exp, err := otlptracegrpc.New(ctx) // destination comes from the OTEL_EXPORTER_OTLP_ENDPOINT environment variable
	if err != nil {
		return nil, err
	}
	res, _ := resource.New(ctx, resource.WithAttributes(semconv.ServiceName(service)))
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exp), // batch export (performance and cost)
		sdktrace.WithSampler(sdktrace.ParentBased(sdktrace.TraceIDRatioBased(0.1))), // 10% sampling
		sdktrace.WithResource(res),
	)
	otel.SetTracerProvider(tp)
	otel.SetTextMapPropagator(propagation.TraceContext{}) // W3C propagation
	return tp.Shutdown, nil // ← call via defer in main to flush unsent spans
}
```

```go
// In main: initialize at startup, flush on exit
shutdown, err := initTracing(ctx, "user-api")
if err != nil { /* ... */ }
defer shutdown(context.Background()) // send whatever is still unsent during graceful shutdown
```
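
One refinement worth considering: `context.Background()` has no deadline, so if the backend is unreachable the final flush could hold up process exit. Below is a standard-library-only sketch of bounding it with a timeout, assuming the shutdown function honours context cancellation. `flushWithTimeout` and `stuckShutdown` are hypothetical names, and the 50 ms value is only for the demonstration; a real service would pick a timeout that fits its termination grace period.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// flushWithTimeout calls shutdown with a deadline so a hung exporter
// cannot stall process exit indefinitely.
func flushWithTimeout(shutdown func(context.Context) error, timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return shutdown(ctx)
}

// stuckShutdown is a hypothetical stand-in for a provider whose backend is
// unreachable: it blocks until the context ends.
func stuckShutdown(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// timedOut reports whether flushing a stuck exporter ends with a deadline error.
func timedOut() bool {
	err := flushWithTimeout(stuckShutdown, 50*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("deadline hit:", timedOut())
}
```

In `main` this would take the form `defer flushWithTimeout(shutdown, 5*time.Second)`, with the returned error logged rather than ignored.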

> **Cost optimization**: in production, hold trace volume and cost down with **sampling** (e.g., `TraceIDRatioBased(0.1)` for 10%). "Tail sampling," which prioritizes keeping errors and slow traces, is done on the Collector side. Make the destination configurable via `OTEL_EXPORTER_OTLP_ENDPOINT` as an [environment variable](/blog/go-echo-deployment-docker-distroless-ecs-cloud-run-graceful-shutdown-guide#1-設定は-12-factor-で環境変数から読む) and don't bake the endpoint into code.
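
The idea behind ratio sampling is that the keep-or-drop decision is derived from the trace ID itself, so every service that sees the same trace reaches the same answer without coordination. The sketch below is a simplified illustration of that idea using only the standard library; it is not the SDK's code, and `keep` is a hypothetical function.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// keep reports whether a trace is sampled at the given ratio.
// The decision depends only on the trace ID, so it is deterministic.
func keep(traceID [16]byte, ratio float64) bool {
	switch {
	case ratio >= 1:
		return true
	case ratio <= 0:
		return false
	}
	bound := uint64(ratio * (1 << 63))
	x := binary.BigEndian.Uint64(traceID[8:]) >> 1 // 63-bit value from the low half of the ID
	return x < bound
}

func main() {
	var low [16]byte // low half is all zeros: falls under any positive bound
	high := [16]byte{8: 0xff, 9: 0xff, 10: 0xff, 11: 0xff, 12: 0xff, 13: 0xff, 14: 0xff, 15: 0xff}

	fmt.Println(keep(low, 0.1), keep(high, 0.1))
	fmt.Println(keep(low, 0.1) == keep(low, 0.1)) // same ID, same decision
}
```

With randomly generated trace IDs, roughly the configured fraction falls below the bound. Wrapping the sampler in `ParentBased`, as the setup code does, additionally makes a service follow the decision already carried by an incoming request, so a trace is either kept whole or dropped whole.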

---

## Conclusion: 7 principles for bringing Echo observability to production quality

1. **Correlate the three pillars by `trace_id`** so one request can be traced end-to-end.
2. **otelecho is deprecated and assumes v4. Custom middleware that uses the OTel SDK directly** is version-independent and robust (ETC).
3. **For traces, propagation is everything.** `Extract` → span → **`c.SetRequest(req.WithContext(ctx))`** threads through downstream.
4. **Normalize span names and attributes to the route pattern** to prevent cardinality explosion.
5. **Metrics are RED** (duration Histogram + in-flight UpDownCounter) in a minimal setup.
6. **Put `trace_id` on slog** so logs ↔ traces can be jumped between mutually.
7. **Export with OTLP, optimize cost with sampling**, and flush on shutdown.

Observability is not "emitting logs" but "**eliminating guesswork during an incident.**" On Echo v5, instrumenting thinly with the OTel SDK yourself, without relying on deprecated tools, turns out to be robust, cheap, and easy to understand. For cross-platform observability see the [OpenTelemetry practical guide](/blog/opentelemetry-observability-production-tracing-metrics-logs), and for the full picture of Echo see the [production-operations guide](/blog/go-echo-framework-production-guide).
