A Practical Guide to Observability and SRE (OpenTelemetry / SLO)

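Observability is not about emitting logs; it is about being able to trace a stuck process at a glance. This series covers a design for running production reliability by the numbers: correlating the three pillars (logs, metrics, traces) with OpenTelemetry, threading a correlation ID through structured logs, making decisions with SLOs and error budgets, and alerting on symptoms rather than causes.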

3 articles in total

Foundational guide (start here)

Observability

OpenTelemetry Production Observability Guide: Correlating Traces, Metrics, and Logs So You Can Spot a Stuck Process at a Glance

An implementation guide to making production systems observable with OpenTelemetry. It covers the three signals (traces, metrics, logs) and context propagation, instrumenting FastAPI (Python) and Next.js (Node), the OTel Collector, head and tail sampling, log-to-trace correlation, PII scrubbing, and telemetry cost optimization, with working code that follows the official specification.

6/24/2026 · 21 min read

Related practical articles

A Practical Guide to Incident Response 2026: Designing Incident Commander, Runbooks, Postmortems, and On-Call the SRE Way

AWS ECS Fargate SRE Practical Guide: ADOT Distributed Tracing, EMF Metrics, and SLO / Error Budget / Burn-Rate Alert Design