# Cloud Run CI/CD: keyless, Blue/Green, and canary in real code with Cloud Build / GitHub Actions × Workload Identity

> An implementation guide for building production-quality continuous deployment to Cloud Run. It explains, with real code in cloudbuild.yaml, GitHub Actions, and gcloud: Artifact Registry, when to use Cloud Build vs. GitHub Actions (keyless via Workload Identity Federation), verifying first with --no-traffic + a tag URL then canary → Blue/Green → instant rollback, separating DB migrations into a job, and dividing responsibilities with Terraform.

- Published: 2026-06-28
- Author: 友田 陽大
- Tags: GCP, Cloud Run, CI/CD, DevOps, Workload Identity, Security, Infrastructure, Terraform
- URL: https://tomodahinata.com/en/blog/google-cloud-run-cicd-cloud-build-github-actions-workload-identity-blue-green-canary-guide
- Category: Google Cloud Run in production
- Pillar guide: https://tomodahinata.com/en/blog/google-cloud-run-production-guide

## Key points

- The backbone of Cloud Run CI/CD is the responsibility split: 'build the image in CI and push to Artifact Registry, deploy immutable revisions to Cloud Run, manage infrastructure with Terraform.' This prevents drift and accidents.
- Keyless is essential. GitHub Actions authenticates with Workload Identity Federation without issuing a service-account key. Don't forget permissions: id-token: write.
- Safe shipping is: '--no-traffic + tag URL for isolated verification → update-traffic for a 5% canary → step up gradually → 100% (Blue/Green) → instant rollback to the old revision on a problem.' Because revisions are immutable, you can revert without rebuilding.
- Don't mix DB migrations with deployment; split them into a dedicated Cloud Run Job. Decouple app rollout from schema change, and apply in forward/backward-compatible stages.
- Cloud Build is self-contained and GCP-native; GitHub Actions has a broad ecosystem. For greenfield, assume WIF keyless and choose based on your team's existing CI.

---

"Deployment is scary" is the feeling you most want to eliminate on a production container platform. That fear comes down to two things: **not being able to revert** and **not knowing what changed.** Cloud Run CI/CD can remove both by design. Because revisions are immutable, you can **revert instantly without rebuilding**, and once responsibilities are separated, **what changed is always clear.**

While [operating a broadcaster platform on GCP](/case-studies/broadcaster-ai-content-platform), I ran an always-on internal platform with the following setup: **stg and prod separated in Cloud Build, responsibilities split so that Terraform owns 'infrastructure' and Cloud Build owns 'the image and latest env,' DB migrations carved out into a dedicated job, and CI/CD made keyless with Workload Identity Federation.** This article reproduces that design in real code, **staying faithful to the [Google Cloud official documentation](https://docs.cloud.google.com/run/docs/continuous-deployment-with-cloud-build).**

For the full picture of production operation see the [Cloud Run production-operations guide](/blog/google-cloud-run-production-guide), and for the design of long-running jobs themselves the [Jobs / Workflows guide](/blog/google-cloud-run-jobs-workflows-batch-async-idempotent-guide).

---

## Design principle: separate the three responsibilities

Accidents in Cloud Run CI/CD usually happen when **responsibilities are mixed.** Draw the boundaries first.

| Responsibility | What carries it | Source of truth |
|------|---------|-------|
| **App contents** | The container image (build in CI → Artifact Registry) | Git (commit SHA = image tag) |
| **Infrastructure** | Service, SA, VPC, scaling settings (Terraform) | Terraform state |
| **Which revision to route to** | Traffic allocation (immutable revisions) | Cloud Run's traffic setting |

This separation works because, once **"image tag = commit SHA"** holds, you can **trace exactly which commit** is running in production, and **infrastructure changes (Terraform) and app changes (image) never mix.** Don't use the `latest` tag, because you lose track of what's running.
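
The "image tag = commit SHA" rule can be enforced in a script rather than by convention. The following is a minimal sketch, not a definitive implementation: the function name `image_ref`, its argument order, and the project ID `my-project` are all hypothetical. It builds the Artifact Registry reference and refuses any tag that is not a 7 to 40 character hex string, which also rejects `latest`.

```bash
# Sketch: build an Artifact Registry image reference tagged with a commit SHA.
# image_ref is an illustrative helper, not part of gcloud or docker.
image_ref() {
  local region="$1" project="$2" repo="$3" name="$4" sha="$5"

  # Reject empty tags and anything containing a non-hex character (e.g. "latest").
  case "$sha" in
    ""|*[!0-9a-f]*)
      echo "refusing non-SHA tag: '$sha'" >&2
      return 1
      ;;
  esac

  # Accept short (7+) through full (40) SHA lengths only.
  if [ "${#sha}" -lt 7 ] || [ "${#sha}" -gt 40 ]; then
    echo "refusing tag with unexpected length: '$sha'" >&2
    return 1
  fi

  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$region" "$project" "$repo" "$name" "$sha"
}
```

In CI it could be called as `image_ref asia-northeast1 my-project app api "$(git rev-parse --short HEAD)"`, so every build and deploy step derives the same reference from Git.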

---

## Artifact Registry: where images live

Images go in **Artifact Registry** (formerly Container Registry). First create the repository.

```bash
gcloud artifacts repositories create app \
  --repository-format=docker \
  --location=asia-northeast1 \
  --description="app container images"
# Image URL format: asia-northeast1-docker.pkg.dev/PROJECT_ID/app/api:GIT_SHA
```

---

## Path A: Cloud Build (self-contained, GCP-native)

If you want everything to stay within GCP, use Cloud Build. Declare "build → push → deploy" in `cloudbuild.yaml`.

```yaml
# cloudbuild.yaml: started by a push trigger. $SHORT_SHA is injected by Cloud Build.
steps:
  # 1. Build (tag with the commit SHA)
  - name: "gcr.io/cloud-builders/docker"
    args:
      ["build", "-t",
       "${_REGION}-docker.pkg.dev/$PROJECT_ID/app/api:$SHORT_SHA", "."]
  # 2. Push to Artifact Registry
  - name: "gcr.io/cloud-builders/docker"
    args:
      ["push",
       "${_REGION}-docker.pkg.dev/$PROJECT_ID/app/api:$SHORT_SHA"]
  # 3. Deploy without routing traffic (verify via the tag URL, then promote)
  - name: "gcr.io/google.com/cloudsdktool/cloud-sdk"
    entrypoint: gcloud
    args:
      ["run", "deploy", "api",
       "--image", "${_REGION}-docker.pkg.dev/$PROJECT_ID/app/api:$SHORT_SHA",
       "--region", "${_REGION}",
       "--no-traffic", "--tag", "sha-$SHORT_SHA"]
images:
  - "${_REGION}-docker.pkg.dev/$PROJECT_ID/app/api:$SHORT_SHA"
substitutions:
  _REGION: asia-northeast1
options:
  logging: CLOUD_LOGGING_ONLY
```

Connect a push trigger to the GitHub repository, and every commit automatically runs build and deploy (without routing traffic).

```bash
gcloud builds triggers create github \
  --repo-name=app --repo-owner=YOUR_ORG \
  --branch-pattern="^main$" \
  --build-config=cloudbuild.yaml
```

---

## Path B: GitHub Actions × Workload Identity (keyless)

If your existing CI is GitHub Actions, this is the natural path. Authenticate to GCP with Workload Identity Federation (WIF) **without issuing a service-account key.**

```yaml
# .github/workflows/deploy.yml
name: deploy
on:
  push:
    branches: [main]

permissions:
  contents: read
  id-token: write   # Without this, GitHub does not inject an OIDC token and authentication fails

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Keyless auth: the pool and provider are pre-configured in WIF (see the link below)
      - id: auth
        uses: google-github-actions/auth@v3
        with:
          # ★ Full path containing the project *number*, not the project ID.
          workload_identity_provider: "projects/123456789/locations/global/workloadIdentityPools/github/providers/app-repo"
          service_account: "deployer@PROJECT_ID.iam.gserviceaccount.com"

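      # Assumes the image for this commit has already been built and pushed to
      # Artifact Registry (for example by an earlier job); this step only deploys it.
      - uses: google-github-actions/deploy-cloudrun@v3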
        with:
          service: api
          region: asia-northeast1
          image: asia-northeast1-docker.pkg.dev/PROJECT_ID/app/api:${{ github.sha }}
          flags: "--no-traffic --tag=sha-${{ github.sha }}"
```

> **The WIF pool/provider setup (allowing only your own repository with an Attribute Condition, etc.) is not repeated in this article.** The key points of the setup — always include a match on `assertion.repository`, never wildcard `sub` — are collected in the dedicated article [making GitHub Actions keyless](/blog/github-actions-oidc-keyless-cicd-aws-gcp-guide) (DRY). Give the deploy SA only the minimum privileges (`roles/run.developer` + Artifact Registry read + `roles/iam.serviceAccountUser` on the runtime SA).
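
The minimum-privilege grants for the deploy SA listed above can be sketched as three bindings, each scoped as narrowly as the role allows. This is an illustration under assumptions: the helper names `run` and `grant_deployer_roles`, the `DRY_RUN` switch, and the runtime SA address are hypothetical, and the roles are the ones named in this article rather than a complete audit of what your pipeline needs.

```bash
# DRY_RUN=1 prints each command instead of executing it (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Grant the deploy SA only what this article lists. All names are placeholders.
grant_deployer_roles() {
  local project="$1" region="$2" repo="$3" deployer="$4" runtime_sa="$5"
  local member="serviceAccount:${deployer}"

  # Deploy revisions to Cloud Run services in the project.
  run gcloud projects add-iam-policy-binding "$project" \
    --member="$member" --role="roles/run.developer"

  # Read-only access, bound on the single repository that holds the images.
  run gcloud artifacts repositories add-iam-policy-binding "$repo" \
    --project="$project" --location="$region" \
    --member="$member" --role="roles/artifactregistry.reader"

  # "Act as" permission, bound on the runtime SA itself rather than project-wide.
  run gcloud iam service-accounts add-iam-policy-binding "$runtime_sa" \
    --project="$project" \
    --member="$member" --role="roles/iam.serviceAccountUser"
}
```

Running it once with `DRY_RUN=1` shows exactly which bindings would be created before anything changes. In the responsibility split described earlier, the same bindings would normally live in Terraform; the script form is mainly useful for reviewing or bootstrapping.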

---

## Safe shipping: verify → canary → Blue/Green → instant rollback

The key is to stop CI at "deploy without routing traffic." **Promote after a human (or an automated check) verifies.** Precisely because revisions are immutable, this staged control works safely.

```bash
# 1. Isolated verification via the tag URL (no effect on production traffic)
#    → smoke-test https://sha-abc123---api-xxxxx.a.run.app
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  https://sha-abc123---api-xxxxx.a.run.app/healthz

# 2. If healthy, send a 5% canary
gcloud run services update-traffic api --region asia-northeast1 \
  --to-tags sha-abc123=5

# 3. Step up gradually while watching error rate and latency (5 → 25 → 50%)
gcloud run services update-traffic api --region asia-northeast1 \
  --to-tags sha-abc123=50

# 4. If all is well, go to 100% (Blue/Green switch)
#    Pinning to the verified tag avoids promoting a newer, unverified revision
#    that may have been deployed in the meantime (which --to-latest would pick up).
gcloud run services update-traffic api --region asia-northeast1 \
  --to-tags sha-abc123=100

# ── On detecting an anomaly, roll back instantly to the old revision (no rebuild needed) ──
gcloud run services update-traffic api --region asia-northeast1 \
  --to-revisions api-00021-prev=100
```

**Rollback completes by "just sending 100% back to the old revision."** That is Cloud Run's greatest safety mechanism: no image rebuild, no redeployment. Build this into your standard CI/CD procedure, and even a nighttime incident returns to normal in tens of seconds.
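
The staged flow can also be scripted so that promotion and rollback follow the same path every time. The sketch below is one possible shape, not the pipeline used in the project: `run`, `tag_url`, `canary_healthy`, and `promote` are hypothetical helper names, `DRY_RUN` is an illustrative switch, and `canary_healthy` is a stub that you would replace with a real check (a soak period plus error-rate and latency queries against your monitoring).

```bash
# DRY_RUN=1 prints each command instead of executing it (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Derive the tag URL from the service's base URL: https://TAG---HOST
tag_url() {
  local base_url="$1" tag="$2"
  printf 'https://%s---%s\n' "$tag" "${base_url#https://}"
}

# Stub health gate. Replace with a real check; return non-zero to force a rollback.
canary_healthy() {
  return 0
}

# Walk the tagged revision through the given percentages.
# On a failed health gate, send 100% back to the previous revision and stop.
promote() {
  local service="$1" region="$2" tag="$3" prev_revision="$4"
  shift 4
  local pct
  for pct in "$@"; do
    run gcloud run services update-traffic "$service" --region "$region" \
      --to-tags "${tag}=${pct}"
    if ! canary_healthy "$service" "$tag" "$pct"; then
      run gcloud run services update-traffic "$service" --region "$region" \
        --to-revisions "${prev_revision}=100"
      return 1
    fi
  done
}
```

A call such as `promote api asia-northeast1 sha-abc123 api-00021-prev 5 25 50 100` mirrors the manual steps, and a non-zero exit tells the pipeline that a rollback happened. Passing the previous revision explicitly keeps the rollback target a deliberate choice rather than something the script guesses.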

---

## Separate DB migrations from deployment

The most accident-prone step is a **schema change.** Mixing app rollout and migration into the same step leaves you with an "old code, new schema" mismatch on rollback. The fix is to **carve migrations out into a dedicated Cloud Run Job and apply them in forward/backward-compatible stages.**

```bash
# Prepare a dedicated migration job and run it independently of deployment
gcloud run jobs deploy db-migrate \
  --image asia-northeast1-docker.pkg.dev/PROJECT_ID/app/migrate:${GIT_SHA} \
  --region asia-northeast1 \
  --service-account migrator@PROJECT_ID.iam.gserviceaccount.com \
  --max-retries 0           # Don't let migrations retry casually
gcloud run jobs execute db-migrate --region asia-northeast1 --wait
```

Make zero-downtime schema changes a **multi-stage release**: "① add a compatible column → ② deploy code that supports both old and new → ③ backfill → ④ code that removes old references → ⑤ drop the old column." For the design details see [zero-downtime schema migration](/blog/postgresql-zero-downtime-schema-migration-lock-safe-ddl-guide) (the principles are the same on Cloud SQL/PostgreSQL). For building out the job itself, go to the [Jobs / Workflows guide](/blog/google-cloud-run-jobs-workflows-batch-async-idempotent-guide).
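
One way to keep the stages of such a release explicit is to run each schema phase as its own job execution, separate from any app deploy. The sketch below assumes, hypothetically, that the migrate image accepts a phase name (`expand`, `backfill`, `contract`) as its argument; those names, the helper names `run` and `run_migration_phase`, and `DRY_RUN` are illustrative and not something the original setup confirms. It relies on `--args` to override the job's arguments for a single execution, and it assumes a failed execution surfaces as a non-zero exit code when `--wait` is used.

```bash
# DRY_RUN=1 prints each command instead of executing it (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Execute one schema phase as a separate run of the migration job.
# The phase names are assumptions about what the migrate image accepts.
run_migration_phase() {
  local region="$1" job="$2" phase="$3"

  case "$phase" in
    expand|backfill|contract) ;;
    *)
      echo "unknown migration phase: '$phase'" >&2
      return 2
      ;;
  esac

  # --wait blocks until the execution finishes, so later steps run only afterwards.
  run gcloud run jobs execute "$job" --region "$region" --args="$phase" --wait
}
```

Mapped onto the five steps, the order would be: `expand` (①), deploy the dual-compatible code (②), `backfill` (③), deploy the code without old references (④), then `contract` (⑤). Each app deploy still goes through the `--no-traffic` and canary flow on its own.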

---

## Cloud Build or GitHub Actions: which to choose

| | **Cloud Build** | **GitHub Actions** |
|---|----------------|---------------------|
| Authentication | Natively easy since it's inside GCP | **Keyless with WIF** (setup required) |
| Ecosystem | Optimized for GCP | **Broad** (easy to integrate lint/test/other clouds) |
| Suited team | GCP-centric, and wants to lean on Cloud Build for infra as well | Already standardized on GitHub Actions |
| Build environment | Managed, parallel, caching | Runners (self-hosted possible) |

**The right answer is "lean toward your team's existing CI."** Both can build the same **shipping flow** of `--no-traffic` + tag verification → canary → Blue/Green. In my project I consolidated the build/deploy core in Cloud Build while running CodeQL, dependency updates, and tests on the GitHub side — a combined configuration.

---

## Production-rollout checklist

- [ ] Image tag is the **commit SHA** (don't use `latest`)
- [ ] Responsibility split: **Terraform = infra / image = app**
- [ ] CI/CD is **keyless with WIF** (add `id-token: write`)
- [ ] The deploy SA has **minimum privileges** (`run.developer` + AR read + `serviceAccountUser`)
- [ ] CI stops at **`--no-traffic` + `--tag`.** Promote after verification
- [ ] Script the staged shipping of **canary → Blue/Green**
- [ ] Put the **instant-rollback procedure** (100% to the old revision) in the runbook
- [ ] **DB migrations split into a dedicated job** and applied in stages
- [ ] Pass production-equivalent verification (including the WAF) in stg first

---

## Conclusion: make deployment a "not scary" task

Cloud Run CI/CD can remove the fear of "can't revert, don't know what changed" by design, through **responsibility separation (image/infra/traffic)** and **staged shipping via immutable revisions.** Keyless (WIF) also eliminates the risk of leaked service-account keys, and separating migrations prevents code/schema inconsistency. With this, **even a small team can carry out production deployments as a routine task.**

For the overall design, see the [Cloud Run production-operations guide](/blog/google-cloud-run-production-guide); for cost, the [concurrency/billing guide](/blog/google-cloud-run-autoscaling-concurrency-billing-cost-optimization-guide); and for long-running processing, the [Jobs / Workflows guide](/blog/google-cloud-run-jobs-workflows-batch-async-idempotent-guide). If you need hands-on support building out GCP CI/CD or going keyless, I can help based on real operational experience.
