Cloud Run networking and security: defense in depth with Ingress control, IAM auth, Direct VPC egress, and Cloud Armor

If you "just deploy" Cloud Run, it can end up exposed to the entire internet without authentication. In production, lock down both the entrance (who may come) and the exit (where it may go) with least privilege. This is the most cost-effective defense, achievable without changing a single line of app code.

On the broadcaster platform, I built a defense-in-depth setup as IaC with Terraform: Cloud SQL with IAM auth, mandatory TLS (ENCRYPTED_ONLY), and a private IP; Cloud Armor at the entrance (OWASP CRS 3.3 + adaptive DDoS + rate limiting); a dedicated least-privilege service account per service; and secrets in Secret Manager (referencing only the latest version). I fully enabled the WAF in stg first, so that false positives were eliminated before production. The configuration withstands broadcaster-grade internal controls.

This article walks through those key points in line with the official documentation. For the big picture, see the Cloud Run production-operations guide.

Entrance ①: decide "who can reach it" with the Ingress setting

--ingress is the service's network boundary itself. There are three values.

Value	Allowed reach path	When to use
`all`	Everything, including direct access to the `run.app` URL	An API you truly want to publish (but use with auth required)
`internal`	Internal ALB, internal traffic of the same project/VPC, Google Cloud services (Cloud Scheduler / Cloud Tasks / Eventarc / Pub/Sub / Workflows, etc.), inside a VPC Service Controls perimeter	Internal API, backend service
`internal-and-cloud-load-balancing`	The internal paths above + via the external ALB. Direct access to `run.app` is blocked	Publish, but always force through the LB (= Cloud Armor)

# Public surface: allow only via the external ALB (+ Cloud Armor); block direct access to run.app
gcloud run deploy api --region asia-northeast1 \
  --ingress internal-and-cloud-load-balancing

# Internal API: only from the same VPC and Google Cloud services
gcloud run deploy internal-api --region asia-northeast1 \
  --ingress internal

The point: the standard is to make a public service internal-and-cloud-load-balancing rather than all, always forcing it through the external load balancer = Cloud Armor. Blocking the direct run.app URL prevents direct attacks that bypass the WAF. At the organization level, you can restrict the choices themselves with the run.allowedIngress organization policy.
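The standard above can also be checked mechanically. The following is a minimal sketch, assuming you keep your own inventory of services with an exposure label; the `Service` type, the `exposure` label, and the service names are hypothetical and are not Cloud Run fields.

```python
from dataclasses import dataclass

# Ingress values considered acceptable per exposure type, following the table above.
ALLOWED_INGRESS = {
    "public": {"internal-and-cloud-load-balancing"},
    "internal": {"internal"},
}


@dataclass(frozen=True)
class Service:
    name: str      # hypothetical service name
    exposure: str  # "public" or "internal" (our own label, not a Cloud Run field)
    ingress: str   # value passed to --ingress


def ingress_violations(services: list[Service]) -> list[str]:
    """Return the names of services whose ingress deviates from the standard."""
    return [s.name for s in services if s.ingress not in ALLOWED_INGRESS[s.exposure]]


services = [
    Service("api", "public", "internal-and-cloud-load-balancing"),
    Service("legacy-api", "public", "all"),
    Service("internal-api", "internal", "internal"),
]
print(ingress_violations(services))  # ['legacy-api']
```

A check like this could run in CI next to the Terraform plan; the run.allowedIngress organization policy remains the enforcement point.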

Entrance ②: protect "service-to-service" with IAM authentication

If Ingress is the "network path," IAM is the "right to call." Don't make service-to-service calls unauthenticated — that's the iron rule.

# Require authentication on the service (make this the default)
gcloud run deploy api --region asia-northeast1 --no-allow-unauthenticated

# Grant the invoker role to the caller's SA (another service / Scheduler / Eventarc)
gcloud run services add-iam-policy-binding api --region asia-northeast1 \
  --member "serviceAccount:caller@PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/run.invoker"

The caller calls with an ID token whose audience is the destination service URL.

# Call service B from service A with authentication (the ID token is fetched automatically)
import google.auth.transport.requests
import google.oauth2.id_token
import httpx

def call_internal(url: str, payload: dict, audience: str | None = None) -> dict:
    auth_req = google.auth.transport.requests.Request()
    # Fetch an ID token from the metadata server. The audience is the destination
    # service's URL; pass it explicitly when `url` carries a path or query.
    token = google.oauth2.id_token.fetch_id_token(auth_req, audience or url)
    r = httpx.post(url, json=payload, headers={"Authorization": f"Bearer {token}"})
    r.raise_for_status()
    return r.json()
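When the request URL carries a path or query string, the token's audience is normally the service's base URL rather than the full request URL. The helper below is a small sketch of deriving it, assuming the service is reached over HTTPS; `audience_for` and the example hostname are hypothetical names for illustration.

```python
from urllib.parse import urlsplit


def audience_for(url: str) -> str:
    """Return scheme://host for a request URL, dropping path, query, and fragment."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"expected an absolute https URL, got {url!r}")
    return f"{parts.scheme}://{parts.netloc}"


print(audience_for("https://service-b-abc123-an.a.run.app/v1/orders?limit=10"))
# https://service-b-abc123-an.a.run.app
```

If the service sits behind a custom domain or uses custom audiences, the right value may differ, so treat the derived string as a default rather than a rule.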

Unauthenticated exposure not only "widens the attack surface" but turns wasteful requests directly into cost. Explicitly open only the endpoints that truly need to be public.

Exit: to VPC resources with Direct VPC egress

To reach Cloud SQL's private IP, Memorystore, or an internal API from Cloud Run, use Direct VPC egress (officially recommended, GA). Unlike the legacy Serverless VPC Access connector, it needs no connector VMs, so the connector's idle cost, added latency, and operational overhead go away.

gcloud run deploy api --region asia-northeast1 \
  --network projects/PROJECT_ID/global/networks/my-vpc \
  --subnet projects/PROJECT_ID/regions/asia-northeast1/subnetworks/run-subnet \
  --vpc-egress private-ranges-only   # Route only private destinations through the VPC; everything else egresses as before
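To make the effect of `--vpc-egress` concrete, here is a rough sketch of how a destination is routed under each setting. It is a simplification under a stated assumption: only the RFC 1918 and RFC 6598 ranges are listed, whereas the real set of ranges treated as private is longer. `egress_path` is a hypothetical helper, not part of any Google library.

```python
import ipaddress

# Simplified assumption: only RFC 1918 and RFC 6598 ranges are modeled here.
PRIVATE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")
]


def egress_path(dest_ip: str, vpc_egress: str = "private-ranges-only") -> str:
    """Return 'vpc' or 'internet' for a destination IP under the given setting."""
    if vpc_egress == "all-traffic":
        return "vpc"  # everything is sent through the VPC
    addr = ipaddress.ip_address(dest_ip)
    return "vpc" if any(addr in net for net in PRIVATE_RANGES) else "internet"


print(egress_path("10.20.0.5"))               # vpc
print(egress_path("8.8.8.8"))                 # internet
print(egress_path("8.8.8.8", "all-traffic"))  # vpc
```

With `all-traffic`, outbound internet access then depends on what the VPC provides (for example Cloud NAT), which is why `private-ranges-only` is the less intrusive default for a service that only needs private resources.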

Lock down Cloud SQL with "private IP, IAM auth, mandatory TLS"

The DB is the most important exit to protect for Cloud Run. Don't give it a public IP; connect only via the private IP through Direct VPC egress, with IAM authentication (password-less) and mandatory TLS.

# Connect to Cloud SQL (PostgreSQL) with IAM auth + TLS (Cloud SQL Python Connector)
from google.cloud.sql.connector import Connector, IPTypes

connector = Connector()

def getconn():
    return connector.connect(
        "PROJECT_ID:asia-northeast1:my-instance",
        "pg8000",
        user="api-runtime@PROJECT_ID.iam",   # The SA's IAM database user (has no password)
        db="appdb",
        enable_iam_auth=True,                 # IAM authentication
        ip_type=IPTypes.PRIVATE,              # Private IP
    )

For countermeasures against connection exhaustion in serverless (pool design, PgBouncer), see serverless connection pooling. Cloud Run has the same problem: every instance opens its own connections, so the total grows with scale-out.
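The scale-out problem is simple arithmetic: the worst case is the maximum instance count multiplied by each instance's pool ceiling. A back-of-the-envelope sketch follows; the function names, the `reserved` headroom, and the numbers are illustrative assumptions, not measured values.

```python
def max_db_connections(max_instances: int, pool_size: int, max_overflow: int = 0) -> int:
    """Worst-case connections opened when every instance fills its pool."""
    return max_instances * (pool_size + max_overflow)


def fits_budget(max_instances: int, pool_size: int, max_overflow: int,
                db_max_connections: int, reserved: int = 10) -> bool:
    """True if the worst case stays under the DB limit minus reserved headroom."""
    worst_case = max_db_connections(max_instances, pool_size, max_overflow)
    return worst_case <= db_max_connections - reserved


print(max_db_connections(20, 5, 2))                      # 140
print(fits_budget(20, 5, 2, db_max_connections=100))     # False
print(fits_budget(10, 5, 2, db_max_connections=100))     # True
```

When the budget does not fit, the levers are a lower max-instances setting, a smaller per-instance pool, or a pooler such as PgBouncer in front of the database.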

The shield for the public surface: Cloud Armor (WAF, rate limiting, DDoS)

Attach a Cloud Armor security policy in front of a public service (on the external ALB's backend service) and scrub attacks at L7. The main capabilities are:

Preconfigured WAF rules: derived from OWASP ModSecurity CRS. Detect SQLi, XSS, LFI, RCE, etc., with many signatures.
Rate limiting: throttle clients exceeding a threshold, or temporarily ban them (rate-based-ban).
Adaptive Protection: detect L7 DDoS anomalies with ML.
Custom rules (CEL): flexibly match on L3–L7 attributes (up to 5 sub-expressions per rule). Rules are evaluated in ascending order of priority number, so the lowest number is checked first.
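The evaluation order is worth internalizing because the first matching rule decides the outcome. Below is a toy model of that first-match behavior; the `Rule` type and the string check standing in for a WAF signature are hypothetical and far cruder than the real preconfigured rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    priority: int                     # lower number = evaluated earlier
    action: str                       # e.g. "allow", "deny(403)"
    matches: Callable[[dict], bool]   # toy matcher over a request dict


def evaluate(rules: list[Rule], request: dict) -> str:
    """Return the action of the first matching rule in ascending priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(request):
            return rule.action
    raise ValueError("no rule matched; a policy needs a default rule")


rules = [
    Rule(2147483647, "allow", lambda req: True),  # default rule
    Rule(1000, "deny(403)", lambda req: "' OR 1=1" in req.get("query", "")),
]
print(evaluate(rules, {"query": "id=1' OR 1=1"}))  # deny(403)
print(evaluate(rules, {"query": "id=1"}))          # allow
```

The practical consequence is that a broad allow rule with a low priority number shadows every rule after it, so place narrow deny rules first.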

# Cloud Armor: declare OWASP WAF rules plus rate limiting (attach to the external ALB's backend service)
resource "google_compute_security_policy" "api" {
  name = "api-armor"

  # Adaptive Protection (ML-based detection of L7 DDoS)
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable = true
    }
  }

  # Preconfigured WAF: SQL injection detection (add XSS etc. the same way).
  # Placed before the rate-limit rule: a conforming request matches that
  # rule's "allow", and no later rule is evaluated for it.
  rule {
    action   = "deny(403)"
    priority = 1000
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('sqli-v33-stable')"
      }
    }
  }

  # Rate limiting: above 100 requests per minute per IP, ban the client temporarily
  rule {
    action   = "rate_based_ban"
    priority = 2000
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      enforce_on_key = "IP"
      rate_limit_threshold {
        count        = 100
        interval_sec = 60
      }
      ban_duration_sec = 600
      conform_action   = "allow"
      exceed_action    = "deny(429)"
    }
  }

  # Default rule (last = largest priority number): allow
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
  }
}

If you switch the WAF to deny in production all at once, it blocks legitimate requests too. I enabled every WAF rule in stg, first surfaced false positives in preview mode (log only), and only then enforced the rules in production. The thinking on defense in depth (AWS WAF / Cloud Armor / OWASP) is detailed in the WAF defense-in-depth guide.
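During the preview phase, the useful question is which signatures would have blocked traffic, and how often. The sketch below tallies that from load balancer log entries that have already been exported as dictionaries. The field names (`jsonPayload.previewSecurityPolicy`, `outcome`, `preconfiguredExprIds`) are my assumption of the log layout and the signature ID is illustrative, so verify both against your own logs.

```python
from collections import Counter


def preview_denials(entries: list[dict]) -> Counter:
    """Count would-be denials per signature ID among preview-mode log entries."""
    counts: Counter = Counter()
    for entry in entries:
        preview = entry.get("jsonPayload", {}).get("previewSecurityPolicy")
        if not preview or preview.get("outcome") != "DENY":
            continue  # not evaluated in preview, or preview would have allowed it
        for expr_id in preview.get("preconfiguredExprIds") or ["(no signature id)"]:
            counts[expr_id] += 1
    return counts


# Illustrative entries; real logs carry many more fields.
entries = [
    {"jsonPayload": {"previewSecurityPolicy": {
        "outcome": "DENY", "priority": 1000,
        "preconfiguredExprIds": ["owasp-crs-v030301-id942350-sqli"]}}},
    {"jsonPayload": {"previewSecurityPolicy": {
        "outcome": "DENY", "priority": 1000,
        "preconfiguredExprIds": ["owasp-crs-v030301-id942350-sqli"]}}},
    {"jsonPayload": {"enforcedSecurityPolicy": {"outcome": "ACCEPT"}}},
]
counts = preview_denials(entries)
print(counts.most_common(1))  # [('owasp-crs-v030301-id942350-sqli', 2)]
```

Signatures that fire mostly on legitimate requests are candidates for exclusion or a lower sensitivity before the rule is enforced.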

Erase credentials: least-privilege SAs and Secret Manager

Even if you lock down the network, too-broad privileges or plaintext secrets ruin it.

Assign each service a dedicated least-privilege user-managed SA (--service-account). With nothing specified, it often runs with the Compute Engine default SA that has Editor privileges. Disable automatic grants to default SAs with the iam.automaticIamGrantsForDefaultServiceAccounts organization policy.
Inject secrets from Secret Manager (environment variable = fixed at startup with a version specified; volume = always latest, suited to rotation). Give the SA roles/secretmanager.secretAccessor.
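On the application side, the difference between the two injection styles is when the value is read. Here is a minimal sketch, assuming a secret mounted as a file with an environment variable as fallback; `read_secret`, the variable name `DB_PASSWORD`, and the file name are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def read_secret(env_var: str, mount_path: str) -> str:
    """Read a secret from a mounted file if present, else from an env var.

    The file is re-read on every call, so a rotated value is picked up
    without a restart; an environment variable stays fixed after startup.
    """
    path = Path(mount_path)
    if path.is_file():
        return path.read_text(encoding="utf-8").strip()
    value = os.environ.get(env_var)
    if value is None:
        raise RuntimeError(f"secret not found in {mount_path} or ${env_var}")
    return value


# Simulate a mounted secret being rotated between two reads.
with tempfile.TemporaryDirectory() as tmp:
    secret_file = Path(tmp) / "db-password"
    secret_file.write_text("v1\n", encoding="utf-8")
    first = read_secret("DB_PASSWORD", str(secret_file))
    secret_file.write_text("v2\n", encoding="utf-8")
    second = read_secret("DB_PASSWORD", str(secret_file))

print(first, second)  # v1 v2
```

Caching the value in a module-level variable would defeat the rotation benefit of the volume mount, so read it where it is used or refresh it on a short interval.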

The real config code is consolidated in the security section of the Cloud Run production-operations guide (not repeated here, for DRY).

The big picture of defense in depth

From entrance to exit, stack the layers. Defense in depth means that even if one layer is breached, the next stops it.

Internet
   │
   ▼ External ALB ── Cloud Armor (WAF / rate limiting / adaptive DDoS)   ← prevention at the entrance
   │
   ▼ Ingress = internal-and-cloud-load-balancing (blocks direct run.app access) ← path control
   │
   ▼ Cloud Run service (--no-allow-unauthenticated / IAM invoker) ← call authorization
   │      - Dedicated least-privilege SA      ← minimized privileges
   │      - Secret Manager (secrets)          ← credentials kept out of code
   │
   ▼ Direct VPC egress (private-ranges-only) ← egress control
   │
   ▼ Cloud SQL (private IP / IAM auth / mandatory TLS)  ← data protection

Production-rollout checklist

Public service is internal-and-cloud-load-balancing (block direct run.app)
Internal service is internal
Make --no-allow-unauthenticated the default, calls with IAM invoker + ID token
VPC connection is Direct VPC egress (don't use a connector)
Cloud SQL is private IP, IAM auth, mandatory TLS, with a pool against connection exhaustion
Cloud Armor (WAF, rate limiting, adaptive protection) on the public surface, preview in stg → enforce in production
A dedicated least-privilege SA per service, don't use the default SA
Secrets in Secret Manager, don't write credentials in code
Enable audit logs (who, when, what)

Conclusion: lock down the entrance and exit with least privilege

Cloud Run security isn't flashy features but the steady accumulation of locking down "entrance, exit, privilege, secrets" one by one to the minimum. Path with Ingress, calls with IAM, exit with Direct VPC egress and a private IP, the public surface with Cloud Armor, privilege and secrets with a least-privilege SA and Secret Manager — stacked as layers, if one is breached the next stops it.

From the experience of building configurations that withstand a broadcaster's internal controls, most of these can be achieved with settings and IaC without changing code. For the overall design, go to the Cloud Run production-operations guide, and for keyless CI/CD, the CI/CD guide. If you need a security audit or defense-in-depth design, I'll accompany you through to implementation.
