# Designing Defense-in-Depth with a WAF: Rolling Out AWS WAF / Cloud Armor's OWASP Countermeasures, Rate Limiting, and DDoS Mitigation to Production Without False Positives

> An implementation guide to building defense-in-depth in production with AWS WAF and Google Cloud Armor. It walks through real settings: Web ACLs and security policies, OWASP managed rules, rate limiting, DDoS and adaptive protection, and a count/preview-first rollout that avoids false positives. The premise throughout: a WAF is one layer of defense-in-depth, not a silver bullet.

- Published: 2026-06-24
- Author: 友田 陽大
- Tags: Security, AWS, GCP, WAF, Architecture Design
- URL: https://tomodahinata.com/en/blog/waf-defense-in-depth-aws-waf-cloud-armor-owasp-guide
- Category: Infrastructure, IaC & CI/CD
- Pillar guide: https://tomodahinata.com/en/blog/aws-ecs-vs-eks-startup-decision-framework

## Key points

- A WAF is just one layer — an L7 request filter — and doesn't replace authorization, input validation, or least privilege
- A WAF's biggest operational risk is not failing to block attacks but mistakenly blocking legitimate traffic
- For OWASP countermeasures, don't write regexes yourself; use AWS's managed rules or Cloud Armor's preconfigured WAF rules
- Always observe a new rule with Count (AWS) / preview (Cloud Armor) before promoting it to Block / enforce
- A safe rollout follows this order: full enablement in staging → Count/preview in production → per-rule tuning → staged enforcement → continuous observation

---

"As long as I put in a WAF, it's safe, right?" — at a project meeting, it's one of the most common and most dangerous questions.

A WAF (Web Application Firewall) is indeed powerful. SQL injection, XSS, and known attack patterns can be rejected at the entrance before the request reaches the app. But **a WAF is not a "silver bullet."** Authorization (who can do what), input validation (whether the received value is valid), and least privilege — a WAF replaces none of these. A WAF is just **one layer of defense-in-depth** — an L7 request filter.

There is a second truth that matters most in the field. **A WAF's biggest operational risk is not "failing to block attacks" but "mistakenly blocking legitimate traffic."** The payment button returns 403, an API client is rejected, a search query is misjudged as SQLi: these happen far more often than real attacks, and they are self-inflicted.

This article is an implementation guide for designing and operating defense-in-depth at **production quality** with **AWS WAF** and **Google Cloud Armor**. Throughout, I draw on design decisions from an [internal AI platform](/case-studies/broadcaster-ai-content-platform) I built for a major domestic broadcaster: **Cloud Armor (OWASP CRS 3.3 + adaptive DDoS protection + rate limiting)** at the entrance, and a practice of **fully enabling the WAF in staging to eliminate false positives before production.**

> **Ground rules for this article**: The specs, settings, and rule names are based on the **AWS / Google Cloud official documentation (as of June 2026)**. Managed-rule versions and pricing get revised, so always confirm the latest official information before going to production. The code is written to be usable in real operations (Terraform-centric), but **a WAF is one layer of defense-in-depth and doesn't replace authorization, input validation, or least privilege.** One more iron rule: **always start a new rule in Count (observe); Block (enforce) comes later.**

---

## 0. Mental model: a WAF is "an L7 request filter"

Before starting the design, let me pin down in one line what a WAF is and isn't.

> **WAF = an L7 filter that evaluates HTTP/HTTPS requests by content (SQLi/XSS/known attack patterns) and rate (abnormal frequency), and Allows / Blocks / Counts / Challenges at the entrance.**

Three consequences emerge from here.

1. **A WAF only sees "the content of the communication."** Authorization logic (may this user access this resource) is the app's job. A WAF only judges "whether a request matches an attack pattern." **A WAF can't distinguish a legitimate admin access to `/admin` from an attacker's access.**
2. **A WAF is strong against "known patterns" and weak against "logic holes."** OWASP managed rules comprehensively reject known attacks, but they pass your app-specific authorization bugs (IDOR, privilege escalation) straight through. So **input validation and least privilege are separately needed.**
3. **That's exactly why you need defense-in-depth.** WAF (entrance) + authentication / MFA + authorization + input validation (Zod, etc.) + least-privilege IAM + Secrets management + network isolation. The WAF is the **outermost layer** of this stack, and it's no reason to omit the inner layers.

On the broadcaster's platform too, Cloud Armor is just one layer at the entrance. Behind it sit **Cloud SQL (IAM auth, TLS required, private IP), Secret Manager, least-privilege service accounts, and Identity Platform (SMS MFA, reCAPTCHA Enterprise).** The WAF is only the filter at that entrance.

---

## 1. The map of the two major WAFs: AWS WAF and Cloud Armor

Different clouds mean different WAF structures and terms. First, grasp the big picture with a correspondence table.

| Concept | AWS WAF | Google Cloud Armor |
| --- | --- | --- |
| Container for rules | **Web ACL** (protection pack) | **security policy** |
| Application target | CloudFront / ALB / API Gateway / AppSync / Cognito user pool / App Runner / Verified Access / Amplify | External Application Load Balancer (backend / edge security policy) |
| OWASP countermeasures | **AWSManagedRulesCommonRuleSet** (CRS, WCU 700) | **preconfigured WAF rules** (OWASP ModSecurity CRS 3.3 / 4.22) |
| Rule actions | Allow / Block / **Count** / CAPTCHA / Challenge | allow / deny / throttle / rate_based_ban / redirect |
| Rate limiting | **rate-based rule** (`RateBasedStatement`) | **rateLimitOptions** (throttle / rate_based_ban) |
| Observe mode (surfacing false positives) | **Count action** / rule action override | **preview** flag |
| DDoS | Shield Standard (free, L3/L4) / Shield Advanced (paid, adds automatic L7 mitigation) | **Adaptive Protection** (ML-based L7 detection) |
| Logs | WAF logs → CloudWatch / S3 / Firehose | Cloud Logging |

**Which one you use is largely decided by "which cloud's load balancer the app sits behind."** Behind AWS's ALB/CloudFront, use AWS WAF; behind GCP's external Application Load Balancer, use Cloud Armor. In a multi-cloud setup, build both with **the same philosophy (OWASP managed rules + rate limiting + Count/preview first).** This article covers both.

---

## 2. AWS WAF: assembling a Web ACL and managed rules with Terraform

### 2.1 What is a Web ACL (the official definition)

AWS WAF is a web application firewall that monitors HTTP/HTTPS requests forwarded to the protected resource. The container for rules is the **Web ACL (web access control list)**, and according to the official docs it can protect the following resources.

- Amazon CloudFront distribution
- Amazon API Gateway REST API
- Application Load Balancer
- AWS AppSync GraphQL API
- Amazon Cognito user pool
- AWS App Runner service
- AWS Verified Access instance
- AWS Amplify

A Web ACL has a **default action** (what to do with requests that match no rule: `allow` or `block`), and within it you arrange **rules** in priority order. For a public-facing site, the basic form is "**allow everything except…**": default Allow, plus rules that Block attacks.

The official docs define the following four families of rule actions. Getting these straight is the starting point of a safe rollout.

| Action | Meaning |
| --- | --- |
| **Allow** | Let a matched request through |
| **Block** | Reject a matched request with 403 or a custom response |
| **Count** | **Without changing handling**, just count the number of matches (for observation / testing) |
| **CAPTCHA / Challenge** | Suppress bots with a CAPTCHA / silent challenge |

> **How the official docs explain `Count`**: "The Count action can be used to count matched requests without changing how they're handled. It can also be used to test a new rule. When you want to allow/block based on a new characteristic, **you can first count with Count, confirm the configuration is correct, and then switch to allow/block.**" — **This is the backbone of this article.**

### 2.2 Managed rules: the main force of OWASP countermeasures

Writing SQLi/XSS regexes yourself is a foolish move (a DRY violation, and unmaintainable). AWS provides **managed rule groups**, and the main force is the following three **baseline** ones.

| Rule group | VendorName / Name | WCU | Purpose |
| --- | --- | --- | --- |
| Core rule set (CRS) | `AWS` / `AWSManagedRulesCommonRuleSet` | 700 | Broad vulnerability countermeasures including OWASP Top 10 (**recommended for all WAF use cases**) |
| Known bad inputs | `AWS` / `AWSManagedRulesKnownBadInputsRuleSet` | 200 | Block known malicious patterns like Log4j (CVE-2021-44228 and others) |
| Admin protection | `AWS` / `AWSManagedRulesAdminProtectionRuleSet` | 100 | Block external access to exposed admin-panel paths |

The CRS contains rules such as the following (a partial list). All default to `Block` and attach labels (`awswaf:managed:aws:core-rule-set:*`).

- `NoUserAgent_HEADER` — block a missing `User-Agent` header
- `SizeRestrictions_BODY` — block a request body over 8KB (8,192 bytes)
- `SizeRestrictions_QUERYSTRING` — block a query string over 2,048 bytes
- `CrossSiteScripting_BODY` / `_QUERYARGUMENTS` / `_COOKIE` / `_URIPATH` — XSS patterns
- `GenericLFI_*` / `GenericRFI_*` — local/remote file inclusion (`../../` etc.)
- `EC2MetaDataSSRF_*` — EC2 metadata theft (SSRF)

Known bad inputs includes `Log4JRCE_HEADER/BODY/URIPATH/QUERYSTRING` (detects `${jndi:ldap://…}`), `JavaDeserializationRCE_*`, and so on. **These are textbook examples of "rules you shouldn't write yourself,"** and AWS maintains them under version control.

### 2.3 Terraform: start with Count, then move to Block

This is the core of this article. **Don't put managed rules into production with Block right away.** `SizeRestrictions_BODY` rejects bodies over 8KB unconditionally, so your file-upload API or large JSON payloads might start returning 403 from day one.

So **first apply it to production traffic with Count, and observe "what gets caught" with sampled requests and CloudWatch metrics.**

#### Step 1: observe managed rules with Count

```hcl
resource "aws_wafv2_web_acl" "app" {
  name        = "app-web-acl"
  description = "Defense-in-depth WAF for the app (count-first rollout)"
  scope       = "REGIONAL" # For ALB / API Gateway. Use "CLOUDFRONT" for CloudFront

  # Default is Allow: the basic form for a public-facing site (block only what rules match)
  default_action {
    allow {}
  }

  # --- OWASP Core Rule Set ---
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # The group-level override is none (keep each rule's own action),
    # while individual rules are overridden to Count to surface false positives
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"

        # Drop only the false-positive-prone rules to Count and watch them
        rule_action_override {
          name = "SizeRestrictions_BODY" # Bodies over 8KB; often trips on upload APIs
          action_to_use {
            count {}
          }
        }
        rule_action_override {
          name = "CrossSiteScripting_BODY" # Often trips on rich-text editors and similar
          action_to_use {
            count {}
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true # Always enable sampled requests
    }
  }

  # --- Known bad inputs (Log4j etc.; rarely false-positives, so it leans toward enabled from the start) ---
  rule {
    name     = "AWSManagedRulesKnownBadInputsRuleSet"
    priority = 20
    override_action {
      none {}
    }
    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesKnownBadInputsRuleSet"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "KnownBadInputs"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAcl"
    sampled_requests_enabled   = true
  }

  tags = { Environment = "staging" }
}
```

The key is `rule_action_override` (the API's `RuleActionOverrides`). **While keeping the rule group as a whole enabled, you can override to Count only the individual rules you suspect of false positives.** Observe that rule's match count in CloudWatch for a few days, and once you've confirmed that legitimate traffic isn't getting caught, remove its `rule_action_override` block so the rule returns to Block (its original action).

> **The difference between `override_action` and `rule_action_override` (easy to confuse)**: `override_action { none {} }` means "use the rule group's actions as-is (don't override)," and `override_action { count {} }` means "**override all rules in the group to Count.**" `rule_action_override`, by contrast, overrides **a single rule.** A two-tier approach is safe: first observe the whole group with `override_action { count {} }`, then, once you know its behavior, switch to `none {}` and keep only the risky rules on Count with `rule_action_override`.
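
As a minimal sketch of the first tier (observing the whole group), the rule could look like this. The resource name `app_observe` and the metric names are illustrative assumptions, not values from the broadcaster's project.

```hcl
# Sketch: observe the entire Core Rule Set in Count mode.
# Resource and metric names are illustrative.
resource "aws_wafv2_web_acl" "app_observe" {
  name  = "app-web-acl-observe"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # count {} at the group level overrides every rule in the group to Count,
    # so nothing in the group blocks while you observe.
    override_action {
      count {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSetObserve"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAclObserve"
    sampled_requests_enabled   = true
  }
}
```

Once the Count metrics look clean, switching `count {}` back to `none {}` and adding `rule_action_override` blocks for the few risky rules gives the configuration shown in Step 1.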

#### Step 2: add rate limiting (rate-based rule)

OWASP rules see "content," but brute force and scraping are a problem of "frequency." Control frequency with a **rate-based rule.** The official high-level settings are as follows.

- **Evaluation window**: choose from `60` / `120` / `300` / `600` seconds. **The default is 300 (5 minutes).** "How many seconds back from the current time to count."
- **Rate limit**: the upper limit of requests within that window. **The minimum is 10.** Exceed it and the rule action is applied to subsequent matched requests.
- **Aggregation** (aggregation key): `IP` (source IP) / `FORWARDED_IP` (the first IP in a header such as `X-Forwarded-For`) / `CONSTANT` (all matching requests are counted together in one bucket) / `CUSTOM_KEYS`.
- **Action**: any action other than `Allow` (Block / Count / CAPTCHA / Challenge).

```hcl
  # A rule block to add inside the web_acl resource
  rule {
    name     = "RateLimitPerIP"
    priority = 5 # Low number so it is evaluated before the managed rules

    # ★ Start with Count; switch to block after confirming the threshold is reasonable
    action {
      count {}
    }

    statement {
      rate_based_statement {
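        limit                 = 2000 # 2000 requests per IP per 5-minute window; the minimum is 10
        aggregate_key_type    = "IP"
        evaluation_window_sec = 300 # 60/120/300/600; the default is 300

        # scope-down: optionally limit the rule to a heavy search endpoint
        scope_down_statement {
          byte_match_statement {
            # A nested block is not allowed in HCL's one-line block form, so expand it
            field_to_match {
              uri_path {}
            }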
            positional_constraint = "STARTS_WITH"
            search_string         = "/api/search"
            text_transformation {
              priority = 0
              type     = "NONE"
            }
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitPerIP"
      sampled_requests_enabled   = true
    }
  }
```

**Decide the threshold (`limit`) after observing, not before.** Measure legitimate users' normal peak requests per IP per 5 minutes from the WAF logs (CloudWatch metrics only show aggregate counts), and use **2–3×** that value as the initial threshold. If you set it to Block from the start, you will catch legitimate users wherever many of them share one source IP (office NAT, mobile-carrier CGNAT, or a CDN/proxy in front). So start with `action { count {} }`, confirm zero false positives, then change it to `block {}`.
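
The arithmetic can live in Terraform so that the observed number stays next to the threshold it justifies. This is only a sketch: the peak value `700` and the factor `2.5` are made-up placeholders for whatever your own logs show, and the resource and metric names are hypothetical.

```hcl
locals {
  # Placeholder: peak requests per IP per 5 minutes seen in WAF logs during normal operation.
  observed_peak_per_ip_5min = 700

  # Placeholder headroom within the 2-3x range discussed above.
  safety_factor = 2.5

  # AWS WAF does not accept limits below 10, so clamp the result.
  rate_limit = max(10, ceil(local.observed_peak_per_ip_5min * local.safety_factor))
}

resource "aws_wafv2_web_acl" "rate_limit_example" {
  name  = "app-web-acl-rate-limit-example"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "RateLimitPerIP"
    priority = 5

    # Observe first; change to block {} only after the Count data looks clean.
    action {
      count {}
    }

    statement {
      rate_based_statement {
        limit                 = local.rate_limit
        aggregate_key_type    = "IP"
        evaluation_window_sec = 300
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitPerIP"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "rateLimitExample"
    sampled_requests_enabled   = true
  }
}
```

With these placeholder values the limit works out to 1750 requests per IP per 5 minutes. The `max(10, ...)` guard only matters for very low-traffic endpoints.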

> **The effect of scope-down (SRP)**: applying rate limiting **not to all paths but only to heavy endpoints (search, login, generation APIs)** avoids dragging in static assets and light GETs. "One rule handles only one concern" — `scope_down_statement` realizes that.

#### Step 3: switch to Block once observation is done

Observe with Count for a few days to two weeks, and once you've confirmed that the sampled requests contain no legitimate traffic, change `count {}` to `block {}` (for a managed rule, remove the `rule_action_override` instead).

```hcl
  # Change the RateLimitPerIP action from Count to Block
  rule {
    name     = "RateLimitPerIP"
    priority = 5
    action {
      block {} # Observation done → enforce
    }
    # statement / visibility_config stay the same
  }
```

This is the safe rollout of **Count (observe) → Block (enforce).** The diff is a single line, but that line should always be preceded by the evidence gathered during observation (verification first).
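
If you prefer the promotion to be a reviewed variable change rather than an edit to the rule body, one possible pattern is a boolean that selects the action through `dynamic` blocks. The variable name `rate_limit_enforce` and the resource name are my own illustrative choices, and the limit is the same placeholder as above.

```hcl
variable "rate_limit_enforce" {
  type        = bool
  default     = false
  description = "false = Count (observe), true = Block (enforce)"
}

resource "aws_wafv2_web_acl" "app_toggle" {
  name  = "app-web-acl-toggle"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "RateLimitPerIP"
    priority = 5

    # Exactly one of the two action blocks is generated, depending on the variable.
    action {
      dynamic "count" {
        for_each = var.rate_limit_enforce ? [] : [1]
        content {}
      }
      dynamic "block" {
        for_each = var.rate_limit_enforce ? [1] : []
        content {}
      }
    }

    statement {
      rate_based_statement {
        limit                 = 2000
        aggregate_key_type    = "IP"
        evaluation_window_sec = 300
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitPerIP"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAclToggle"
    sampled_requests_enabled   = true
  }
}
```

Staging and production can then share one module and differ only in this variable, which keeps the Count → Block step visible in code review.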

---

## 3. Google Cloud Armor: security policy and preconfigured WAF rules

### 3.1 What is a security policy

Cloud Armor's **security policy** is a configuration unit that protects an app under a load balancer from DDoS and web attacks. It has prioritized **rules** (match condition + action).

- **backend security policy**: a policy tied to a backend service (OWASP rules, rate limiting, Adaptive Protection go here).
- **edge security policy**: a policy that filters at the edge in front of the CDN cache.
- The application target is an **external Application Load Balancer** (including classic).

A rule's **action** is `allow` / `deny(403|404|502)` / `throttle` / `rate_based_ban` / `redirect`. The most important setting of all is the **`preview` flag.**

> **`preview` = Cloud Armor's version of Count.** Create a rule with `--preview` and **the match is recorded in logs but isn't actually blocked.** With the same philosophy as AWS's Count, you can **observe "what matches" before enforcing it in production.**

### 3.2 preconfigured WAF rules (OWASP CRS)

Cloud Armor provides the OWASP ModSecurity Core Rule Set as **predefined rules.** The available CRS versions are **4.22 / 3.3 / 3.0** (3.0 is deprecated). In the broadcaster's project, I adopted **CRS 3.3.** Rules are referenced with the `evaluatePreconfiguredWaf()` expression (representative examples of the CRS 3.3 family).

| Predefined rule | What it defends against |
| --- | --- |
| `sqli-v33-stable` | SQL injection |
| `xss-v33-stable` | Cross-site scripting |
| `lfi-v33-stable` | Local file inclusion |
| `rfi-v33-stable` | Remote file inclusion |
| `rce-v33-stable` | Remote code execution |
| `scannerdetection-v33-stable` | Scanner detection |
| `protocolattack-v33-stable` | Protocol attacks |
| `sessionfixation-v33-stable` | Session fixation |

**The sensitivity level is 0–4.** Officially, "by default Cloud Armor operates at **sensitivity level 4**, and after enablement evaluates all signatures in the rule set." The higher the sensitivity, the wider the detection, but **the more false positives** — that's exactly why preview is needed.

### 3.3 gcloud / Terraform: start with preview, then enforce

#### Add OWASP rules with preview (observe)

```bash
# Create the security policy
gcloud compute security-policies create app-armor-policy \
  --description "Defense-in-depth WAF (preview-first rollout)"

# Add the OWASP SQLi rule in *preview* (matches are logged but not blocked)
gcloud compute security-policies rules create 1000 \
  --security-policy app-armor-policy \
  --description "OWASP SQLi (CRS 3.3) - PREVIEW" \
  --expression "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1})" \
  --action deny-403 \
  --preview   # ★ The key: preview = observe only, no blocking

# Add XSS in preview the same way
gcloud compute security-policies rules create 1010 \
  --security-policy app-armor-policy \
  --expression "evaluatePreconfiguredWaf('xss-v33-stable', {'sensitivity': 1})" \
  --action deny-403 \
  --preview
```

Note `{'sensitivity': 1}`. **Rather than starting at sensitivity 4 (all signatures), start low (1), observe false positives with preview, and raise it only if needed.** That is the safe side. In Cloud Logging, inspect the requests whose log entry has a `previewSecurityPolicy` field (for example `jsonPayload.previewSecurityPolicy.name`), and once you've confirmed that no legitimate traffic is included, remove preview.

```bash
# Observation done → remove preview and switch to enforce
gcloud compute security-policies rules update 1000 \
  --security-policy app-armor-policy \
  --no-preview
```

#### The same philosophy in Terraform too (preview = true → false)

```hcl
resource "google_compute_security_policy" "app" {
  name        = "app-armor-policy"
  description = "Defense-in-depth WAF (preview-first)"

  # OWASP SQLi (CRS 3.3): observe with preview first
  rule {
    action   = "deny(403)"
    priority = 1000
    preview  = true # ★ Observe mode: surface false positives before enforcing
    match {
      expr {
        expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1})"
      }
    }
    description = "OWASP CRS 3.3 SQLi (preview)"
  }

  # Default rule: allow anything unmatched (the priority is fixed at the maximum value)
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
```

### 3.4 Rate limiting (throttle / rate_based_ban)

Cloud Armor's rate limiting has two actions.

- **throttle**: let through up to the threshold, and reject the excess with `exceed_action` (`deny(429)` etc.).
- **rate_based_ban**: **ban for a fixed period** a client that exceeds the threshold.

```hcl
  # Rate-limit rule (per IP: more than 100 requests per minute leads to a 10-minute ban)
  rule {
    action   = "rate_based_ban"
    priority = 900
    preview  = true # ★ Confirm the threshold is reasonable before enforcing
    match {
      versioned_expr = "SRC_IPS_V1"
      config { src_ip_ranges = ["*"] }
    }
    rate_limit_options {
      enforce_on_key = "IP" # ALL / IP / HTTP_HEADER / HTTP_COOKIE / XFF_IP / ...
      conform_action = "allow"
      exceed_action  = "deny(429)"
      rate_limit_threshold {
        count        = 100
        interval_sec = 60 # one of 10/30/60/120/.../3600
      }
      ban_duration_sec = 600 # ban clients that exceed the limit for 600 seconds
    }
    description = "per-IP rate limit (preview)"
  }
```

Here too, **decide the threshold by observation.** Whether `enforce_on_key` should be `XFF_IP` (the first IP in `X-Forwarded-For`) or `HTTP_COOKIE` (per session) rather than `IP` is a call to make **after seeing what your traffic actually looks like in Cloud Logging**, so that you don't catch legitimate users behind a shared NAT.
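
For comparison with the `rate_based_ban` rule above, here is a sketch of the `throttle` variant keyed on `XFF_IP`. The policy name, priority, and threshold are illustrative assumptions, and keying on `X-Forwarded-For` only makes sense when a proxy you trust sets that header.

```hcl
# Sketch: throttle (no ban) keyed on the first X-Forwarded-For address.
# Names and numbers are illustrative.
resource "google_compute_security_policy" "throttle_example" {
  name        = "app-armor-throttle-example"
  description = "Throttle sketch (preview-first)"

  rule {
    action   = "throttle"
    priority = 910
    preview  = true # observe first, enforce later
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      enforce_on_key = "XFF_IP"
      conform_action = "allow"
      exceed_action  = "deny(429)" # only the excess over the threshold is rejected
      rate_limit_threshold {
        count        = 100
        interval_sec = 60
      }
    }
    description = "per-client throttle (preview)"
  }

  # Default rule: allow anything unmatched.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
```

Unlike `rate_based_ban`, there is no `ban_duration_sec` here: a client that slows down is served again immediately, which tends to be the gentler choice while you are still unsure about the threshold.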

### 3.5 Adaptive Protection (adaptive defense for L7 DDoS)

**Adaptive Protection** is a Cloud Armor feature that protects the app from **L7 DDoS** with machine learning. It can be **enabled per security policy**, learns the baseline of normal-time traffic (officially, baseline established with **at least 1 hour** of learning), and when it detects an anomaly generates an alert including the following.

- **confidence score** (0–1 confidence)
- **attack signature** (a description of the attack traffic's characteristics)
- **suggested rule** (a Cloud Armor WAF rule proposal you can deploy as-is)
- **alert ID**

```bash
# Enable Adaptive Protection (L7 DDoS defense) on the security policy
gcloud compute security-policies update app-armor-policy \
  --enable-layer7-ddos-defense
```

What matters is that **even a "suggested rule" from Adaptive Protection goes in with preview first, not straight to enforce.** Even an ML proposal passes through a human confirmation gate. In the broadcaster's project, these three tiers (adaptive protection + fixed rate limiting + OWASP CRS) caught anomalies at the entrance in multiple layers.
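
The same setting can be expressed in Terraform. A minimal sketch, assuming a standalone policy with a hypothetical name; in practice the block would go into the policy you already manage.

```hcl
# Sketch: Adaptive Protection (L7 DDoS defense) enabled on a security policy.
# The policy name is illustrative.
resource "google_compute_security_policy" "adaptive_example" {
  name        = "app-armor-adaptive-example"
  description = "Adaptive Protection sketch"

  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable = true
    }
  }

  # Default rule: allow anything unmatched.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
```

Enabling the feature only turns on detection and alerting. Any suggested rule still has to be added as its own `rule` block, ideally with `preview = true` first.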

> **The AWS-side counterpart**: L3/L4 DDoS is constantly protected for free by **Shield Standard.** If you need advanced automatic L7 mitigation or specialist-team (SRT) support, **Shield Advanced** (paid). The closest to Cloud Armor's Adaptive Protection is Shield Advanced's automatic application-layer mitigation. The structure of "always-free basic protection + paid advanced protection if needed" is common to both clouds.

---

## 4. A safe rollout: turning Count/preview into an operating procedure

Now that the technical elements are in place, let me summarize **the procedure that keeps false positives out of production** as a table of stages. It generalizes the steps I actually followed in the broadcaster's project.

| Stage | AWS WAF | Cloud Armor | This stage's goal |
| --- | --- | --- | --- |
| ① staging full enablement | Tie the Web ACL to stg's ALB, all rules Block | Tie the security policy to stg's LB, all rules enforce | **Crush config mistakes, syntax errors, and obvious false positives before production** |
| ② production Count/preview | All managed/custom rules `count {}` / `override_action { count {} }` | All rules `preview = true` | **Observe who gets caught on production traffic** |
| ③ individual tuning | Keep a false-positive rule on Count with `rule_action_override` or exclude it | Exclude the false-positive signature with `opt_out_rule_ids`, adjust sensitivity | **Make dragging-in of legitimate traffic zero** |
| ④ staged enforce | Remove `count {}` from rules you've safely confirmed | To `--no-preview` / `preview = false` | **Promote only verified rules to enforce** |
| ⑤ continuous observation | Constantly monitor CloudWatch metrics + sampling + WAF logs | Constantly monitor deny/throttle in Cloud Logging | **Early discovery of new false positives / new attacks** |

**Stage ①, "staging full enablement," is where the payoff is.** In the broadcaster's project, I **fully enabled the WAF in stg to surface false positives and config mistakes before production.** In stg, blocking a legitimate request does little real harm, and you can find in advance the legitimate requests that resemble attack patterns (e.g. an admin's legitimate operation, a large JSON body). Only rules that were "verified in stg and re-confirmed with production Count" get promoted to enforcement in production.
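
For the AWS side of stage ①, attaching the Web ACL to the staging load balancer is a single resource. A sketch, assuming the `aws_wafv2_web_acl.app` resource from section 2.3 and a hypothetical input variable for the ALB ARN:

```hcl
variable "staging_alb_arn" {
  type        = string
  description = "ARN of the staging Application Load Balancer (hypothetical input)"
}

# Associates the regional Web ACL with the staging ALB.
resource "aws_wafv2_web_acl_association" "staging" {
  resource_arn = var.staging_alb_arn
  web_acl_arn  = aws_wafv2_web_acl.app.arn
}
```

This association resource applies to regional resources such as an ALB. A CloudFront distribution is attached differently, through the distribution's own `web_acl_id` argument.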

> **The official warning (AWS)**: "Before deploying to production traffic, **test and tune it in a staging or test environment**, then test it in **count mode** against production traffic before enabling it." — The official docs make the two stages (staging → production count) explicit. My operation is faithful to this.

### Dealing with false positives: exclusion and scope-down

When observation finds false positives, there are two moves.

1. **Exclusion (opt-out)**: disable only that rule. For AWS, keep the rule on Count with `rule_action_override`; for Cloud Armor, exclude the specific signature with `evaluatePreconfiguredWaf(..., {'opt_out_rule_ids': ['...']})`. The iron rule is to **remove only the one rule that false-positived, not the whole rule group** (minimal disabling).
2. **Scope-down**: **narrow where the rule applies.** Exempt only the path that false-positives, as in "exempt only `/api/upload` from `SizeRestrictions_BODY`." On AWS, note that a `scope_down_statement` on a managed rule group narrows the whole group; to exempt a path from a single rule, set that rule to Count and match its label in a follow-up rule. On Cloud Armor, add the path condition to the rule's match expression (CEL).
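
On the Cloud Armor side, an opt-out could be sketched as follows. The signature ID `owasp-crs-v030301-id942100-sqli` is only an illustration of the ID format; the ID to exclude is the one your own preview logs show matching legitimate traffic. The policy name is hypothetical.

```hcl
# Sketch: keep the SQLi rule set but opt out of one signature.
# The signature ID below is an example; take the real one from your logs.
resource "google_compute_security_policy" "opt_out_example" {
  name        = "app-armor-opt-out-example"
  description = "opt_out_rule_ids sketch"

  rule {
    action   = "deny(403)"
    priority = 1000
    preview  = true # re-observe after every tuning change
    match {
      expr {
        expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1, 'opt_out_rule_ids': ['owasp-crs-v030301-id942100-sqli']})"
      }
    }
    description = "OWASP CRS 3.3 SQLi minus one signature (preview)"
  }

  # Default rule: allow anything unmatched.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
```

Every other signature in the rule set keeps evaluating, so the hole you open is exactly one signature wide.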

**"Cut the WAF because it false-positived" is the worst move.** Remove it at the minimum unit of one rule / one path, minimizing the defense hole.
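
On AWS, one way to get the same "one rule, one path" granularity is to set the rule to Count and then block on its label everywhere except the exempt path. The sketch below assumes the label is `awswaf:managed:aws:core-rule-set:SizeRestrictions_Body` (check the rule group documentation for the exact name); the resource, rule, and metric names are illustrative.

```hcl
# Sketch: exempt /api/upload from SizeRestrictions_BODY only.
resource "aws_wafv2_web_acl" "upload_exempt" {
  name  = "app-web-acl-upload-exempt"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"

        # Count instead of Block; the rule still attaches its label.
        rule_action_override {
          name = "SizeRestrictions_BODY"
          action_to_use {
            count {}
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true
    }
  }

  # Evaluated after the group: block oversized bodies everywhere except /api/upload.
  rule {
    name     = "BlockOversizedBodyExceptUpload"
    priority = 11

    action {
      block {}
    }

    statement {
      and_statement {
        statement {
          label_match_statement {
            scope = "LABEL"
            # Assumed label name; confirm it before relying on this rule.
            key = "awswaf:managed:aws:core-rule-set:SizeRestrictions_Body"
          }
        }
        statement {
          not_statement {
            statement {
              byte_match_statement {
                field_to_match {
                  uri_path {}
                }
                positional_constraint = "STARTS_WITH"
                search_string         = "/api/upload"
                text_transformation {
                  priority = 0
                  type     = "NONE"
                }
              }
            }
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "BlockOversizedBodyExceptUpload"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAclUploadExempt"
    sampled_requests_enabled   = true
  }
}
```

The upload path still passes through every other CRS rule, so the exemption stays as narrow as the false positive that motivated it.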

---

## 5. Production operation: version management, logs, cost, don't overtrust

### 5.1 Version management of managed rules

AWS's managed rule groups are **versioned** (you can list the available versions with `ListAvailableManagedRuleGroupVersions`, and AWS publishes a changelog). **If you don't pin a version and simply follow the default**, an AWS-side update can change behavior overnight and increase false positives. In production, **pin the version explicitly**, raise it deliberately after reading the changelog, and go through Count → Block again when you do. The same applies to Cloud Armor's CRS: **make the version explicit in the expression, like `v33` / `v422`** (don't implicitly follow the latest). This is the ETC (Easy To Change) principle: confine change to a predictable unit.
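
In Terraform, pinning is one argument on the managed rule group statement. A sketch, with the version passed in as a variable so that no particular version string is implied here; the variable and resource names are hypothetical.

```hcl
variable "crs_version" {
  type        = string
  description = "CRS version name to pin, copied from the rule group's published version list"
}

resource "aws_wafv2_web_acl" "app_pinned" {
  name  = "app-web-acl-pinned"
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # Whenever the pinned version is raised, go back to Count and re-observe.
    override_action {
      count {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"
        version     = var.crs_version
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSetPinned"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAclPinned"
    sampled_requests_enabled   = true
  }
}
```

A version bump then shows up as a one-line diff in review, which is the "predictable unit of change" this section argues for.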

### 5.2 Logs and observability: connect WAF logs to analysis

You can only operate a WAF if **what it blocked and counted ends up in the logs.**

- **AWS WAF logs**: output to CloudWatch Logs / S3 / Amazon Data Firehose (formerly Kinesis Data Firehose). **Always enable** sampled requests (`sampled_requests_enabled = true`) and CloudWatch metrics (`cloudwatch_metrics_enabled = true`). Use labels (`awswaf:managed:aws:core-rule-set:*`) to trace which rule fired.
- **Cloud Armor**: output to Cloud Logging. The `enforcedSecurityPolicy` and **`previewSecurityPolicy`** fields of each request log distinguish "actually blocked by an enforced rule" from "would have matched a preview rule."
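
For AWS, full WAF logging to CloudWatch Logs can be sketched like this. It assumes the `aws_wafv2_web_acl.app` resource from section 2.3; the log group name suffix and the retention period are illustrative.

```hcl
# AWS WAF only accepts log destinations whose name starts with "aws-waf-logs-".
resource "aws_cloudwatch_log_group" "waf" {
  name              = "aws-waf-logs-app"
  retention_in_days = 30 # illustrative retention
}

# Sends the full request logs of the Web ACL to the log group above.
resource "aws_wafv2_web_acl_logging_configuration" "app" {
  resource_arn            = aws_wafv2_web_acl.app.arn
  log_destination_configs = [aws_cloudwatch_log_group.waf.arn]
}
```

Depending on how your account is set up, log delivery may also need a CloudWatch Logs resource policy. I've left that out here, so treat it as something to verify in your environment.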

**The quality of observation decides the safety of the rollout.** If you run Count with sampling off, you can't see what matched, so you can't decide whether to move to Block.
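
On the Cloud Armor side, one way to make preview matches visible on a dashboard is a log-based metric. This is a sketch under assumptions: the metric name is hypothetical, and the filter assumes the load balancer's request logs carry `jsonPayload.previewSecurityPolicy.name` for requests that matched a preview rule of the `app-armor-policy` policy.

```hcl
# Sketch: count requests that matched a preview rule of the policy.
resource "google_logging_metric" "armor_preview_matches" {
  name = "armor_preview_matches"

  filter = <<-EOT
    resource.type="http_load_balancer"
    AND jsonPayload.previewSecurityPolicy.name="app-armor-policy"
  EOT

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}
```

A flat line on this metric over the observation window is the kind of evidence that justifies flipping `preview` to `false`.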

### 5.3 Cost

A WAF isn't free. AWS WAF is billed by **the number of Web ACLs + the number of rules + the number of processed requests**, and advanced managed rule groups (Bot Control / ATP / ACFP) incur additional charges. On Cloud Armor as well, some features, such as full Adaptive Protection alerts, require the paid Enterprise tier.

- **Be conscious of WCU (Web ACL Capacity Unit)**: an AWS Web ACL has a WCU upper limit. CRS alone consumes 700 WCU, so **piling on too many managed rules hits the limit.** Put in only the rules you need (YAGNI).
- **Weigh rate-based / advanced bot countermeasures by effect and cost**: Bot Control and ATP are powerful but incur additional billing. The cost-efficient order is to **first harden the range protectable by the basics (CRS + Known bad inputs + rate-based), and add advanced features only for the shortfall.**

### 5.4 Don't overtrust the WAF — re-confirming defense-in-depth

Finally, back to the first line. **A WAF is one layer.** Attacks that slip past the WAF definitely exist (logic-origin authorization bugs, zero-days, abuse that looks legitimate). So don't omit the inner layers.

- **Authentication / MFA**: suppress bots and credential stuffing with Identity Platform / Cognito MFA and reCAPTCHA Enterprise.
- **Authorization**: enforce "who can access what" at the app / DB (RLS etc.). A WAF doesn't replace this.
- **Input validation**: schema-validate at the boundary with Zod etc. A WAF's OWASP rules are for known patterns, not app-specific validity validation.
- **Least privilege**: keep service accounts / IAM at least privilege. Even if breached, localize the damage.
- **Secrets management**: Secret Manager / environment variables. Mandatory regardless of the WAF.
- **Data layer**: TLS required, private IP, IAM auth (the broadcaster project's Cloud SQL configuration).

What Cloud Armor handled on the broadcaster's platform is this **outermost layer.** It's precisely because the inside (authentication, authorization, validation, least privilege, encryption) is in place that the WAF has meaning. With the order reversed — "I put in a WAF, so the inside can be sloppy" — it's a castle with only a fine gate and no walls.

---

## 6. Summary: a WAF defense-in-depth cheat sheet

A quick-reference list for when you're unsure.

- **Which WAF to use**: under AWS's ALB/CloudFront → **AWS WAF.** Under GCP's external Application Load Balancer → **Cloud Armor.** Multi-cloud, both with the same philosophy.
- **OWASP countermeasures**: AWS is **`AWSManagedRulesCommonRuleSet` (CRS, WCU 700) + `AWSManagedRulesKnownBadInputsRuleSet`.** Cloud Armor is **preconfigured WAF rules like `evaluatePreconfiguredWaf('sqli-v33-stable' ...)`.** Don't write regexes yourself.
- **Rate limiting**: AWS is a **rate-based rule** (window 60/120/300/600, default 300, min limit 10, `IP`/`FORWARDED_IP` aggregation). Cloud Armor is **throttle / rate_based_ban** (`enforce_on_key`, `ban_duration_sec`). Decide the threshold **after observing.**
- **L7 DDoS**: AWS is Shield Standard (free) / Advanced (paid). Cloud Armor is **Adaptive Protection** (ML, per-policy, at least 1 hour of learning, confidence score / suggested rule). Even a suggested rule starts from preview.
- **Safe rollout (most important)**: **① staging full enablement → ② production Count/preview → ③ individual tuning (exclusion, scope-down) → ④ staged enforce → ⑤ continuous observation.** Always start a new rule **from Count (AWS) / preview (Cloud Armor).**
- **Operation**: **fix the version** of managed rules, sampling & metrics always ON, exclude false positives **at the minimum unit of one rule / one path**, be conscious of WCU and cost.
- **Defense-in-depth**: a WAF is **one layer.** Don't omit authentication, MFA, authorization, input validation, least privilege, Secrets, and the data layer.

A WAF is not a device that makes you safe just by installing it. It is **the design of settings and rollout that rejects only attacks without catching legitimate traffic.** The biggest risk isn't attacks but your own false positives, which is exactly why Count/preview sits at the center of operations.

I built an internal AI platform for a major domestic broadcaster with **100% Terraform (about 71 modules)** and placed **Cloud Armor (OWASP CRS 3.3 + adaptive DDoS protection + rate limiting)** at the entrance. And with an operation of **fully enabling the WAF in stg and surfacing false positives and config mistakes before putting it into production**, I promoted defense layers without dragging in legitimate traffic. The WAF is "the filter at the entrance" there, and inside it I layered Cloud SQL with IAM auth, TLS required, and private IP, Secret Manager, least-privilege service accounts, and SMS MFA + reCAPTCHA Enterprise — making it function as **one layer of defense-in-depth.**

**"How do I design a WAF for my app, and how do I put it into production without false positives?" From that design through Terraform implementation, Count/preview rollout, and operations design, I can support you quickly and safely as a solo engineer working with generative AI (Claude Code).** Feel free to reach out even at the requirements-gathering stage.

---

### Reference (official documentation)

- [What are AWS WAF, AWS Shield, and AWS Firewall Manager?](https://docs.aws.amazon.com/waf/latest/developerguide/what-is-aws-waf.html) — Web ACL, rule actions (Allow/Block/Count/CAPTCHA), protected resources
- [AWS Managed Rules rule groups list](https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-list.html) — the list of managed rule groups
- [Baseline rule groups (Core rule set and others)](https://docs.aws.amazon.com/waf/latest/developerguide/aws-managed-rule-groups-baseline.html) — the contents of `AWSManagedRulesCommonRuleSet` (WCU 700) / `AWSManagedRulesKnownBadInputsRuleSet`
- [Using rate-based rule statements in AWS WAF](https://docs.aws.amazon.com/waf/latest/developerguide/waf-rule-statement-type-rate-based.html) — `RateBasedStatement`, evaluation window, aggregation key
- [Testing and tuning your AWS WAF protections](https://docs.aws.amazon.com/waf/latest/developerguide/web-acl-testing.html) — the safe rollout of staging → production count mode
- [Google Cloud Armor overview](https://cloud.google.com/armor/docs/cloud-armor-overview) — security policy, preview, Adaptive Protection
- [Cloud Armor preconfigured WAF rules (OWASP CRS)](https://cloud.google.com/armor/docs/waf-rules) — `evaluatePreconfiguredWaf('sqli-v33-stable' ...)`, sensitivity levels 0–4
- [Cloud Armor rate limiting overview](https://cloud.google.com/armor/docs/rate-limiting-overview) — throttle / rate_based_ban, `enforce_on_key`, `ban_duration_sec`
- [Cloud Armor Adaptive Protection overview](https://cloud.google.com/armor/docs/adaptive-protection-overview) — ML-based L7 DDoS detection, confidence score, suggested rule
