友田 陽大

Designing Defense-in-Depth with a WAF: Rolling Out AWS WAF / Cloud Armor's OWASP Countermeasures, Rate Limiting, and DDoS Mitigation to Production Without False Positives

An implementation guide to building defense-in-depth in production with AWS WAF and Google Cloud Armor. It walks through real settings for Web ACLs and security policies, OWASP managed rules, rate limiting, and DDoS / adaptive protection, and through a rollout procedure that uses Count mode and staging to reach production without false positives. The premise throughout: a WAF is one layer of defense-in-depth, not a silver bullet.


"As long as I put in a WAF, it's safe, right?" — at a project meeting, it's one of the most common and most dangerous questions.

A WAF (Web Application Firewall) is indeed powerful. SQL injection, XSS, and known attack patterns can be rejected at the entrance before the request reaches the app. But a WAF is not a "silver bullet." Authorization (who can do what), input validation (whether the received value is valid), and least privilege — a WAF replaces none of these. A WAF is just one layer of defense-in-depth — an L7 request filter.

A second point matters even more in day-to-day operation. A WAF's biggest operational risk is not "failing to block attacks" but "mistakenly blocking legitimate traffic." The payment button returns 403, an API client is rejected, a search query is misjudged as SQLi. These incidents happen far more often than attacks, and you cause them yourself.

This article is an implementation guide for designing and operating defense-in-depth at production quality with AWS WAF and Google Cloud Armor. As a running example, I draw on design decisions from an internal AI platform I built for a major domestic broadcaster: Cloud Armor (OWASP CRS 3.3 + adaptive DDoS protection + rate limiting) sits at the entrance, and the WAF was fully enabled in staging first so that false positives were eliminated before production.

The ground rules of this article: specs, settings, and rule names are based on the official AWS and Google Cloud documentation (as of June 2026). Managed-rule versions and pricing get revised, so always confirm the latest official information before going to production. The code is written to be usable in real operations (Terraform-centric), but a WAF is one layer of defense-in-depth and doesn't replace authorization, input validation, or least privilege. One more iron rule: always start a new rule in Count (observe) and move to Block (enforce) only afterward.


0. Mental model: a WAF is "an L7 request filter"

Before starting the design, let me pin down in one line what a WAF is and isn't.

WAF = an L7 filter that evaluates HTTP/HTTPS requests by content (SQLi/XSS/known attack patterns) and rate (abnormal frequency), and Allows / Blocks / Counts / Challenges at the entrance.

Three consequences emerge from here.

  1. A WAF only sees "the content of the communication." Authorization logic (may this user access this resource) is the app's job. A WAF only judges "whether a request matches an attack pattern." A WAF can't distinguish a legitimate admin access to /admin from an attacker's access.
  2. A WAF is strong against "known patterns" and weak against "logic holes." OWASP managed rules comprehensively reject known attacks, but they pass your app-specific authorization bugs (IDOR, privilege escalation) straight through. So input validation and least privilege are separately needed.
  3. That is exactly why you need defense-in-depth: WAF (entrance) + authentication / MFA + authorization + input validation (Zod, etc.) + least-privilege IAM + secrets management + network isolation. The WAF is the outermost of these layers, and having it is no reason to omit the inner ones.

On the broadcaster's platform too, Cloud Armor is only the layer at the entrance. Behind it sit Cloud SQL (IAM auth, TLS required, private IP), Secret Manager, least-privilege service accounts, and Identity Platform (SMS MFA, reCAPTCHA Enterprise).


1. The map of the two major WAFs: AWS WAF and Cloud Armor

Different clouds mean different WAF structures and terms. First, grasp the big picture with a correspondence table.

| Concept | AWS WAF | Google Cloud Armor |
| --- | --- | --- |
| Container for rules | Web ACL (protection pack) | security policy |
| Application target | CloudFront / ALB / API Gateway / AppSync / Cognito user pool / App Runner / Verified Access / Amplify | External Application Load Balancer (backend / edge security policy) |
| OWASP countermeasures | AWSManagedRulesCommonRuleSet (CRS, WCU 700) | preconfigured WAF rules (OWASP ModSecurity CRS 3.3 / 4.22) |
| Rule actions | Allow / Block / Count / CAPTCHA / Challenge | allow / deny / throttle / rate_based_ban / redirect |
| Rate limiting | rate-based rule (RateBasedStatement) | rateLimitOptions (throttle / rate_based_ban) |
| Observe mode (surfacing false positives) | Count action / rule action override | preview flag |
| DDoS | Shield Standard (free, L3/L4) / Shield Advanced (paid, adds automatic L7 mitigation) | Adaptive Protection (ML-based L7 detection) |
| Logs | WAF logs → CloudWatch / S3 / Firehose | Cloud Logging |

The choice is almost entirely determined by which cloud's load balancer fronts the app: behind AWS's ALB or CloudFront, use AWS WAF; behind GCP's external Application Load Balancer, use Cloud Armor. In a multi-cloud setup, build both on the same philosophy (OWASP managed rules + rate limiting + Count/preview first). This article covers both.


2. AWS WAF: assembling a Web ACL and managed rules with Terraform

2.1 What is a Web ACL (the official definition)

AWS WAF is a web application firewall that monitors HTTP/HTTPS requests forwarded to a protected resource. The container for rules is the Web ACL (web access control list), and according to the official documentation it can protect the following resources.

  • Amazon CloudFront distribution
  • Amazon API Gateway REST API
  • Application Load Balancer
  • AWS AppSync GraphQL API
  • Amazon Cognito user pool
  • AWS App Runner service
  • AWS Verified Access instance
  • AWS Amplify

A Web ACL has a default action (what to do with requests that match no rule: allow or block) and evaluates its rules in priority order. For a public-facing site, the basic form is "allow everything except attacks": default Allow, plus rules that Block attacks.
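A Web ACL protects nothing until it is associated with a resource. As a minimal sketch, assuming a REGIONAL Web ACL managed in Terraform as aws_wafv2_web_acl.app and a hypothetical ALB named aws_lb.app (both names are illustrative, not prescribed by AWS), the association could look like this:

```hcl
# Assumption: aws_lb.app is a hypothetical ALB defined elsewhere in the same configuration.
# Assumption: aws_wafv2_web_acl.app is a Web ACL created with scope = "REGIONAL".
resource "aws_wafv2_web_acl_association" "app_alb" {
  resource_arn = aws_lb.app.arn             # the resource to protect
  web_acl_arn  = aws_wafv2_web_acl.app.arn  # the Web ACL that filters its traffic
}
```

CloudFront is wired differently: the Web ACL has to be created with scope CLOUDFRONT in us-east-1, and its ARN is set on the distribution's web_acl_id argument rather than through an association resource.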

The official docs define the following rule actions. Understanding them precisely is the starting point of a safe rollout.

| Action | Meaning |
| --- | --- |
| Allow | Let a matched request through |
| Block | Reject a matched request with 403 or a custom response |
| Count | Without changing handling, just count the number of matches (for observation / testing) |
| CAPTCHA / Challenge | Suppress bots with a CAPTCHA / silent challenge |

How the official docs explain Count: "The Count action can be used to count matched requests without changing how they're handled. It can also be used to test a new rule. When you want to allow/block based on a new characteristic, you can first count with Count, confirm the configuration is correct, and then switch to allow/block." — This is the backbone of this article.

2.2 Managed rules: the main force of OWASP countermeasures

Writing SQLi/XSS regexes yourself is a poor choice (it duplicates work that others already maintain, and the result is unmaintainable). AWS provides managed rule groups, and the mainstays are the following three baseline groups.

| Rule group | Vendor / Name | WCU | Purpose |
| --- | --- | --- | --- |
| Core rule set (CRS) | AWS / AWSManagedRulesCommonRuleSet | 700 | Broad vulnerability countermeasures including OWASP Top 10 (recommended for all WAF use cases) |
| Known bad inputs | AWS / AWSManagedRulesKnownBadInputsRuleSet | 200 | Block known malicious patterns like Log4j (CVE-2021-44228 and others) |
| Admin protection | AWS / AWSManagedRulesAdminProtectionRuleSet | 100 | Block external access to exposed admin-panel paths |

The CRS consists of individually named rules (a partial list follows). All of them default to Block and attach labels (awswaf:managed:aws:core-rule-set:*).

  • NoUserAgent_HEADER — block a missing User-Agent header
  • SizeRestrictions_BODY — block a request body over 8KB (8,192 bytes)
  • SizeRestrictions_QUERYSTRING — block a query string over 2,048 bytes
  • CrossSiteScripting_BODY / _QUERYARGUMENTS / _COOKIE / _URIPATH — XSS patterns
  • GenericLFI_* / GenericRFI_* — local/remote file inclusion (../../ etc.)
  • EC2MetaDataSSRF_* — EC2 metadata theft (SSRF)

Known bad inputs includes Log4JRCE_HEADER/BODY/URIPATH/QUERYSTRING (detects ${jndi:ldap://…}), JavaDeserializationRCE_*, and so on. These are textbook examples of rules you shouldn't write yourself, and AWS maintains them as versioned rule groups.

2.3 Terraform: deploy in Count, then move to Block

This is the core of this article. Don't put managed rules into production with Block right away. SizeRestrictions_BODY rejects bodies over 8KB unconditionally — your file-upload API or rich JSON payloads might start returning 403 from that day.

So first apply it to production traffic with Count, and observe "what gets caught" with sampled requests and CloudWatch metrics.

Step 1: observe managed rules with Count

resource "aws_wafv2_web_acl" "app" {
  name        = "app-web-acl"
  description = "Defense-in-depth WAF for the app (count-first rollout)"
  scope       = "REGIONAL" # For ALB / API Gateway. Use "CLOUDFRONT" for CloudFront

  # Default is Allow: the basic form for a public-facing site (block only what the rules match)
  default_action {
    allow {}
  }

  # --- OWASP Core Rule Set ---
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # The rule group's own actions are not overridden at group level (none),
    # but individual rules below are overridden to Count to surface false positives
    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"

        # Put only the rules prone to false positives on Count and watch them
        rule_action_override {
          name = "SizeRestrictions_BODY" # bodies over 8 KB; upload APIs often trip this
          action_to_use {
            count {}
          }
        }
        rule_action_override {
          name = "CrossSiteScripting_BODY" # rich-text editors and similar often trip this
          action_to_use {
            count {}
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true # always enable sampled requests
    }
  }

  # --- Known bad inputs (Log4j etc.; false positives are rare, so this can usually be enforced from the start) ---
  rule {
    name     = "AWSManagedRulesKnownBadInputsRuleSet"
    priority = 20
    override_action {
      none {}
    }
    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesKnownBadInputsRuleSet"
      }
    }
    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "KnownBadInputs"
      sampled_requests_enabled   = true
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "appWebAcl"
    sampled_requests_enabled   = true
  }

  tags = { Environment = "staging" }
}

The key is rule_action_override (RuleActionOverrides in the API). While the rule group as a whole stays enabled, you can override to Count only the individual rules you suspect of false positives. Watch that rule's match count in CloudWatch for a few days, and once you have confirmed that legitimate traffic isn't getting caught, delete its rule_action_override block so the rule returns to its original action (Block).

The difference between override_action and rule_action_override (easy to confuse): override_action { none {} } means "use the rule group's actions as-is (don't override)," and override_action { count {} } means "override every rule in the group to Count." rule_action_override, by contrast, overrides one individual rule. A two-stage approach is safe: first observe the whole group with override_action { count {} }, then, once you know how it behaves on your traffic, switch to none {} and keep only the risky rules on Count with rule_action_override.
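As a sketch of the first stage of that approach, the rule block below counts every rule in the CRS group instead of blocking. It is meant to stand in for the CRS rule block inside the aws_wafv2_web_acl resource during the initial observation period; the names mirror the Step 1 example.

```hcl
  # Fragment: goes inside resource "aws_wafv2_web_acl" "app".
  rule {
    name     = "AWSManagedRulesCommonRuleSet"
    priority = 10

    # Group-level override: every rule in the group counts, none blocks.
    override_action {
      count {}
    }

    statement {
      managed_rule_group_statement {
        vendor_name = "AWS"
        name        = "AWSManagedRulesCommonRuleSet"
        # No rule_action_override blocks needed while the whole group is on Count.
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "CommonRuleSet"
      sampled_requests_enabled   = true
    }
  }
```

Once the counts look clean, changing count {} to none {} and adding rule_action_override blocks for the few noisy rules gives the second stage.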

Step 2: add rate limiting (rate-based rule)

OWASP rules see "content," but brute force and scraping are a problem of "frequency." Control frequency with a rate-based rule. The official high-level settings are as follows.

  • Evaluation window: choose from 60 / 120 / 300 / 600 seconds. The default is 300 (5 minutes). "How many seconds back from the current time to count."
  • Rate limit: the upper limit of requests within that window. The minimum is 10. Exceed it and the rule action is applied to subsequent matched requests.
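  • Aggregation (aggregation key): IP (source IP) / FORWARDED_IP (the first IP in a forwarding header such as X-Forwarded-For) / CONSTANT (all requests matching the scope-down statement are counted together) / CUSTOM_KEYS (combinations of keys such as a header, cookie, or query argument).
  • Action: any action other than Allow (= Block / Count / CAPTCHA / Challenge).

  # rule block to add inside the web_acl resource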
  rule {
    name     = "RateLimitPerIP"
    priority = 5 # low number so it is evaluated before the managed rules

    # ★ Start with Count. Change to block after confirming the threshold by observation
    action {
      count {}
    }

    statement {
      rate_based_statement {
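        limit                 = 2000 # 2,000 requests per IP per 5-minute window. Minimum is 10
        aggregate_key_type    = "IP"
        evaluation_window_sec = 300 # 60/120/300/600. Default is 300

        # scope-down: optionally restrict the rule to a heavy search endpoint
        scope_down_statement {
          byte_match_statement {
            # Nested blocks must be written on multiple lines in HCL
            field_to_match {
              uri_path {}
            }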
            positional_constraint = "STARTS_WITH"
            search_string         = "/api/search"
            text_transformation {
              priority = 0
              type     = "NONE"
            }
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitPerIP"
      sampled_requests_enabled   = true
    }
  }

Decide the threshold (limit) after observing, not before. Measure legitimate users' normal peak requests per IP per 5 minutes in CloudWatch and set the initial threshold at 2–3× that value. If you Block from the start, you will catch legitimate users wherever many of them share one IP (office NAT, mobile-carrier CGNAT, or a CDN/proxy in front). So deploy it with action { count {} }, confirm zero false positives, and only then change it to block {}.
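The threshold arithmetic and the proxy case can be sketched together. The peak of 700 requests and the multiplier of 3 are made-up illustration values, and the rule name is hypothetical. FORWARDED_IP only helps when the WAF sits behind a trusted CDN or reverse proxy that sets X-Forwarded-For itself; it does not separate users who share one NAT address, and a client-supplied header can be spoofed.

```hcl
locals {
  # Hypothetical measurement: legitimate peak requests per client IP per 5 minutes.
  observed_peak_per_ip_5min = 700
  headroom_multiplier       = 3

  # 700 * 3 = 2100. max() keeps the value at or above AWS WAF's minimum limit of 10.
  rate_limit = max(10, local.observed_peak_per_ip_5min * local.headroom_multiplier)
}

# Fragment: goes inside resource "aws_wafv2_web_acl" "app" in place of RateLimitPerIP.
rule {
  name     = "RateLimitPerClientIP" # hypothetical name
  priority = 5

  action {
    count {} # observe first, as with every new rule
  }

  statement {
    rate_based_statement {
      limit                 = local.rate_limit
      evaluation_window_sec = 300
      aggregate_key_type    = "FORWARDED_IP" # aggregate on the forwarded client IP

      forwarded_ip_config {
        header_name       = "X-Forwarded-For"
        fallback_behavior = "NO_MATCH" # how to treat requests whose header is missing or invalid
      }
    }
  }

  visibility_config {
    cloudwatch_metrics_enabled = true
    metric_name                = "RateLimitPerClientIP"
    sampled_requests_enabled   = true
  }
}
```

Keeping the measured peak and the multiplier as separate locals leaves a record of how the limit was derived, which makes later re-tuning easier to review.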

The effect of scope-down (SRP): applying rate limiting not to all paths but only to heavy endpoints (search, login, generation APIs) avoids dragging in static assets and light GETs. "One rule handles only one concern" — scope_down_statement realizes that.

Step 3: switch to Block once observation is done

Observe in Count for a few days to two weeks. Once you have confirmed that the sampled requests contain no legitimate traffic, replace count {} with the enforcing action.

  # Change RateLimitPerIP's action from Count to Block
  rule {
    name     = "RateLimitPerIP"
    priority = 5
    action {
      block {} # observation complete → enforce
    }
    # statement / visibility_config are unchanged
  }

This is the safe Count (observe) → Block (enforce) rollout. The diff is one line, but that one line must always be preceded by the evidence gathered during observation (verification first).
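One way to make that one-line switch explicit per environment is a boolean variable that selects the action. This is a sketch of a common Terraform pattern rather than anything AWS prescribes; the variable name rate_limit_enforce is hypothetical.

```hcl
variable "rate_limit_enforce" {
  description = "false = Count (observe), true = Block (enforce)"
  type        = bool
  default     = false # a new rule starts in observe mode
}

# Fragment: replaces the hard-coded action block inside rule "RateLimitPerIP".
action {
  # Exactly one of the two dynamic blocks is rendered.
  dynamic "count" {
    for_each = var.rate_limit_enforce ? [] : [1]
    content {}
  }
  dynamic "block" {
    for_each = var.rate_limit_enforce ? [1] : []
    content {}
  }
}
```

With this in place, staging and production can share the same module while differing only in a tfvars value, and promotion to Block shows up as a single reviewed variable change.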


3. Google Cloud Armor: security policy and preconfigured WAF rules

3.1 What is a security policy

Cloud Armor's security policy is a configuration unit that protects an app under a load balancer from DDoS and web attacks. It has prioritized rules (match condition + action).

  • backend security policy: a policy tied to a backend service (OWASP rules, rate limiting, Adaptive Protection go here).
  • edge security policy: a policy that filters at the edge in front of the CDN cache.
  • The application target is an external Application Load Balancer (including classic).

A rule's action is allow / deny(403|404|502) / throttle / rate_based_ban / redirect. And the most important — the preview flag.

preview = Cloud Armor's version of Count. Create a rule with --preview and the match is recorded in logs but isn't actually blocked. With the same philosophy as AWS's Count, you can observe "what matches" before enforcing it in production.

3.2 preconfigured WAF rules (OWASP CRS)

Cloud Armor provides the OWASP ModSecurity Core Rule Set as predefined rules. The available CRS versions are 4.22 / 3.3 / 3.0 (3.0 is deprecated). In the broadcaster's project, I adopted CRS 3.3. Rules are referenced with the evaluatePreconfiguredWaf() expression (representative examples of the CRS 3.3 family).

| Predefined rule | What it defends against |
| --- | --- |
| sqli-v33-stable | SQL injection |
| xss-v33-stable | Cross-site scripting |
| lfi-v33-stable | Local file inclusion |
| rfi-v33-stable | Remote file inclusion |
| rce-v33-stable | Remote code execution |
| scannerdetection-v33-stable | Scanner detection |
| protocolattack-v33-stable | Protocol attacks |
| sessionfixation-v33-stable | Session fixation |

The sensitivity level is 0–4. Officially, "by default Cloud Armor operates at sensitivity level 4, and after enablement evaluates all signatures in the rule set." The higher the sensitivity, the wider the detection, but the more false positives — that's exactly why preview is needed.

3.3 gcloud / Terraform: deploy with preview, then enforce

Add OWASP rules with preview (observe)

# Create the security policy
gcloud compute security-policies create app-armor-policy \
  --description "Defense-in-depth WAF (preview-first rollout)"

# Add the OWASP SQLi rule in *preview* (matches are logged but not blocked)
gcloud compute security-policies rules create 1000 \
  --security-policy app-armor-policy \
  --description "OWASP SQLi (CRS 3.3) - PREVIEW" \
  --expression "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1})" \
  --action deny-403 \
  --preview   # ★ The key point: preview = observe only, no blocking

# Add XSS in preview the same way
gcloud compute security-policies rules create 1010 \
  --security-policy app-armor-policy \
  --expression "evaluatePreconfiguredWaf('xss-v33-stable', {'sensitivity': 1})" \
  --action deny-403 \
  --preview

Note {'sensitivity': 1}. Don't begin at sensitivity 4 (all signatures); start low (1), observe false positives in preview, and raise the level only if needed. That is the safe direction. In Cloud Logging, examine the requests whose previewSecurityPolicy field shows a match, and once you have confirmed that no legitimate traffic is among them, remove preview.

# Observation complete → remove preview and switch to enforce
gcloud compute security-policies rules update 1000 \
  --security-policy app-armor-policy \
  --no-preview

The same philosophy in Terraform too (preview = true → false)

resource "google_compute_security_policy" "app" {
  name        = "app-armor-policy"
  description = "Defense-in-depth WAF (preview-first)"

  # OWASP SQLi (CRS 3.3): observe in preview first
  rule {
    action   = "deny(403)"
    priority = 1000
    preview  = true # ★ observe mode: surface false positives before enforcing
    match {
      expr {
        expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1})"
      }
    }
    description = "OWASP CRS 3.3 SQLi (preview)"
  }

  # Default rule: allow anything that matched no rule (priority is fixed at the maximum value)
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
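When several CRS rules are rolled out at different speeds, a map with one preview flag per rule keeps the staged promotion visible in one place. The following is a sketch of an alternative way to write the same policy, assuming the CRS 3.3 rule names listed in section 3.2; the local name crs_rules and the priorities are arbitrary choices.

```hcl
locals {
  # Rule name => settings. Flip preview to false per rule once it is verified.
  crs_rules = {
    "sqli-v33-stable" = { priority = 1000, preview = true }
    "xss-v33-stable"  = { priority = 1010, preview = true }
    "lfi-v33-stable"  = { priority = 1020, preview = true }
    "rce-v33-stable"  = { priority = 1030, preview = true }
  }
}

resource "google_compute_security_policy" "app" {
  name        = "app-armor-policy"
  description = "Defense-in-depth WAF (preview-first)"

  # One rule block is generated per entry in local.crs_rules.
  dynamic "rule" {
    for_each = local.crs_rules
    content {
      action      = "deny(403)"
      priority    = rule.value.priority
      preview     = rule.value.preview
      description = "OWASP CRS 3.3 ${rule.key}"
      match {
        expr {
          expression = "evaluatePreconfiguredWaf('${rule.key}', {'sensitivity': 1})"
        }
      }
    }
  }

  # Default rule: allow anything that matched no rule.
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "default allow"
  }
}
```

Promoting one rule then becomes a one-word diff in the map (preview = true → false), which is easy to tie to the observation evidence in a review.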

3.4 Rate limiting (throttle / rate_based_ban)

Cloud Armor's rate limiting has two actions.

  • throttle: let through up to the threshold, and reject the excess with exceed_action (deny(429) etc.).
  • rate_based_ban: ban for a fixed period a client that exceeds the threshold.

  # Rate-limit rule (per IP: more than 100 requests per minute → 10-minute ban)
  rule {
    action   = "rate_based_ban"
    priority = 900
    preview  = true # ★ verify the threshold by observation before enforcing
    match {
      versioned_expr = "SRC_IPS_V1"
      config { src_ip_ranges = ["*"] }
    }
    rate_limit_options {
      enforce_on_key = "IP" # ALL / IP / HTTP_HEADER / HTTP_COOKIE / XFF_IP / ...
      conform_action = "allow"
      exceed_action  = "deny(429)"
      rate_limit_threshold {
        count        = 100
        interval_sec = 60 # one of 10/30/60/120/.../3600
      }
      ban_duration_sec = 600 # ban clients that exceed the threshold for 600 seconds
    }
    description = "per-IP rate limit (preview)"
  }

Here too, decide the threshold by observation. Whether enforce_on_key should be XFF_IP (the first IP in X-Forwarded-For) instead of IP, or HTTP_COOKIE (per session), is a decision to make after looking at your actual traffic in Cloud Logging, so that legitimate users behind NAT are not caught.
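For comparison with the rate_based_ban rule, here is a sketch of the throttle action limited to one heavy endpoint. The /api/search path, the priority, and the threshold are illustrative assumptions; the block belongs inside the google_compute_security_policy resource.

```hcl
  # Fragment: throttle only the search endpoint, per client IP.
  rule {
    action   = "throttle"
    priority = 910
    preview  = true # observe before enforcing
    match {
      expr {
        # CEL expression: apply the rule only to paths under /api/search
        expression = "request.path.startsWith('/api/search')"
      }
    }
    rate_limit_options {
      enforce_on_key = "IP"
      conform_action = "allow"     # under the threshold: pass
      exceed_action  = "deny(429)" # over the threshold: reject the excess only
      rate_limit_threshold {
        count        = 100
        interval_sec = 60
      }
      # No ban_duration_sec: throttle rejects the excess but never bans the client.
    }
    description = "per-IP throttle on search (preview)"
  }
```

Throttle is the gentler of the two actions, since a client recovers as soon as its rate drops, which makes it a reasonable first choice where shared IPs are likely.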

3.5 Adaptive Protection (adaptive defense for L7 DDoS)

Adaptive Protection is a Cloud Armor feature that protects the app from L7 DDoS with machine learning. It can be enabled per security policy, learns the baseline of normal-time traffic (officially, baseline established with at least 1 hour of learning), and when it detects an anomaly generates an alert including the following.

  • confidence score (0–1 confidence)
  • attack signature (a description of the attack traffic's characteristics)
  • suggested rule (a Cloud Armor WAF rule proposal you can deploy as-is)
  • alert ID

# Enable Adaptive Protection (L7 DDoS defense) on the security policy
gcloud compute security-policies update app-armor-policy \
  --enable-layer7-ddos-defense

Importantly, even the suggested rule that Adaptive Protection generates should be applied in preview first rather than enforced right away. An ML proposal still goes through a human confirmation gate. In the broadcaster's project, the three tiers of adaptive defense + fixed rate limiting + OWASP CRS caught anomalies at the entrance in multiple layers.
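If the policy is managed in Terraform rather than gcloud, the same switch can be expressed on the resource. A minimal sketch, assuming the policy from section 3.3 with its rule blocks left unchanged:

```hcl
resource "google_compute_security_policy" "app" {
  name        = "app-armor-policy"
  description = "Defense-in-depth WAF (preview-first)"

  # Terraform counterpart of --enable-layer7-ddos-defense.
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable = true
    }
  }

  # rule blocks as in section 3.3 (omitted here for brevity)
}
```

Keeping the flag in code means Adaptive Protection is enabled identically in staging and production, so its baseline learning starts in both.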

The AWS-side counterpart: L3/L4 DDoS is constantly protected for free by Shield Standard. If you need advanced automatic L7 mitigation or specialist-team (SRT) support, Shield Advanced (paid). The closest to Cloud Armor's Adaptive Protection is Shield Advanced's automatic application-layer mitigation. The structure of "always-free basic protection + paid advanced protection if needed" is common to both clouds.


4. A safe rollout: turning Count/preview into an operating procedure

Now that the technical elements are in place, here is a stage table summarizing a rollout procedure that avoids false positives. It generalizes the steps I actually went through in the broadcaster's project.

| Stage | AWS WAF | Cloud Armor | This stage's goal |
| --- | --- | --- | --- |
| ① staging full enablement | Attach the Web ACL to stg's ALB, all rules on Block | Attach the security policy to stg's LB, all rules enforced | Eliminate config mistakes, syntax errors, and obvious false positives before production |
| ② production Count/preview | All managed/custom rules on count {} / override_action { count {} } | All rules preview = true | Observe what gets caught on production traffic |
| ③ individual tuning | Keep a false-positive rule on Count with rule_action_override, or exclude it | Exclude the false-positive signature with opt_out_rule_ids, adjust sensitivity | Bring the number of legitimate requests caught to zero |
| ④ staged enforce | Remove the Count override from rules confirmed safe | Switch to --no-preview / preview = false | Promote only verified rules to enforce |
| ⑤ continuous observation | Continuously monitor CloudWatch metrics + sampled requests + WAF logs | Continuously monitor deny/throttle in Cloud Logging | Detect new false positives and new attacks early |

Stage ①, full enablement in staging, pays off. In the broadcaster's project, I fully enabled the WAF in stg to surface false positives and config mistakes before production. In stg, blocking a legitimate request does little real harm, and you can find in advance the legitimate requests that resemble attack patterns (e.g. an admin's legitimate operation, a large JSON body). Only rules that were verified in stg and then re-confirmed in production Count are promoted to enforcement in production.

The official warning (AWS): "Before deploying to production traffic, test and tune it in a staging or test environment, then test it in count mode against production traffic before enabling it." — The official docs make the two stages (staging → production count) explicit. My operation is faithful to this.

Dealing with false positives: exclusion and scope-down

When observation finds false positives, there are two moves.

  1. Exclusion (exclusion / opt_out): disable only that rule. On AWS, keep that rule on Count with rule_action_override; on Cloud Armor, exclude a specific signature with evaluatePreconfiguredWaf(..., {'opt_out_rule_ids': ['...']}). The iron rule is to leave the rule group in place and disable only the one rule that produced the false positive (minimal disabling).
  2. Scope-down: narrow the range the rule applies to. Exempt only the path that produces the false positive, as in "exempt only /api/upload from SizeRestrictions_BODY." Implement it with AWS's scope_down_statement or with Cloud Armor's match expression (CEL).
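A scope_down_statement on a managed rule group exempts a path from the whole group. To exempt a path from a single rule only, one option is a label-based variant, sketched below as a swapped-in technique rather than the method described above. It assumes SizeRestrictions_BODY is already overridden to Count as in Step 1 (a counted rule still attaches its label), and that the label key is spelled as shown; confirm the exact label in the rule group documentation or in sampled requests. The rule name and priority are hypothetical.

```hcl
  # Fragment: goes inside resource "aws_wafv2_web_acl" "app".
  # Priority 15 runs after the CRS group (10), so the label already exists.
  rule {
    name     = "BlockOversizedBodyExceptUpload" # hypothetical name
    priority = 15

    action {
      block {}
    }

    statement {
      and_statement {
        # Condition 1: the counted CRS rule labeled this request.
        statement {
          label_match_statement {
            scope = "LABEL"
            key   = "awswaf:managed:aws:core-rule-set:SizeRestrictions_Body" # assumed spelling
          }
        }
        # Condition 2: the path is NOT under /api/upload.
        statement {
          not_statement {
            statement {
              byte_match_statement {
                positional_constraint = "STARTS_WITH"
                search_string         = "/api/upload"
                field_to_match {
                  uri_path {}
                }
                text_transformation {
                  priority = 0
                  type     = "NONE"
                }
              }
            }
          }
        }
      }
    }

    visibility_config {
      cloudwatch_metrics_enabled = true
      metric_name                = "BlockOversizedBodyExceptUpload"
      sampled_requests_enabled   = true
    }
  }
```

The effect is that oversized bodies are still blocked everywhere except the upload path, while every other CRS rule keeps inspecting /api/upload.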

"Turn off the WAF because it produced a false positive" is the worst move. Disable at the smallest unit, one rule or one path, to keep the hole in your defenses minimal.

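On the Cloud Armor side, both moves fit into the match expression. In the sketch below the signature ID and the /api/upload path are illustrative; take the real signature ID from the preview log entries of the request that was wrongly matched. The blocks belong inside the google_compute_security_policy resource.

```hcl
  # Move 1 (exclusion): opt out of one signature, keep the rest of the SQLi rule.
  rule {
    action   = "deny(403)"
    priority = 1000
    preview  = true # re-observe after every tuning change
    match {
      expr {
        # The ID below is an illustrative placeholder, not a recommendation.
        expression = "evaluatePreconfiguredWaf('sqli-v33-stable', {'sensitivity': 1, 'opt_out_rule_ids': ['owasp-crs-v030301-id942140-sqli']})"
      }
    }
    description = "OWASP CRS 3.3 SQLi with one signature opted out (preview)"
  }

  # Move 2 (scope-down): skip one path, inspect everything else.
  rule {
    action   = "deny(403)"
    priority = 1010
    preview  = true
    match {
      expr {
        expression = "!request.path.startsWith('/api/upload') && evaluatePreconfiguredWaf('xss-v33-stable', {'sensitivity': 1})"
      }
    }
    description = "OWASP CRS 3.3 XSS except /api/upload (preview)"
  }
```

Both rules stay in preview after the change, because a tuning edit is itself a new rule version that needs the same observation step.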

5. Production operation: version management, logs, cost, don't overtrust

5.1 Version management of managed rules

AWS's managed rule groups are versioned (available versions can be listed with ListAvailableManagedRuleGroupVersions, a specific version's rules inspected with DescribeManagedRuleGroup, and AWS publishes a changelog). If you don't pin a version and simply follow the default, an AWS-side update can change behavior and increase false positives without warning. In production, pin the version explicitly, upgrade deliberately after reading the changelog, and go through Count → Block again on each upgrade. For Cloud Armor's CRS likewise, state the version explicitly in the expression, as in v33 / v422 (don't implicitly follow the latest). This is the ETC (Easy To Change) principle: confine change to a predictable unit.
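In Terraform, pinning could look like the sketch below. The variable name crs_version is hypothetical and deliberately has no default, so the version string has to be looked up (for example with aws wafv2 list-available-managed-rule-group-versions) rather than guessed.

```hcl
variable "crs_version" {
  description = "Pinned version of AWSManagedRulesCommonRuleSet, copied from the list of available versions"
  type        = string
  # No default on purpose: an upgrade must be an explicit, reviewed change.
}

# Fragment: the statement block inside the CRS rule of aws_wafv2_web_acl.app.
statement {
  managed_rule_group_statement {
    vendor_name = "AWS"
    name        = "AWSManagedRulesCommonRuleSet"
    version     = var.crs_version # omit this argument and AWS applies its default version
  }
}
```

A version bump then appears as a one-line tfvars diff, which is the natural moment to put the group back on Count before returning to Block.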

5.2 Logs and observability: connect WAF logs to analysis

A WAF is only operable if what it blocked and counted ends up in logs.

  • AWS WAF logs: send them to CloudWatch Logs / S3 / Amazon Data Firehose (formerly Kinesis Data Firehose). Always enable sampled requests (sampled_requests_enabled = true) and CloudWatch metrics (cloudwatch_metrics_enabled = true). Use labels (awswaf:managed:aws:core-rule-set:*) to trace which rule fired.
  • Cloud Armor: logs go to Cloud Logging. The enforcedSecurityPolicy and previewSecurityPolicy fields distinguish "blocked by an enforced rule" from "would have matched a preview rule."

The quality of observation determines the safety of the rollout. If you deploy in Count with sampling off, you can't see what matched, so you have no basis for deciding to move to Block.
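Sampled requests and metrics are set on the Web ACL itself, but full WAF logs need a separate logging configuration. A minimal sketch for the CloudWatch Logs destination, assuming the Web ACL aws_wafv2_web_acl.app from section 2.3; the log group suffix and the retention period are arbitrary choices.

```hcl
resource "aws_cloudwatch_log_group" "waf" {
  # AWS WAF only accepts log groups whose name starts with "aws-waf-logs-".
  name              = "aws-waf-logs-app"
  retention_in_days = 30 # illustrative retention period
}

resource "aws_wafv2_web_acl_logging_configuration" "app" {
  resource_arn            = aws_wafv2_web_acl.app.arn
  log_destination_configs = [aws_cloudwatch_log_group.waf.arn]

  # Keep credentials out of the logs.
  redacted_fields {
    single_header {
      name = "authorization"
    }
  }
}
```

During the Count phase it is worth logging everything rather than filtering, since the counted requests are exactly the evidence needed for the move to Block.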

5.3 Cost

A WAF isn't free. AWS WAF is billed by the number of Web ACLs, the number of rules, and the number of processed requests, and advanced managed rule groups (Bot Control / ATP / ACFP) incur additional charges. On the Cloud Armor side, features such as Adaptive Protection's full alerts are paid features of the Enterprise tier.

  • Be conscious of WCU (Web ACL Capacity Unit): an AWS Web ACL has a WCU upper limit. CRS alone consumes 700 WCU, so piling on too many managed rules hits the limit. Put in only the rules you need (YAGNI).
  • Weigh rate-based / advanced bot countermeasures by effect and cost: Bot Control and ATP are powerful but incur additional billing. The cost-efficient order is to first harden the range protectable by the basics (CRS + Known bad inputs + rate-based), and add advanced features only for the shortfall.

5.4 Don't overtrust the WAF — re-confirming defense-in-depth

Finally, back to the first line. A WAF is one layer. Attacks that slip past the WAF definitely exist (logic-origin authorization bugs, zero-days, abuse that looks legitimate). So don't omit the inner layers.

  • Authentication / MFA: suppress bots and credential stuffing with Identity Platform / Cognito MFA and reCAPTCHA Enterprise.
  • Authorization: enforce "who can access what" at the app / DB (RLS etc.). A WAF doesn't replace this.
  • Input validation: schema-validate at the boundary with Zod etc. A WAF's OWASP rules are for known patterns, not app-specific validity validation.
  • Least privilege: keep service accounts / IAM at least privilege. Even if breached, localize the damage.
  • Secrets management: Secret Manager / environment variables. Mandatory regardless of the WAF.
  • Data layer: TLS required, private IP, IAM auth (the broadcaster project's Cloud SQL configuration).

What Cloud Armor handled on the broadcaster's platform is this outermost layer. It's precisely because the inside (authentication, authorization, validation, least privilege, encryption) is in place that the WAF has meaning. With the order reversed — "I put in a WAF, so the inside can be sloppy" — it's a castle with only a fine gate and no walls.


6. Summary: a WAF defense-in-depth cheat sheet

A quick reference for when you're unsure.

  • Which WAF to use: under AWS's ALB/CloudFront → AWS WAF. Under GCP's external Application Load Balancer → Cloud Armor. Multi-cloud, both with the same philosophy.
  • OWASP countermeasures: AWS is AWSManagedRulesCommonRuleSet (CRS, WCU 700) + AWSManagedRulesKnownBadInputsRuleSet. Cloud Armor is preconfigured WAF rules like evaluatePreconfiguredWaf('sqli-v33-stable' ...). Don't write regexes yourself.
  • Rate limiting: AWS is a rate-based rule (window 60/120/300/600, default 300, min limit 10, IP/FORWARDED_IP aggregation). Cloud Armor is throttle / rate_based_ban (enforce_on_key, ban_duration_sec). Decide the threshold after observing.
  • L7 DDoS: AWS is Shield Standard (free) / Advanced (paid). Cloud Armor is Adaptive Protection (ML, per-policy, at least 1 hour of learning, confidence score / suggested rule). Even a suggested rule starts from preview.
  • Safe rollout (most important): ① staging full enablement → ② production Count/preview → ③ individual tuning (exclusion, scope-down) → ④ staged enforce → ⑤ continuous observation. Always start a new rule from Count (AWS) / preview (Cloud Armor).
  • Operation: fix the version of managed rules, sampling & metrics always ON, exclude false positives at the minimum unit of one rule / one path, be conscious of WCU and cost.
  • Defense-in-depth: a WAF is one layer. Don't omit authentication, MFA, authorization, input validation, least privilege, Secrets, and the data layer.

A WAF is not a device that makes you safe just by being installed. It is a design of settings and rollout that rejects only attacks without catching legitimate traffic. The biggest risk isn't attacks but your own false positives, which is exactly why Count/preview belongs at the center of operation.

I built an internal AI platform for a major domestic broadcaster entirely in Terraform (about 71 modules) and placed Cloud Armor (OWASP CRS 3.3 + adaptive DDoS protection + rate limiting) at the entrance. By fully enabling the WAF in stg first to surface false positives and config mistakes before production, I promoted each defense layer without blocking legitimate traffic. The WAF there is the filter at the entrance; behind it I layered Cloud SQL with IAM auth, TLS required, and private IP, Secret Manager, least-privilege service accounts, and SMS MFA + reCAPTCHA Enterprise, so that the WAF functions as one layer of defense-in-depth.

"How do I design a WAF for my app, and how do I put it into production without false positives?" From that design through Terraform implementation, Count/preview rollout, and operation design, I can work alongside you quickly and safely as a solo engineer working with generative AI (Claude Code). Feel free to reach out, even at the requirements-gathering stage.



