"GuardDuty — if I just enable it, it protects me, right?" — one of the most common misunderstandings in places where I'm consulted about AWS security.
Half right, half dangerous. GuardDuty does start monitoring threats across your AWS environment, agentless, the moment you enable it. But GuardDuty is a detection service, not a prevention service. Stopping an attack is not its job. GuardDuty only notifies you that "something dangerous is happening now," and unless you design what to do in response to that notification (a finding), it is just a device that lights the dashboard red.
This article is an implementation guide for designing and operating AWS threat detection at production quality with Amazon GuardDuty. It runs from single-account enablement, through org-wide control with AWS Organizations and choosing protection plans matched to your assets, to the most important part, "receiving a finding with EventBridge and turning it into idempotent automated response," all with real Terraform and Python code. As a running example, I also weave in my experience cross-implementing IAM, observability, and DR on a serverless payment platform on multi-account AWS. Because it handled actual money, carbon credits, and local currencies, I needed to detect credential leaks and abnormal API operations with a mechanism, not with "operational attentiveness."
The ground rules of this article: the specs, data sources, finding severities, and pricing structure are based on the official AWS documentation (as of June 2026). Protection-plan support, finding types, and pricing get revised, so always confirm the latest official information before shipping to production. The code is shaped close to real operation (Terraform / Python), but GuardDuty is a detection service and does not replace WAF, least-privilege IAM, input validation, or encryption. One more iron rule: start automated response with actions that are idempotent, narrowly scoped, and reversible, and put a human review in front of destructive operations.
0. The Mental Model: GuardDuty Is "a Threat-Detection Engine That Analyzes Logs"
Before starting the design, fix in one line what GuardDuty is and isn't.
GuardDuty = an agentless detection engine that continuously ingests AWS logs (CloudTrail, VPC Flow Logs, DNS, etc.), detects "malicious / abnormal behavior" with threat intelligence and ML, and generates findings.
From here, 3 consequences emerge. These become the foundation of all design judgment.
- GuardDuty is "detection," not "prevention." GuardDuty doesn't block requests (that's WAF's job). It's a service that notifies what's happening, not stops the attack. So GuardDuty's value is decided by "detection accuracy × post-detection response speed." Leave findings unattended and the value is zero.
- GuardDuty is agentless and demands no log plumbing. For the foundational data sources (CloudTrail management events / VPC Flow Logs / DNS logs), GuardDuty directly consumes what the official docs call "an independent and duplicated stream of events." That is, you don't need to create a CloudTrail trail or enable VPC Flow Logs, and you don't store or pay for those logs yourself. After GuardDuty extracts and profiles the fields it needs, it discards the log body rather than storing it. Changing your own log settings doesn't affect GuardDuty, and vice versa.
- A finding is "a starting point," not "an endpoint." What GuardDuty emits is a structured notification. Only by receiving it with EventBridge and connecting "response plumbing" for triage, automated response, and ticketing does detection become operational. The climax of this article is Chapter 6 (EventBridge → automated response).
Pin down these 3 points and the task of "putting in GuardDuty" turns out to actually be the 3 designs of "① which logs to show it (protection-plan selection) → ② how to spread it org-wide (org control) → ③ how to turn findings into action (automated response)." Let me build them in order.
1. The Whole Map of GuardDuty: Foundational Detection + Protection Plans + Extended Threat Detection
GuardDuty isn't a "monolithic service" but a 3-layer structure of an always-on foundation + protection plans you add optionally + an upper layer that correlates for free.
1.1 Foundational Data Sources (Enable = Immediately On, No Extra Cost)
Enable GuardDuty and, with no additional config, it starts analyzing the following 3.
| Foundational data source | What it sees | Representative detection examples |
|---|---|---|
| CloudTrail management events | Who hit which AWS API, from where, when (control-plane operations) | Abuse of leaked credentials, abnormal IAM operations, privilege escalation |
| VPC Flow Logs | IP traffic in/out of an EC2 network interface | Communication with known malicious IPs, port scans, C&C communication |
| Route 53 Resolver DNS query logs | DNS name resolutions an EC2 looked up | Queries to malicious domains, crypto mining, DGA |
An operational caution (DNS): DNS detection works only when you use the AWS DNS resolver (Route 53 Resolver, the EC2 default). Use OpenDNS, Google DNS, or your own resolver, and GuardDuty can't reference this data source.
1.2 Protection Plans (Add Only When There's an Asset)
Outside the foundation are optional protection plans that additionally widen the monitoring target. GuardDuty officially calls these collectively "Features."
| Protection plan | The additional logs/targets it sees | When to enable |
|---|---|---|
| S3 Protection | CloudTrail's S3 data events | Storing confidential data in S3 (recommended almost always) |
| EKS Protection | EKS audit logs (control plane) | Operating an EKS cluster |
| Runtime Monitoring | Processes / system calls inside the container/host (EKS / EC2 / ECS-Fargate, a lightweight agent) | Want to see runtime compromise (highest cost) |
| Malware Protection for EC2 | Agentlessly scans the EBS volume attached to the EC2/container | Want to confirm malware on an EC2-origin finding |
| Malware Protection for S3 | Scans newly-uploaded S3 objects | Have an S3 bucket that accepts uploads |
| RDS Protection | Aurora / RDS login activity | Want to detect abnormal DB login attempts |
| Lambda Protection | Lambda's network activity (VPC Flow Logs) | Want to monitor Lambda's outbound communication |
The crux of design is "add to match the assets" (YAGNI). Enable EKS Protection without operating EKS and nothing is detected. Conversely, S3 Protection is recommended for almost all accounts handling confidential data — it also pays off so the later Extended Threat Detection can bundle attack sequences against S3. Because cost is proportional to enabled plans × monitored volume, protection-plan selection is effectively "the allocation of the security budget" (Chapter 8 handles pricing).
A worth-knowing exception: Malware Protection for S3 can be used standalone, without enabling GuardDuty proper. It answers the use case "I just want to scan uploaded files" at minimal cost.
1.3 Extended Threat Detection (Free, Automatic, the Strongest Upper Layer)
Here is GuardDuty's true value. Extended Threat Detection (ETD) turns on automatically, at zero extra cost, when you enable GuardDuty, and correlates individual weak signals to bundle a "multi-stage attack (attack sequence)" into one finding.
In the official words, GuardDuty treats API operations and individual findings as Signals, and, including weak signals that don't look like a threat alone, detects patterns it aligns as an "attack chain" within a 24-hour rolling window. For example:
- Compromise of AWS credentials + S3 data: unauthorized access to a compute workload → privilege escalation/persistence → data exfiltration from S3 — bundle this sequence into one finding.
- Compromise of an EKS cluster: abuse of a vulnerable container → acquisition of a privileged service-account token → access to Kubernetes secrets and AWS resources via pod identity.
And the decisive fact — an attack-sequence finding is, by its nature, all classified as severity "Critical (9.0–10.0)." The types the official docs define are the 5: AttackSequence:IAM/CompromisedCredentials, AttackSequence:S3/CompromisedData, AttackSequence:EKS/CompromisedCluster, AttackSequence:ECS/CompromisedCluster, and AttackSequence:EC2/CompromisedInstanceGroup (for a detailed reading of each type, go to the Extended Threat Detection deep-dive article).
ETD works with only the foundational data sources, but the more protection plans you add, the wider the range it can correlate. An EKS attack sequence needs EKS Protection or Runtime Monitoring, and an S3 attack sequence needs S3 Protection. That is, "the reason to enable S3 Protection is not only individual detection but to widen ETD's correlation" too.
The design implication: ETD changes the operation of "a human chasing individual findings as points" into "receiving the attack's context as a line." In the automated response later, the standard is to place AttackSequence:* (= always Critical) as the top-priority trigger.
Note, as of 2026, GuardDuty Investigation (preview), where AI analyzes a finding and returns a MITRE ATT&CK classification, risk assessment, and next actions, has also appeared. Being a preview, don't place it at the core of the production flow, but it's worth knowing as a triage aid.
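As a minimal sketch of the top-priority trigger described above, an EventBridge event pattern can select attack-sequence findings by type prefix. The helper name attack_sequence_pattern is hypothetical and not part of the original article; the prefix operator is standard EventBridge pattern syntax, and the source and detail-type values are the ones GuardDuty findings carry.

```python
import json


def attack_sequence_pattern() -> dict:
    """Build an EventBridge event pattern matching only AttackSequence:* findings."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Finding"],
        # Prefix match on the finding type: every attack-sequence type starts with this.
        "detail": {"type": [{"prefix": "AttackSequence:"}]},
    }


if __name__ == "__main__":
    # The JSON printed here is what would go into an EventBridge rule's event pattern.
    print(json.dumps(attack_sequence_pattern(), indent=2))
```

Keeping this as a separate rule from the severity-based one makes it possible to send attack sequences to a dedicated escalation target without changing the general High-or-above routing.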
1.4 The Cluster Roadmap: A Map to the 12 Deep-Dive Articles
This pillar is the whole picture of GuardDuty production operation. Each theme is split into an independent deep-dive article. Choose by your purpose and jump (in the order of the detection lifecycle).
| Purpose / stage | Deep-dive article | For someone who |
|---|---|---|
| Correlate detection | Extended Threat Detection and attack sequences | Wants to grasp individual findings as an "attack line" |
| Turn detection into action | EventBridge automated response (SOAR) | Wants to idempotently auto-respond to findings and shrink MTTR |
| Spread it org-wide | Multi-account / Organizations control | Wants to control multiple accounts and all Regions in bulk |
| Container: audit logs | EKS Protection (Kubernetes audit logs) | Wants to detect RBAC tampering / anonymous access |
| Container: runtime | Runtime Monitoring (EKS/ECS/EC2) | Wants to see compromise inside the container (processes/comms) |
| Data: S3 | Malware Protection for S3 (standalone operation) | Wants to auto-scan/quarantine uploads |
| Data: DB / serverless | RDS / Lambda Protection | Wants to detect DB login anomalies / Lambda threats |
| Investigate | Amazon Detective investigation workflow | Wants to investigate the root cause / blast radius |
| Aggregate long-term | Security Lake aggregation and OCSF analysis | Wants long-term retention, cross-cutting SQL, SIEM integration |
| Suppress false positives | Suppression / trusted IP / threat-list tuning | Wants to lower only noise without erasing attacks |
| Optimize cost | Cost optimization and pricing (FinOps) | Wants to decompose the bill and cut waste |
| Choose technology | GuardDuty vs Security Hub / Detective / Inspector / Macie | Wants to organize which service to combine how |
2. First, Enable It on 1 Account (Terraform)
The minimal setup is surprisingly simple. The moment you enable GuardDuty, both foundational detection and Extended Threat Detection start working. A newly-enabled Region gets a 30-day free trial, during which you can check each data source's estimated usage.
# detector = GuardDuty itself for this account and this Region.
# This single resource immediately turns on "foundational detection + Extended Threat Detection."
resource "aws_guardduty_detector" "this" {
  enable = true

  # How often "updates" to existing findings are published to EventBridge / S3.
  # FIFTEEN_MINUTES | ONE_HOUR | SIX_HOURS (default).
  # Strongly recommend 15 minutes if you build automated response (shrinks response delay; see Chapter 6).
  finding_publishing_frequency = "FIFTEEN_MINUTES"

  tags = { ManagedBy = "terraform", Purpose = "threat-detection" }
}
Protection plans are added "to match the assets" with a resource separate from aws_guardduty_detector (aws_guardduty_detector_feature). For example, to add just S3 Protection:
# Enable S3 Protection (monitors CloudTrail S3 data events).
# ETD's S3 attack-sequence correlation also only becomes effective with this.
resource "aws_guardduty_detector_feature" "s3_data_events" {
  detector_id = aws_guardduty_detector.this.id
  name        = "S3_DATA_EVENTS"
  status      = "ENABLED"
}
Region strategy: GuardDuty is a per-Region service. The official docs recommend enabling it in all Regions, including unused ones. The reason is that global-service events like IAM, STS, and CloudFront are replicated and processed to each Region, closing the path for an attacker to create unauthorized resources in a "thinly-guarded Region." Since manually enabling all Regions for one account is unrealistic, in real operation you spread it at once with the next chapter's "org-wide bulk enablement."
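To check the "all Regions" recommendation above for a single account, a small audit helper can report which Regions have no detector. This is a hedged sketch, not the article's tooling: regions_without_detector and boto3_list_detector_ids are hypothetical names, and the boto3 lookup assumes credentials that are allowed to call guardduty:ListDetectors in each Region.

```python
from __future__ import annotations

from typing import Callable, Iterable


def regions_without_detector(
    regions: Iterable[str],
    list_detector_ids: Callable[[str], list[str]],
) -> list[str]:
    """Return, sorted, the Regions in which no GuardDuty detector exists."""
    return sorted(r for r in regions if not list_detector_ids(r))


def boto3_list_detector_ids(region: str) -> list[str]:
    """Look up detector IDs in one Region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the pure function above runs without it

    client = boto3.client("guardduty", region_name=region)
    return client.list_detectors()["DetectorIds"]


if __name__ == "__main__":
    # Offline demonstration with stand-in data (hypothetical detector IDs).
    fake = {"ap-northeast-1": ["abc123"], "us-east-1": [], "eu-west-1": []}
    print(regions_without_detector(fake, lambda r: fake[r]))
```

In a real run, the Region list could come from ec2.describe_regions() and the lookup from boto3_list_detector_ids; the lookup is injected so the gap check itself stays testable offline.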
3. Org-Wide Bulk Control: Delegated Administrator + Auto-Enable (Multi-Account)
If you have 2 or more accounts, centralized management via AWS Organizations is the official recommendation. Rather than operating directly from the management account, designate a dedicated security account as the "delegated administrator" and aggregate and control all member accounts' findings from there (the least-privilege principle of not concentrating power in the management account).
# ── Run in the management (payer) account: designate GuardDuty's delegated administrator ──
# Delegates GuardDuty administration to the security-tooling account.
resource "aws_guardduty_organization_admin_account" "delegate" {
  admin_account_id = var.security_account_id # e.g. a dedicated security-tooling account
}
Next, on the delegated-administrator account side, put in the setting to "automatically make all accounts in / joining the org GuardDuty targets."
# ── Run in the delegated-administrator (security-tooling) account ──
resource "aws_guardduty_detector" "security" {
  enable                       = true
  finding_publishing_frequency = "FIFTEEN_MINUTES"
}

# Auto-enablement for organization members.
#   ALL  = enable every account, existing + new (recommended: leaves no gaps)
#   NEW  = only accounts that join from now on
#   NONE = no auto-enablement
resource "aws_guardduty_organization_configuration" "this" {
  detector_id                      = aws_guardduty_detector.security.id
  auto_enable_organization_members = "ALL"
}

# Protection plans can also be auto-enabled org-wide.
# Example: turn S3 Protection on automatically in all accounts (makes ETD's S3 correlation work company-wide).
resource "aws_guardduty_organization_configuration_feature" "s3_org" {
  detector_id = aws_guardduty_detector.security.id
  name        = "S3_DATA_EVENTS"
  auto_enable = "ALL"
}
With these 3 resources, you create a state where "a newly-created account also enters GuardDuty's umbrella without human intervention." In an environment like a payment platform that separates accounts for production / staging / audit, this works — you can structurally erase the room for "forgot to put GuardDuty in" each time an account is added.
Choose ALL for auto_enable_organization_members: set it to NEW only and accounts that existed at setup time can remain out of scope forever. Measured against security's purpose of "detect across the whole company without omission," anything but ALL is a silent hole. Only when there's an account you intentionally want to exclude should you manage it individually.
4. Add Protection Plans "to Match the Assets": What, When, Why
Protection plans aren't something where enabling them all is automatically good. A plan pointed at assets you don't have detects nothing, and a plan pointed at low-value targets only increases cost (a YAGNI violation). Choose by matching the following against your own assets.
| Condition to consider enabling | Plan to add | Note |
|---|---|---|
| Storing confidential data in S3 | S3 Protection | Recommended almost always. Also enables ETD's S3 correlation |
| Have an S3 bucket accepting uploads | Malware Protection for S3 | Usable standalone without GuardDuty proper |
| Operating EKS | EKS Protection (audit logs) | Detect RBAC tampering, secrets access, etc. |
| Want to see compromise inside the container/host | Runtime Monitoring | Highest cost (vCPU billing). A lightweight agent |
| Want to confirm malware on an EC2-origin finding | Malware Protection for EC2 | Agentlessly scans EBS |
| Want to detect abnormal logins to Aurora/RDS | RDS Protection | Monitors login activity |
| Want to monitor Lambda's outbound communication | Lambda Protection | Analyzes Lambda's VPC Flow Logs |
Runtime Monitoring alone is in a different class. Because it sees down to processes and system calls inside the container/host, the detection accuracy is highest, but it's charged in proportion to the protected vCPU count, and a lightweight agent (an add-on for EKS, a sidecar for ECS-Fargate, an agent for EC2) is needed. You can also have GuardDuty auto-manage the agent placement.
# Enable Runtime Monitoring and let GuardDuty auto-manage agent placement in each environment.
# Cost grows in proportion to vCPUs, so enable it only where runtime visibility is really needed.
resource "aws_guardduty_detector_feature" "runtime_monitoring" {
  detector_id = aws_guardduty_detector.this.id
  name        = "RUNTIME_MONITORING"
  status      = "ENABLED"

  additional_configuration {
    name   = "EKS_ADDON_MANAGEMENT" # auto-deploy and update the agent on EKS
    status = "ENABLED"
  }

  additional_configuration {
    name   = "ECS_FARGATE_AGENT_MANAGEMENT" # auto-inject a sidecar into Fargate tasks
    status = "ENABLED"
  }

  additional_configuration {
    name   = "EC2_AGENT_MANAGEMENT" # auto-manage the EC2 agent via SSM
    status = "ENABLED"
  }
}
Verbalizing the trade-off: protection-plan selection is the typical tug-of-war of "security coverage vs cost." I recommend spreading in the stages of "① S3 Protection on by default company-wide → ② EKS/RDS/Lambda only for accounts with the relevant assets → ③ Runtime Monitoring limited to the important production workloads." Rather than turning everything on from the start and being shocked by the bill, building only the needed layers on a foundation of ETD (free) + S3 Protection (cheap and high-impact) wins on cost-effectiveness.
5. Reading a Finding: Type, Severity, Attack Sequence
To design automated response, you need to read the structure of a finding. GuardDuty's finding type is excellent in that the meaning is encoded in the type name.
5.1 The Format of a Finding Type
ThreatPurpose : ResourceTypeAffected / ThreatFamilyName . DetectionMechanism ! Artifact
│               │                      │                  │                    │
│               │                      │                  │                    └─ Specific resource used by the attack tool (e.g. DNS). Optional
│               │                      │                  └─ Detection method (.Custom = your own threat list, .Reputation = reputation score, etc.)
│               │                      └─ What threat is being detected (e.g. BitcoinTool, NetworkPortUnusual)
│               └─ Type of AWS resource targeted (EC2 / S3 / IAMUser / EKS / RDS / Lambda ...)
└─ Purpose of the threat / attack stage (many map to MITRE ATT&CK tactics)
Example: CryptoCurrency:EC2/BitcoinTool.B!DNS conveys at a glance "the EC2 instance is querying a known Bitcoin-related domain (!DNS)." UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS is "temporary credentials issued to an EC2 instance through its instance role are being used from an IP address outside AWS (= credential exfiltration)."
The leading ThreatPurpose represents the attack stage, and many correspond to MITRE ATT&CK tactics: besides InitialAccess / Execution / Persistence / PrivilegeEscalation / DefenseEvasion / CredentialAccess / Discovery / Exfiltration / Impact, there are Backdoor / CryptoCurrency / Trojan / Recon / Stealth / Policy / UnauthorizedAccess / Pentest, etc. Just seeing the type name tells you "which stage of the attack," so it's directly usable for automated-response routing.
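Because the meaning is encoded in the type name, routing code can split it into its parts. The following is a minimal sketch, assuming the five-part format shown above holds for the types you route on; parse_finding_type and FindingType are hypothetical names, not a GuardDuty API.

```python
from __future__ import annotations

import re
from typing import NamedTuple, Optional

# ThreatPurpose:ResourceTypeAffected/ThreatFamilyName.DetectionMechanism!Artifact
# The mechanism and artifact parts are optional.
_FINDING_TYPE = re.compile(
    r"^(?P<purpose>[^:]+):(?P<resource>[^/]+)/(?P<family>[^.!]+)"
    r"(?:\.(?P<mechanism>[^!]+))?(?:!(?P<artifact>.+))?$"
)


class FindingType(NamedTuple):
    purpose: str
    resource: str
    family: str
    mechanism: Optional[str]
    artifact: Optional[str]


def parse_finding_type(value: str) -> FindingType:
    """Split a GuardDuty finding type string into its named components."""
    match = _FINDING_TYPE.match(value)
    if match is None:
        raise ValueError(f"not a recognizable finding type: {value!r}")
    return FindingType(**match.groupdict())


if __name__ == "__main__":
    print(parse_finding_type("CryptoCurrency:EC2/BitcoinTool.B!DNS"))
    print(parse_finding_type("AttackSequence:IAM/CompromisedCredentials"))
```

A router could then branch on purpose (the attack stage) and resource (which containment action even applies) instead of matching whole strings.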
5.2 The Numeric Bands of Severity
A finding gets a number 1.0–10.0, divided into 4 levels. Cutting the automated-response threshold on this number is basic.
| Severity | Numeric band | Meaning | Recommended response |
|---|---|---|---|
| Critical | 9.0 – 10.0 | An attack sequence is in progress / occurred recently. Multiple resources may be compromised | Top-priority immediate triage / containment |
| High | 7.0 – 8.9 | A resource is compromised and abuse is actually in progress | Immediate containment (quarantine, key revocation) |
| Medium | 4.0 – 6.9 | Suspicious behavior deviating from baseline. Possible compromise | Investigate early and confirm whether it's legitimate use |
| Low | 1.0 – 3.9 | An attempt not reaching compromise (port scan, failed intrusion) | Immediate response unneeded, but record / grasp the trend |
AttackSequence:* (Extended Threat Detection's attack sequence) is always Critical. In automated response, the two-stage threshold "consider containment at severity ≥ 7 (High or above), immediately escalate Critical" is easy to handle in design.
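The severity bands and the two-stage threshold can be written down directly. This is a sketch of the table above, with hypothetical function names and tier labels (severity_label, response_tier) chosen only for illustration.

```python
def severity_label(severity: float) -> str:
    """Map a GuardDuty severity number (1.0-10.0) to its named band."""
    if not 1.0 <= severity <= 10.0:
        raise ValueError(f"severity out of range: {severity}")
    if severity >= 9.0:
        return "Critical"
    if severity >= 7.0:
        return "High"
    if severity >= 4.0:
        return "Medium"
    return "Low"


def response_tier(severity: float) -> str:
    """Two-stage threshold: escalate Critical, consider containment at High."""
    label = severity_label(severity)
    if label == "Critical":
        return "escalate-immediately"
    if label == "High":
        return "consider-containment"
    return "record-and-review"


if __name__ == "__main__":
    for value in (9.5, 7.5, 5.0, 2.0):
        print(value, severity_label(value), response_tier(value))
```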
5.3 Verify the "Response Path" Before Production
GuardDuty has a mechanism to intentionally generate sample findings. With the CreateSampleFindings API (or a dedicated tester script), you can fire findings of each type and verify that the EventBridge → automated-response path works as expected, without waiting for a real attack. This is the very principle of "build the verification path first."
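# Generate sample findings against the detector to verify the response pipeline.
# (Specifying types fires only those finding types.)
# Single quotes keep an interactive shell from treating "!" as history expansion.
aws guardduty create-sample-findings \
  --detector-id "$DETECTOR_ID" \
  --finding-types 'UnauthorizedAccess:EC2/MaliciousIPCaller.Custom' \
                  'CryptoCurrency:EC2/BitcoinTool.B!DNS'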
6. Turning Detection into Action: EventBridge → Idempotent Automated Response (Most Important)
Here is the climax of this article. GuardDuty's value is decided by "post-detection response speed (MTTR)," and the foundation of automating that is Amazon EventBridge. All findings flow to EventBridge in near real time.
6.1 The Shape of a Finding That Arrives at EventBridge
{
  "source": "aws.guardduty",
  "detail-type": "GuardDuty Finding",
  "detail": {
    "id": "1ab23c...",
    "type": "UnauthorizedAccess:EC2/MaliciousIPCaller.Custom",
    "severity": 7.5,
    "accountId": "123456789012",
    "region": "ap-northeast-1",
    "title": "...",
    "description": "...",
    "resource": { "instanceDetails": { "instanceId": "i-0abc..." } }
  }
}
What matters is that detail.severity is a number. You can pick up "only High or above" with EventBridge's numeric match.
{
  "source": ["aws.guardduty"],
  "detail-type": ["GuardDuty Finding"],
  "detail": {
    "severity": [{ "numeric": [">=", 7] }]
  }
}
The delay pitfall (must read): GuardDuty publishes a newly generated finding within about 5 minutes, but the frequency at which it streams "updates (recurrences)" of the same finding follows finding_publishing_frequency, whose default is 6 hours. That is, "the first detection is fast, but follow-ups can be up to 6 hours late." To raise the automated-response reaction speed, set it to FIFTEEN_MINUTES as shown in Chapter 2.
6.2 Routing Design: Notify Everything, Contain Selectively
The worst thing to do in automated response is "automatically firing a destructive action (instance stop, key deletion) at every finding." One false positive drops production. The design principles are:
- Notify broadly: enrich every High-or-above finding and send it to Slack/SNS (speeds up human triage).
- Contain narrowly: only high-confidence findings on the "type allowlist" auto-execute an idempotent, reversible containment (attaching a quarantine SG, etc.).
- Put a human in front of destructive operations: for key deletion and instance termination, the automatic flow stops at opening a ticket and waiting for approval.
Build the rule and wiring in Terraform.
# Rule that captures GuardDuty findings of High or above (severity >= 7).
resource "aws_cloudwatch_event_rule" "guardduty_high" {
  name        = "guardduty-high-severity"
  description = "Route GuardDuty findings (severity >= 7) to the responder"

  event_pattern = jsonencode({
    source        = ["aws.guardduty"]
    "detail-type" = ["GuardDuty Finding"]
    detail        = { severity = [{ numeric = [">=", 7] }] }
  })
}

# ① For humans: always notify SNS (→ Slack / email).
# The SNS topic policy must allow events.amazonaws.com to publish (not shown).
resource "aws_cloudwatch_event_target" "to_sns" {
  rule      = aws_cloudwatch_event_rule.guardduty_high.name
  target_id = "notify-sns"
  arn       = aws_sns_topic.security_alerts.arn
}

# ② For machines: to the Lambda that decides on containment.
# The Lambda also needs an aws_lambda_permission for events.amazonaws.com (not shown).
resource "aws_cloudwatch_event_target" "to_responder" {
  rule      = aws_cloudwatch_event_rule.guardduty_high.name
  target_id = "auto-responder"
  arn       = aws_lambda_function.responder.arn

  # Retry and DLQ for transient failures (don't drop events).
  retry_policy {
    maximum_event_age_in_seconds = 3600
    maximum_retry_attempts       = 4
  }

  dead_letter_config { arn = aws_sqs_queue.responder_dlq.arn }
}
6.3 The Idempotent Automated-Response Lambda (Python)
EventBridge is at-least-once delivery, so the Lambda can start twice on the same finding. With the same idea as preventing double charges in a payment foundation, automated response must also be idempotent: guarantee that "even if quarantined twice, the state is the same as once."
"""Automated-response Lambda that reacts to GuardDuty findings.

Design principles:
- Idempotent: EventBridge is at-least-once. Receiving the same finding twice causes the side effect once.
- Narrow scope: containment only for types on the ALLOWLIST and High or above.
- Reversible: only "attach the quarantine SG + tag". No termination, no key deletion.
- Observable: structured logs. No secret values are logged.
"""
from __future__ import annotations

import json
import logging
import os
from typing import Any, Final

import boto3

logger = logging.getLogger()
logger.setLevel(logging.INFO)

ec2 = boto3.client("ec2")
sns = boto3.client("sns")

# Finding types allowed to trigger automatic containment
# (limited to high-confidence types with few false positives).
CONTAIN_ALLOWLIST: Final[frozenset[str]] = frozenset(
    {
        "UnauthorizedAccess:EC2/MaliciousIPCaller.Custom",
        "CryptoCurrency:EC2/BitcoinTool.B!DNS",
        "Backdoor:EC2/C&CActivity.B!DNS",
    }
)

QUARANTINE_SG_ID: Final[str] = os.environ["QUARANTINE_SG_ID"]
ALERT_TOPIC_ARN: Final[str] = os.environ["ALERT_TOPIC_ARN"]
# While you only want to observe, DRY_RUN=true suppresses the quarantine action.
DRY_RUN: Final[bool] = os.environ.get("DRY_RUN", "false").lower() == "true"


def handler(event: dict[str, Any], _context: object) -> dict[str, str]:
    detail = event["detail"]
    finding_type: str = detail["type"]
    severity: float = float(detail["severity"])
    finding_id: str = detail["id"]
    log = {"finding_id": finding_id, "type": finding_type, "severity": severity}

    # Containment condition: type is on the allowlist, severity is High or above,
    # and the finding actually names an EC2 instance.
    instance_id = (
        detail.get("resource", {}).get("instanceDetails", {}).get("instanceId")
    )
    should_contain = (
        finding_type in CONTAIN_ALLOWLIST and severity >= 7.0 and bool(instance_id)
    )

    if should_contain and not DRY_RUN:
        action = _quarantine_instance(instance_id)  # idempotent
    else:
        action = "dry-run" if (should_contain and DRY_RUN) else "notify-only"

    _notify(finding_type, severity, finding_id, instance_id, action)
    logger.info(json.dumps({**log, "action": action}))
    return {"action": action}


def _quarantine_instance(instance_id: str) -> str:
    """Leave the instance with only the quarantine SG. Idempotent: no-op if already quarantined."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]

    # Idempotency guard: skip if the quarantine tag already exists (a 2nd invocation is a no-op).
    tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
    if tags.get("guardduty:quarantined") == "true":
        return "already-quarantined"

    # Swap every ENI to the quarantine SG only (egress blocking is defined on the SG side).
    for eni in instance["NetworkInterfaces"]:
        ec2.modify_network_interface_attribute(
            NetworkInterfaceId=eni["NetworkInterfaceId"],
            Groups=[QUARANTINE_SG_ID],
        )

    ec2.create_tags(
        Resources=[instance_id],
        Tags=[{"Key": "guardduty:quarantined", "Value": "true"}],
    )
    return "quarantined"


def _notify(
    finding_type: str, severity: float, finding_id: str, instance_id: str | None, action: str
) -> None:
    region = os.environ.get("AWS_REGION", "ap-northeast-1")
    console = (
        f"https://{region}.console.aws.amazon.com/guardduty/home"
        f"?region={region}#/findings?fId={finding_id}"
    )
    sns.publish(
        TopicArn=ALERT_TOPIC_ARN,
        # SNS subjects are capped at 100 characters, so truncate defensively.
        Subject=f"[GuardDuty][{severity}] {finding_type}"[:100],
        Message="\n".join(
            [
                f"type: {finding_type}",
                f"severity: {severity}",
                f"instance: {instance_id or 'n/a'}",
                f"action: {action}",
                f"console: {console}",
            ]
        ),
    )
Let me make this code's design judgments explicit.
- Idempotent: guard with the quarantine tag, and even if redelivered, the 2nd time is a no-op via already-quarantined. Without this, at-least-once delivery runs "the same containment twice," producing wasted API calls and contention.
- Scope-narrowed: the double condition CONTAIN_ALLOWLIST × severity >= 7.0. It doesn't fire containment on false-positive-heavy types or Low/Medium.
- Reversible: all it does is "swap to the quarantine SG + tag." It doesn't terminate the instance or delete keys (if it later turns out to be legitimate use, you can revert). Note that reverting requires knowing the original security groups, which this minimal code does not record.
- Staged rollout: with DRY_RUN=true, it has a "decide but don't act" mode, letting you verify only the decision logic against real traffic before production.
- Least privilege: limit this Lambda's execution role to ec2:ModifyNetworkInterfaceAttribute / ec2:CreateTags / ec2:DescribeInstances / sns:Publish, and narrow Resource as much as possible.
Testability: the decision logic (computing should_contain) can be carved into a side-effect-free pure function. Swap the boto3 client and you can cover the decision in unit tests with sample finding JSON as input. Combine it with 5.3's create-sample-findings and you can verify the response path in both code and infrastructure.
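One way the carve-out described above might look is sketched here. decide_action is a hypothetical name, and the "contain" result is a decision label only; the actual quarantine call stays in the Lambda. The allowlist and threshold repeat the values used in the handler.

```python
from __future__ import annotations

from typing import Any, Mapping

CONTAIN_ALLOWLIST = frozenset(
    {
        "UnauthorizedAccess:EC2/MaliciousIPCaller.Custom",
        "CryptoCurrency:EC2/BitcoinTool.B!DNS",
        "Backdoor:EC2/C&CActivity.B!DNS",
    }
)
HIGH_THRESHOLD = 7.0


def decide_action(detail: Mapping[str, Any], *, dry_run: bool = False) -> str:
    """Pure decision: no AWS calls, so it can be unit-tested with finding JSON."""
    finding_type = detail.get("type", "")
    severity = float(detail.get("severity", 0))
    instance_id = (
        detail.get("resource", {}).get("instanceDetails", {}).get("instanceId")
    )
    should_contain = (
        finding_type in CONTAIN_ALLOWLIST
        and severity >= HIGH_THRESHOLD
        and bool(instance_id)
    )
    if not should_contain:
        return "notify-only"
    return "dry-run" if dry_run else "contain"


if __name__ == "__main__":
    sample = {
        "type": "CryptoCurrency:EC2/BitcoinTool.B!DNS",
        "severity": 8.0,
        "resource": {"instanceDetails": {"instanceId": "i-0123456789abcdef0"}},
    }
    print(decide_action(sample), decide_action(sample, dry_run=True))
```

The handler would then call the side-effecting quarantine only when this function returns "contain", which keeps every branch of the routing decision coverable without mocking boto3.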
7. Suppressing False Positives and Noise: Suppression Rule / Trusted IP / Threat List
The first thing you face in GuardDuty operation is the noise of "legitimate operations producing findings." A vulnerability scanner's periodic run, admin access from a specific IP, etc. There are 3 tools for tuning.
- Suppression Rule: auto-archive findings matching a filter condition, removing them from the dashboard and notifications. Use it for known noise like "a Recon:* by our own vulnerability scanner."
- Trusted IP list: traffic from an IP placed here doesn't generate findings. For your own fixed IPs, VPN, monitoring services, etc.
- Threat list: register IPs/domains you've judged "this is malicious" yourself. A match produces a finding with the .Custom detection mechanism (the .Custom of 5.1).
# Example: register a trusted IP list (a txt file on S3) and activate it immediately.
aws guardduty create-ip-set \
  --detector-id "$DETECTOR_ID" \
  --name "corporate-egress" \
  --format TXT \
  --location "s3://my-sec-config/trusted-ips.txt" \
  --activate
The Suppression Rule trap (interaction with Extended Threat Detection): ETD does not target archived findings (including those auto-archived by a suppression rule) for correlation. That is, "erasing a weak signal that looks like noise alone with suppression can also prevent the detection of the attack sequence (Critical) it was originally supposed to bundle." Limit suppression to "noise you're certain is harmless," and silence signals that could become an attack's context not with suppression but with notification-side filters (the severity threshold or type routing of Chapter 6) — this distinction works in production.
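To keep a suppression rule as narrow as the paragraph above demands, its criteria can pin both the exact finding type and the exact source. The sketch below only builds the arguments for boto3's guardduty create_filter call (Action="ARCHIVE" is what makes a filter a suppression rule); the function name, the rule name, and the choice of Recon:EC2/Portscan scoped to one scanner instance are illustrative assumptions.

```python
from __future__ import annotations

from typing import Any


def scanner_suppression_filter(detector_id: str, scanner_instance_id: str) -> dict[str, Any]:
    """Build create_filter arguments that archive one finding type from one instance."""
    return {
        "DetectorId": detector_id,
        "Name": "suppress-internal-scanner-portscan",  # hypothetical rule name
        "Description": "Archive port-scan findings raised by our own vulnerability scanner",
        "Action": "ARCHIVE",  # ARCHIVE = suppression rule; NOOP = a saved filter only
        "FindingCriteria": {
            "Criterion": {
                # Both conditions must match, so other instances and other types still alert.
                "type": {"Equals": ["Recon:EC2/Portscan"]},
                "resource.instanceDetails.instanceId": {"Equals": [scanner_instance_id]},
            }
        },
    }


if __name__ == "__main__":
    kwargs = scanner_suppression_filter("DETECTOR_ID_PLACEHOLDER", "i-0123456789abcdef0")
    print(kwargs)
    # To apply it (needs boto3 and credentials):
    #   boto3.client("guardduty").create_filter(**kwargs)
```

Scoping by instance rather than by type alone leaves the same finding type visible to ETD when it comes from any other resource.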
8. Production Operation: Observability, Cost, Not Over-Trusting
8.1 Aggregate Findings in One Place
Findings from multiple accounts and Regions are auto-aggregated in the delegated-administrator account. Further, flow them to AWS Security Hub CSPM and you can prioritize cross-cuttingly alongside Macie and third-party detection. If deep investigation is needed, Amazon Detective (graph visualization via GuardDuty integration); for long-term retention/analysis, set up finding export to S3. "Detect with GuardDuty → aggregate with Security Hub → deep-dive with Detective" is the royal three-piece set.
8.2 The Cost Model (Design Proportional to Assets)
GuardDuty's pricing is usage-based for what you enable, differing by Region. Pin down the concept and the budget becomes readable.
| Billing target | Billing unit |
|---|---|
| Foundation: CloudTrail management events | Per 1M events |
| Foundation: VPC Flow Logs / DNS logs | Per GB (tapering with volume) |
| S3 Protection | Per 1M S3 data events |
| EKS Protection | Per 1M audit logs |
| Runtime Monitoring | Per protected vCPU hour (tends to be highest) |
| Malware Protection for EC2 | Per scanned GB |
| Malware Protection for S3 | Scanned GB + objects evaluated |
| RDS Protection | Per vCPU (ACU for Aurora Serverless) |
| Lambda Protection | Per GB as VPC Flow Logs analysis |
A newly-enabled Region gets a 30-day free trial, letting you grasp the estimated bill at production volume in advance. From the delegated-administrator account, you can also see per-member-account estimated usage. The iron rule of cost optimization is the repeat of Chapter 4 — on a foundation of ETD (free) + S3 Protection, limit Runtime Monitoring to important workloads. "Just turn it all on" tends to result in paying vCPU billing even for targets with thin detection value.
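The "enabled plans × monitored volume" idea can be turned into a rough estimator once the free trial has shown your usage. This is a deliberately simplified sketch: it assumes flat per-unit rates (real pricing is tiered and differs by Region), the line-item keys are hypothetical, and the prices in the demo are placeholders, not AWS prices.

```python
from __future__ import annotations

from typing import Mapping


def estimate_monthly_cost(
    usage: Mapping[str, float],
    unit_price: Mapping[str, float],
) -> dict[str, float]:
    """Multiply each line item's monthly usage by its unit price (flat-rate assumption)."""
    missing = sorted(set(usage) - set(unit_price))
    if missing:
        raise KeyError(f"no unit price supplied for: {missing}")
    return {item: usage[item] * unit_price[item] for item in usage}


if __name__ == "__main__":
    # Usage figures would come from the trial's estimated-usage view.
    usage = {"cloudtrail_mgmt_million_events": 10.0, "runtime_vcpu_months": 4.0}
    # PLACEHOLDER prices for illustration only; look up the current Region's rates.
    placeholder_price = {"cloudtrail_mgmt_million_events": 2.0, "runtime_vcpu_months": 1.5}
    per_item = estimate_monthly_cost(usage, placeholder_price)
    print(per_item, sum(per_item.values()))
```

Keeping the result per line item makes it easy to see which plan dominates the bill before deciding where to limit Runtime Monitoring.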
8.3 Don't Over-Trust GuardDuty — Re-Confirming Defense in Depth
Finally, back to the initial mental model. GuardDuty is detection, not prevention. Even with GuardDuty:
- The entrance defense needs WAF (L7 filtering).
- To reduce credential leakage itself, you need OIDC keyless CI/CD and least-privilege IAM.
- Data-layer protection (encryption, VPC endpoints, fine-grained access control) is separately needed.
Within this defense in depth, GuardDuty is the layer that quickly flags signs of compromise and triggers a response. Quickly finding and containing an attack that slipped past the prevention layers: that is the correct role to give GuardDuty.
9. Summary: A GuardDuty Production Cheat Sheet
A quick reference for when you're lost.
- What kind of service: GuardDuty is detection, not prevention. Its value is decided by "detection accuracy × post-detection response speed (MTTR)."
- Foundational detection: on as soon as GuardDuty is enabled, with nothing extra to set up. It analyzes an independent, duplicated stream of CloudTrail management events / VPC Flow Logs / DNS logs, so you don't enable, store, or separately pay for the source logs, and no agent is needed (GuardDuty's own analysis is still billed by volume, as in 8.2). DNS logs are covered only when you use the AWS resolver.
- Protection plans: add them to match your assets (YAGNI). S3 Protection is recommended in almost every case; Runtime Monitoring tends to be the most expensive because it bills per vCPU, so limit it to important workloads. Malware Protection for S3 can also be used standalone.
- Extended Threat Detection: free, automatic. Correlates weak signals within a 24-hour window and bundles a multi-stage attack into one `AttackSequence:*` finding (always Critical 9.0–10.0). S3/EKS correlation needs the corresponding protection plan.
- Org control: with a delegated administrator + `auto_enable_organization_members = "ALL"`, enable all accounts and all Regions without omission. For global events, enable unused Regions too.
- Findings: meaning is encoded in the type name `ThreatPurpose:Resource/Family.Mechanism!Artifact`. Severity is a number (Critical 9.0–10.0 / High 7.0–8.9 / Medium 4.0–6.9 / Low 1.0–3.9). Pre-verify the response path with `create-sample-findings`.
- Automated response (most important): pick up High-or-above with EventBridge's `detail.severity` numeric match. Notify broadly, contain narrowly with a type allowlist, interpose a human for destructive operations. EventBridge is at-least-once → idempotency mandatory. To shrink the update delay, `finding_publishing_frequency = FIFTEEN_MINUTES`.
- Tuning: a Suppression Rule only for "noise you're certain is harmless." Don't erase, with suppression, signals that become material for an attack sequence (ETD doesn't see archived ones).
- Don't over-trust: GuardDuty is one layer of defense in depth. It doesn't replace WAF, least-privilege IAM, keyless auth, or encryption.
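The finding-related bullets above (severity bands, the type-name format, "notify broadly, contain narrowly," and idempotency) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production handler: `severity_band`, `parse_finding_type`, and `plan_actions` are hypothetical helper names, the in-memory `done` set stands in for a durable store (for example a conditional write to a database), and the finding id is a made-up value. The event pattern uses EventBridge's numeric matching on `detail.severity`.

```python
import re

# EventBridge event pattern (as a Python dict) that matches High-or-above
# findings through a numeric match on detail.severity.
HIGH_OR_ABOVE_PATTERN = {
    "source": ["aws.guardduty"],
    "detail-type": ["GuardDuty Finding"],
    "detail": {"severity": [{"numeric": [">=", 7]}]},
}

# Severity bands from the cheat sheet: (lower bound, label), highest first.
_BANDS = [(9.0, "Critical"), (7.0, "High"), (4.0, "Medium"), (1.0, "Low")]

# ThreatPurpose:Resource/Family.Mechanism!Artifact
# Mechanism and Artifact are optional parts of the type name.
_TYPE_RE = re.compile(
    r"^(?P<purpose>[^:]+):(?P<resource>[^/]+)/(?P<family>[^.!]+)"
    r"(?:\.(?P<mechanism>[^!]+))?(?:!(?P<artifact>.+))?$"
)


def severity_band(severity):
    """Map a numeric severity to its band; values below 1.0 give 'Unknown'."""
    for lower, label in _BANDS:
        if severity >= lower:
            return label
    return "Unknown"


def parse_finding_type(finding_type):
    """Split a finding type into its named parts (missing parts are None)."""
    match = _TYPE_RE.match(finding_type)
    if match is None:
        raise ValueError(f"unrecognized finding type: {finding_type!r}")
    return match.groupdict()


def plan_actions(detail, done, contain_allowlist):
    """Decide what to do for one finding.

    `done` holds (finding id, action) keys that were already handled, so a
    duplicate delivery of the same finding produces no further actions.
    """
    if severity_band(float(detail["severity"])) not in ("High", "Critical"):
        return []
    wanted = ["notify"]                        # notify broadly
    if detail["type"] in contain_allowlist:    # contain narrowly
        wanted.append("contain")
    actions = []
    for action in wanted:
        key = (detail["id"], action)
        if key not in done:                    # at-least-once: skip repeats
            done.add(key)
            actions.append(action)
    return actions


done = set()  # assumption: replace with a durable store in real operation
allowlist = {"UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS"}
detail = {
    "id": "example-finding-id",  # made-up id for illustration
    "type": "UnauthorizedAccess:IAMUser/InstanceCredentialExfiltration.OutsideAWS",
    "severity": 8,
}
print(plan_actions(detail, done, allowlist))  # ['notify', 'contain']
print(plan_actions(detail, done, allowlist))  # [] (duplicate delivery)
print(parse_finding_type("Backdoor:EC2/C&CActivity.B!DNS"))
```

Keying on the finding id means an updated finding does not trigger a second notification or containment; whether that is the right granularity depends on how you want updates handled, so treat it as one possible design choice.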
GuardDuty is not "a box that protects you if you enable it" but a service whose value changes 10× by how you design "detection → correlation → action." The biggest leverage lies not in detection itself but in the plumbing that turns findings into idempotent, safe automated response.
On a multi-account serverless payment platform, I cross-implemented the IAM, observability, and DR of a foundation handling actual money, carbon credits, and local currencies, guaranteeing "correctness" not with operational attentiveness but with the structure of code and idempotency. I design GuardDuty's introduction with the same philosophy — ① erase omissions structurally with org-wide enablement, ② stack protection plans at minimal cost to match the assets, ③ shrink MTTR with EventBridge → idempotent, scope-narrowed, reversible automated response. I don't let detection end at a "red dashboard" but build it through to a response mechanism that becomes operational.
"How to design GuardDuty for our AWS, how far to entrust automated response, and how to keep cost down" — from protection-plan selection through Terraform implementation, EventBridge automated response, org control, and false-positive tuning, I can accompany you, fast and safe with one person × generative AI (Claude Code). Feel free to consult me even from the requirements-organizing stage.
References (Official Documentation)
- What is Amazon GuardDuty? — service definition, feature list (foundational detection / ETD / protection plans / Investigation preview)
- GuardDuty foundational data sources — the "independent and duplicated stream" of CloudTrail management events / VPC Flow Logs / DNS and global-event processing
- GuardDuty Extended Threat Detection — signal correlation, the 24-hour window, attack sequences (always Critical), the relationship with protection plans
- GuardDuty finding format — `ThreatPurpose:Resource/Family.Mechanism!Artifact` and the ThreatPurpose list (MITRE ATT&CK correspondence)
- Severity levels of GuardDuty findings — the numeric bands of Critical/High/Medium/Low (1.0–10.0)
- Creating custom responses to GuardDuty findings with Amazon EventBridge — finding event structure, publishing frequency, automated response
- Managing multiple accounts with AWS Organizations — delegated administrator, auto-enable (ALL/NEW)
- Suppression rules — suppression rules and the interaction with Extended Threat Detection
- Amazon GuardDuty pricing — the breakdown of usage-based billing, the 30-day free trial, per-Region