# Auto-Scanning Uploaded Files with GuardDuty Malware Protection for S3: Standalone Operation, Scan-Result Gating, and the Difference from S3 Protection in Real Code

> A production design guide for auto-malware-scanning uploaded S3 objects with GuardDuty Malware Protection for S3. Explained with real Terraform / Python / bucket-policy code: the difference from the easily-confused 'S3 Protection (CloudTrail data-event monitoring),' the standalone operation mode used without GuardDuty itself (no detector ID = no finding generated), scan-result tags (GuardDutyMalwareScanStatus) and EventBridge events, and a secure upload pipeline that promotes only NO_THREATS_FOUND to a clean bucket and seals off reads with tag-based access control (TBAC).

- Published: 2026-06-27
- Author: 友田 陽大
- Tags: Security, AWS, GuardDuty, S3, Malware Protection
- URL: https://tomodahinata.com/en/blog/aws-guardduty-malware-protection-s3-standalone-scanning-guide
- Category: Amazon GuardDuty in production
- Pillar guide: https://tomodahinata.com/en/blog/aws-guardduty-threat-detection-multi-account-terraform-eventbridge-guide

## Key points

- 'S3 Protection' and 'Malware Protection for S3' are different things. The former monitors CloudTrail's S3 data events to detect suspicious access / data exfiltration (GuardDuty required), the latter malware-scans newly uploaded objects. Not confusing them is the starting point of design
- Malware Protection for S3 can be used 'standalone' without enabling GuardDuty itself. A standalone-mode account has no detector ID, so even if it detects malware no GuardDuty finding is generated — the result appears only on EventBridge's default bus + CloudWatch + (optionally) object tags
- The scan result can be written to the object tag GuardDutyMalwareScanStatus, which takes one of five values: NO_THREATS_FOUND / THREATS_FOUND / UNSUPPORTED / ACCESS_DENIED / FAILED (the exact official set). EventBridge delivery is at-least-once, so make the result-processing handler idempotent
- A high-value application: upload to a landing bucket (protection enabled) → EventBridge scan result → an idempotent Lambda promotes only NO_THREATS_FOUND to a clean bucket and quarantines THREATS_FOUND and every unscannable result. A tag-based access control (TBAC) DENY policy makes objects without the clean tag unreadable downstream at the storage layer
- Limits and pricing (verify the latest officially): max object size 100 GB, 25 protected buckets/account/region, own account & same region only. Pricing is usage-based on 'scanned GB + objects evaluated,' with a free tier of 1,000 requests + 1 GB per month (on-demand and tagging are not in the free tier)

---

"The files users upload: **who** guarantees they're not malware?" This is the first question I ask whenever I'm consulted about the security of a SaaS with an upload feature.

Usually what comes back is silence, or "we reject by extension," "we check the MIME on the front." But that's a **name-tag check at the entrance**, not a scan of the contents. An attacker can fake both the extension and the Content-Type. An uploaded PDF is actually an executable, and another user downloads and opens it — then your app becomes **a distribution route for malware.** And what's frightening is that it happens **through the legitimate user flow.**

This article is a guide to designing and implementing, at **production quality**, a mechanism built on **GuardDuty Malware Protection for S3** that automatically malware-scans objects uploaded to S3 and lets only those confirmed clean flow downstream. Throughout, I'll draw on my experience implementing IAM, observability, and DR across a [serverless payment platform](/case-studies/payment-platform-reliability) on multi-account AWS, and in particular on **using idempotency, on a platform handling actual money, to ensure that "the same event received twice has a one-time side effect."** That idea carries over directly to safely handling at-least-once scan-result events.

> **The rule of this article**: The specs, tag values, limits, pricing, and EventBridge event structure are based on the **AWS official documentation (as of June 2026)**. Because **limits, pricing, and supported Regions get revised**, always confirm the latest values (the quotas / pricing pages) against the official documentation before going to production. One more thing: **GuardDuty Malware Protection for S3 is not a "full AV / EDR."** It's **one layer** that detects known and some unknown malware with a scan engine, and it doesn't replace input validation, least-privilege IAM, encryption, or WAF. When you automate the response to a scan result, start with actions that are **idempotent, narrowly scoped, and reversible.**

---

## 0. Mental model: this is "a quarantine for uploads," not "a constant surveillance camera"

Before starting the design, let me separate, in one line, the **two easily-confused S3 security features.** Unless this is fixed first, the requirement and the enabled feature end up mismatched.

> **S3 Protection = monitor "access (API operations)" to S3 and detect suspicious behavior (CloudTrail data events). Malware Protection for S3 = malware-scan the "contents of files" uploaded to S3. The former is "who did what," the latter is "what came in."**

From here, three consequences emerge. These are the foundation of the design decisions.

1. **What they look at differs.** **S3 Protection**, included in [GuardDuty threat detection](/blog/aws-guardduty-threat-detection-multi-account-terraform-eventbridge-guide), analyzes CloudTrail's S3 **data events** (`GetObject` / `PutObject` / `ListObjects` / `DeleteObject`, etc.) and detects "**suspicious access** or data exfiltration using legitimate credentials." On the other hand, **Malware Protection for S3** downloads the **contents of a newly uploaded object** and **malware-scans** it. The former looks at **behavior**, the latter at **contents** — different things (I'll settle it with a table in Section 1).
2. **Malware Protection for S3 works without GuardDuty itself.** This is the most distinctive property of the feature this article centers on. Without enabling the GuardDuty service, you can use **just this feature standalone (independent).** But in standalone mode, **the account has no detector ID**, so even if it detects malware **no GuardDuty finding is generated.** The result appears only on EventBridge's default bus, CloudWatch, and (optionally) object tags (Section 2).
3. **Detection alone isn't safe. You need the plumbing of "quarantine → isolate → promote."** Even if you scan and learn "it's malware," it's meaningless if that contaminated object is **still in a place readable downstream.** The climax of this article is the plumbing that **physically makes a contaminated object unreadable downstream and flows only clean ones** (Section 5's secure upload pipeline + TBAC).

With these three points in hand, the work reduces to three steps: **"① don't mix up the 2 features → ② correctly enable the protection plan (standalone or with GuardDuty) → ③ turn scan results into 'quarantine / promote' with idempotent plumbing."** Let's build them in order.

---

## 1. Settling the confusion: S3 Protection vs. Malware Protection for S3

Because the names are similar, the most common accident in the field is **"a mismatch between what you want to do and the feature you enabled."** First, let me settle it head-on.

| Aspect | **S3 Protection** | **Malware Protection for S3** |
| --- | --- | --- |
| What it looks at | **API access** to S3 (CloudTrail **data events**) | The **contents of an uploaded object** (malware) |
| Threat it detects | Suspicious access, **data exfiltration / destruction** (misuse of leaked credentials, etc.) | **Malware** contained in an uploaded file |
| Trigger | **Operations** like `GetObject` / `PutObject` / `ListObjects` / `DeleteObject` | A **new upload** of an object (or a new version) |
| GuardDuty itself | **Required** (it is a GuardDuty protection plan) | **Not required** (can run standalone) |
| Feature name / resource | feature `S3_DATA_EVENTS` (a detector feature) | Malware Protection plan (a per-bucket resource) |
| Result output | A **GuardDuty finding** | A finding (when with GuardDuty) / EventBridge + CloudWatch + tags |
| Billing unit | S3 data-event volume | **Scanned GB + objects evaluated** |

Borrowing the official wording, S3 Protection *"helps you detect potential security risks for data, such as data exfiltration and destruction"* — that is, **detecting threats to data (exfiltration / destruction).** Malware Protection for S3 *"helps you detect potential presence of malware by scanning newly uploaded objects"* — **malware-scanning new uploads.**

> **A design guideline**: these two **don't compete; they complement.** For a bucket that "accepts uploads" and "holds sensitive data," it's ideal to enable **both.** Malware Protection for S3 watches "the bad things coming in," and S3 Protection watches "exfiltration mixed into legitimate access." Note that S3 Protection is a GuardDuty protection plan, so for how to enable it, see the `S3_DATA_EVENTS` feature in the pillar article. **From here on, this article narrows to Malware Protection for S3.**

> **Beware a third, even more easily confused feature**: there's also "Malware Protection for **EC2**." It agentlessly scans the **EBS volumes** attached to EC2 instances and container workloads, so its target is completely different from this article's **for S3.** Saying just "Malware Protection" gets the three features mixed up, so always say **for S3 / for EC2** explicitly.

---

## 2. How Malware Protection for S3 works: the decisive difference between standalone and "with GuardDuty"

### 2.1 The scan-on-upload model

The mechanism is simple. **When an object is newly uploaded to a bucket with protection enabled (officially a "protected bucket") (or a new version of an existing object is uploaded), GuardDuty automatically starts a malware scan.**

What triggers the scan is S3's **Object Created**-family events: `PutObject` / `POST Object` / `CopyObject` / `CompleteMultipartUpload`. GuardDuty **downloads the target object via AWS PrivateLink, and decrypts, reads, and scans it in an isolated environment (an internet-disconnected VPC) in the same Region.** The temporary copy during the scan is KMS-encrypted, and **the downloaded copy is deleted after the scan completes.** In other words, **the object never leaves the Region for the scan, no copy of it persists afterward, and only the result metadata remains.**

### 2.2 The two enablement approaches — the core of this article

Malware Protection for S3 has **two ways** to enable it. This difference divides the operational design.

| | **(a) Use it with GuardDuty** | **(b) Use it standalone (independent)** |
| --- | --- | --- |
| The GuardDuty service | **Enabled** (a detector ID exists) | **Can stay disabled** (no detector ID) |
| A finding on malware detection | **A GuardDuty finding is generated** | **No finding is generated** |
| Receiving the result | A finding (+ export to S3/EventBridge) | Only the **EventBridge default bus + CloudWatch + (optional) object tags** |
| Correlation with existing GuardDuty detections | Possible (lines up with ETD and other detections) | Not possible (an isolated single feature) |
| Suited case | You already operate GuardDuty company-wide | A minimal configuration of "just want to scan uploads" |

Accurately grasp the **decisive property of standalone mode** the official docs write clearly:

> *"When you enable Malware Protection for S3 independently in an account, that account will **not** have an associated detector ID. ... when an S3 malware scan detects the presence of malware, **no GuardDuty finding will get generated** in your AWS account because all GuardDuty findings are associated with a detector ID."*

That is — **in standalone mode, "even if it finds malware, nothing appears on the GuardDuty dashboard."** This is not a defect but a **design.** Because a GuardDuty finding is tied to a detector ID, no detector means no finding. Instead, the result appears on the **EventBridge default event bus**, **CloudWatch metrics**, and, **if enabled, object tags.**

> **So in a standalone-mode design, EventBridge and tags are "the only exit for detection."** Build alerts on the premise that "a finding will come," and in standalone mode **the alerts never ring.** The pipeline described later is built around this EventBridge event.

### 2.3 Trade-off: which to choose

- **If you already operate GuardDuty in your organization → (a) with GuardDuty.** It rides naturally onto the existing incident-response plumbing ([EventBridge → automated response](/blog/aws-guardduty-eventbridge-automated-remediation-incident-response-guide)) as a finding. The value of malware detection lining up on the **same playing field** as other threat signals is large.
- **If "I want to scan just this upload bucket" / "I don't need all of GuardDuty right now" → (b) standalone.** You can introduce exactly this one feature, at minimal cost and with least privilege. If you enable GuardDuty itself later, findings will appear too.

> **My recommendation**: first **start small in standalone mode**, receive scan results with EventBridge, and build the plumbing. The company-wide rollout of GuardDuty itself is a big decision in its own right, so don't take "upload quarantine," a single requirement, hostage to it. Even if you later promote to (a), **the plumbing built around EventBridge can be reused as-is** (the finding route is just added).

---

## 3. The enablement components: the Malware Protection plan, IAM role, prefixes, limits

### 3.1 What you can and can't do (fix the constraints first)

- **Only your own account's buckets.** Even a delegated GuardDuty administrator **can't enable it on a member account's bucket** (it's closed within the same account).
- **Same Region only.** A cross-Region bucket is out of scope.
- A per-bucket **"Malware Protection plan" resource** is created, with a unique plan ID. GuardDuty **auto-creates and manages an EventBridge managed rule** named `DO-NOT-DELETE-AmazonGuardDutyMalwareProtectionS3*` (don't delete it by hand).
- **You can scope by prefix.** Rather than the whole bucket, you can target only specific **object prefixes (up to 5)** for scanning. Effective for a design of "the upload receiver is only under `uploads/`."
- **Supports KMS-encrypted buckets** (decrypted inside the scan environment). But **objects with SSE-C (customer-provided keys) can't be scanned** (the later `ACCESS_DENIED` reason `SSE_C_ENCRYPTED_OBJECT`).
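
The prefix constraint has a practical consequence: an upload that lands outside the protected prefixes is never auto-scanned. A minimal sketch of a guard you might run wherever upload keys are issued (for example, before handing out a presigned URL). The helper names are hypothetical, and it assumes plain string-prefix matching and the 5-prefix limit stated above:

```python
from __future__ import annotations

# Per-plan prefix limit as stated in this article; confirm against the current quotas.
MAX_PREFIXES = 5


def validate_prefixes(prefixes: list[str]) -> list[str]:
    """Reject a prefix list longer than the per-plan limit."""
    if len(prefixes) > MAX_PREFIXES:
        raise ValueError(f"at most {MAX_PREFIXES} prefixes per plan, got {len(prefixes)}")
    return prefixes


def is_auto_scanned(key: str, prefixes: list[str]) -> bool:
    """True if an upload to `key` falls under a protected prefix.

    An empty prefix list is treated here as 'the whole bucket is protected'.
    """
    if not prefixes:
        return True
    return any(key.startswith(prefix) for prefix in prefixes)


if __name__ == "__main__":
    protected = validate_prefixes(["uploads/"])
    print(is_auto_scanned("uploads/report.pdf", protected))
    print(is_auto_scanned("exports/report.csv", protected))
```

Refusing to issue an upload key that fails this check keeps the application's idea of "scanned" aligned with the plan's actual scope.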

### 3.2 The IAM role: least privilege making GuardDuty act "on your behalf"

Malware Protection for S3 requires an **IAM role for GuardDuty to run scans in your account.** The permissions this role needs fall into roughly the following three categories:

1. **Receive notification of new uploads** (via the EventBridge managed rule)
2. **Read and decrypt the target object** (`s3:GetObject` + `kms:Decrypt` if needed)
3. **(Optional) tag after the scan** (`s3:PutObjectTagging`)

The role's **trust policy** lets the GuardDuty Malware Protection service principal `sts:AssumeRole`. It's safe to follow the official IAM policy template, and the recommended operation is to **add target bucket names to the same role when adding buckets.** Per the principle of least privilege, narrow the `Resource` to "only this bucket, this prefix" (the same shape as the [least-privilege thinking at the data layer](/blog/dynamodb-security-iam-fine-grained-access-control-encryption-vpc-endpoint-guide)).
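
To make the shape of this role concrete, here is a hedged sketch that builds the trust policy and a prefix-scoped read/tag statement as Python dictionaries. It is not the official template: the service principal name is my understanding of what the template uses and should be verified there, the function names are hypothetical, and the statements for the EventBridge managed rule, bucket notifications, and KMS are deliberately omitted.

```python
from __future__ import annotations

import json

# Assumption: principal name as I understand the official role template; verify it there.
GD_MALWARE_S3_PRINCIPAL = "malware-protection-plan.guardduty.amazonaws.com"


def build_trust_policy() -> dict:
    """Trust policy letting the Malware Protection service principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": GD_MALWARE_S3_PRINCIPAL},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_scan_permissions(bucket: str, prefix: str, allow_tagging: bool) -> dict:
    """Read (and optionally tag) permissions narrowed to one bucket and one prefix.

    Covers only categories 2 and 3 above; take the remaining statements from the
    official template rather than from this sketch.
    """
    objects_arn = f"arn:aws:s3:::{bucket}/{prefix}*"
    statements = [
        {
            "Sid": "AllowMalwareScanRead",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": objects_arn,
        }
    ]
    if allow_tagging:
        statements.append(
            {
                "Sid": "AllowPostScanTag",
                "Effect": "Allow",
                "Action": ["s3:PutObjectTagging", "s3:PutObjectVersionTagging"],
                "Resource": objects_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(), indent=2))
    print(json.dumps(build_scan_permissions("my-upload-landing", "uploads/", True), indent=2))
```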

### 3.3 Limits (the exact official values, verify the latest)

| Limit | Default value | Adjustable | Note |
| --- | --- | --- | --- |
| Max S3 object size | **100 GB** | No | If you need a larger target, consult AWS Support |
| Extracted file count | **100,000** | No | The max number of files expandable/analyzable in an archive |
| Max nesting depth | **100** | No | The max levels of archive nesting |
| Max protected buckets | **25** | No | Per **account × Region** |

Exceed these and the scan is **skipped**, and the result becomes `UNSUPPORTED` (example reasons: `OBJECT_SIZE_LIMIT_EXCEEDED` / `EXTRACTED_FILE_LIMIT_EXCEEDED` / `EXTRACTED_LEVEL_LIMIT_EXCEEDED` / `EXTRACTION_RATIO_LIMIT_EXCEEDED`). Always handle the point that **"couldn't scan ≠ safe"** in the later plumbing (Section 5).

### 3.4 On-demand scanning: for existing objects / re-scanning

Automatic scanning runs against **new uploads**, but for **objects that existed before you enabled protection** or to **re-scan something already scanned**, use **on-demand scanning.**

```bash
# Scan an existing object (latest version) on demand.
# Prerequisites: Malware Protection for S3 is enabled on the target bucket, and the caller
#                has the AWS managed policy AmazonGuardDutyFullAccess_v2 attached.
aws guardduty send-object-malware-scan \
  --s3-object '{"Bucket": "my-upload-landing", "Key": "uploads/legacy-file.pdf"}'

# To scan a specific version, pass VersionId.
aws guardduty send-object-malware-scan \
  --s3-object '{"Bucket": "my-upload-landing", "Key": "uploads/legacy-file.pdf", "VersionId": "d41d8cd9...EXAMPLE"}'
```

> **Cautions**: on-demand scanning **overrides the plan's prefix setting** (you can target outside the prefix), and **the limits and pricing apply the same** as automatic scanning. Importantly, **on-demand is not in the free tier.** A success response means **only that the request was accepted**, not that the scan completed, so always confirm the result with EventBridge / tags / CloudWatch.

---

## 4. Reading the scan result: tags, status values, EventBridge events

To build automated processing, you need to accurately read the **structure of the result.** There are 3 exits — **object tags**, **EventBridge events**, and **CloudWatch metrics.**

### 4.1 Scan-result tags (optional, but must be enabled "before upload")

Enable tagging and after the scan GuardDuty attaches a **predefined tag** to the object. The key and value are fixed officially:

```text
Key:    GuardDutyMalwareScanStatus
Value:  NO_THREATS_FOUND | THREATS_FOUND | UNSUPPORTED | ACCESS_DENIED | FAILED
```

| Result value | Meaning | Scan status |
| --- | --- | --- |
| `NO_THREATS_FOUND` | No threats detected | Completed |
| `THREATS_FOUND` | A threat detected | Completed |
| `UNSUPPORTED` | Unscannable (password-protected, size/compression-ratio exceeded, unsupported S3 feature, etc.) | Skipped |
| `ACCESS_DENIED` | Can't access the object (IAM role permissions, SSE-C, etc.) | Skipped |
| `FAILED` | Couldn't scan due to an internal error | Failed |

> **A fatal pitfall**: unless tagging is **enabled "before" the object is uploaded**, enabling it later **won't tag that object.** So the iron rule is the order "create the bucket → enable protection + tagging → then start accepting uploads." Also, the max tags attachable to an object is 10, and **if the slots are full GuardDuty can't tag it**, and instead a "post-scan tag failure" event appears on EventBridge.
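
For checking a single object by hand or in a test, the tag can be read back with the S3 `GetObjectTagging` API. The sketch below separates the pure parsing from the AWS call so the parsing can be exercised without credentials. The function names are hypothetical; the tag key and the 10-tag limit are the ones described above.

```python
from __future__ import annotations

SCAN_TAG_KEY = "GuardDutyMalwareScanStatus"
MAX_OBJECT_TAGS = 10  # S3 per-object tag limit


def scan_status_from_tagset(tag_set: list[dict[str, str]]) -> str | None:
    """Return the scan status from an S3 TagSet, or None if the object is not tagged yet."""
    for tag in tag_set:
        if tag.get("Key") == SCAN_TAG_KEY:
            return tag.get("Value")
    return None


def has_room_for_scan_tag(tag_set: list[dict[str, str]]) -> bool:
    """False when all tag slots are taken by other tags, so the scan tag cannot be added."""
    if scan_status_from_tagset(tag_set) is not None:
        return True
    return len(tag_set) < MAX_OBJECT_TAGS


def fetch_scan_status(bucket: str, key: str, version_id: str | None = None) -> str | None:
    """Read the tag from S3. Needs s3:GetObjectTagging (or the version variant)."""
    import boto3  # imported lazily so the pure helpers above work without AWS access

    kwargs = {"Bucket": bucket, "Key": key}
    if version_id:
        kwargs["VersionId"] = version_id
    response = boto3.client("s3").get_object_tagging(**kwargs)
    return scan_status_from_tagset(response["TagSet"])
```

A `None` return means "not tagged (yet)", which should be treated the same way as any other non-clean state.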

### 4.2 The EventBridge scan-result event (the lead of automated processing)

GuardDuty **always publishes the scan result to the default EventBridge event bus** (both standalone and with GuardDuty). This is the entrance to automated processing. The `detail-type` is **`GuardDuty Malware Protection Object Scan Result`**, and the `source` is `aws.guardduty`.

The `NO_THREATS_FOUND` event (official schema, excerpt):

```json
{
  "detail-type": "GuardDuty Malware Protection Object Scan Result",
  "source": "aws.guardduty",
  "account": "111122223333",
  "region": "us-east-1",
  "resources": ["arn:aws:guardduty:us-east-1:111122223333:malware-protection-plan/b4c7f464ab3a4EXAMPLE"],
  "detail": {
    "schemaVersion": "1.0",
    "scanStatus": "COMPLETED",
    "resourceType": "S3_OBJECT",
    "s3ObjectDetails": {
      "bucketName": "amzn-s3-demo-bucket",
      "objectKey": "uploads/report.pdf",
      "eTag": "ASIAI44QH8DHBEXAMPLE",
      "versionId": "d41d8cd98f00b204e9800998eEXAMPLE",
      "s3Throttled": false
    },
    "scanResultDetails": {
      "scanResultStatus": "NO_THREATS_FOUND",
      "threats": null,
      "statusReasons": null
    }
  }
}
```

For `THREATS_FOUND`, `scanResultDetails.threats` holds the detection name (by default it reports **the first detected one**, and `scanStatus` is `COMPLETED`):

```json
{
  "detail": {
    "scanStatus": "COMPLETED",
    "s3ObjectDetails": { "bucketName": "amzn-s3-demo-bucket", "objectKey": "uploads/evil.bin", "versionId": "..." },
    "scanResultDetails": {
      "scanResultStatus": "THREATS_FOUND",
      "threats": [ { "name": "EICAR-Test-File (not a virus)" } ],
      "statusReasons": null
    }
  }
}
```

When the scan was skipped, it's `scanStatus: "SKIPPED"`, with `scanResultStatus` being `UNSUPPORTED` or `ACCESS_DENIED`, and further `statusReasons` holding the concrete reason (`PASSWORD_PROTECTED`, `SSE_C_ENCRYPTED_OBJECT`, `OBJECT_SIZE_LIMIT_EXCEEDED`, etc.).

> **At-least-once (must read)**: the official docs state clearly — *"GuardDuty uses at-least-once delivery, which means you might receive multiple scan results for the same object. We recommend designing your applications to handle duplicate results."* That is, **the scan result of the same object can arrive multiple times.** With the same thinking as preventing double charges on a payment platform, **the result handler must be idempotent** (Section 5's Lambda builds that out). Note that billing is **once per object** even with duplicates.
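
One common way to absorb duplicates is to derive a key that identifies "this version of this object" and record it before acting. The sketch below uses an in-memory set purely as a stand-in for a durable store; the class and function names are hypothetical, and the choice of key fields is an assumption rather than something the official docs prescribe.

```python
from __future__ import annotations


def dedupe_key(event: dict) -> str:
    """Identity of the scanned object version, independent of the delivery attempt."""
    obj = event["detail"]["s3ObjectDetails"]
    # Prefer versionId; fall back to eTag for buckets without versioning.
    version = obj.get("versionId") or obj.get("eTag") or ""
    return f'{obj["bucketName"]}/{obj["objectKey"]}#{version}'


class InMemoryDedupe:
    """Stand-in for a durable store (for example, a conditional write to a table)."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def first_time(self, key: str) -> bool:
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def process(event: dict, store: InMemoryDedupe, side_effects: list[str]) -> str:
    """Apply the side effect once per object version, however often the event arrives."""
    key = dedupe_key(event)
    if not store.first_time(key):
        return "duplicate-ignored"
    side_effects.append(key)  # placeholder for the real action (copy, alert, ...)
    return "processed"
```

Recording the key before the side effect means a crash in between loses the action, so a production version would either make the action itself repeatable or record completion only after it succeeds.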

### 4.3 A note on the status model: status (scanStatus) and result (scanResultStatus) are different

Let me make it explicit because it's easy to confuse. **`scanStatus`** is "the scan's status" (`COMPLETED` / `SKIPPED` / `FAILED`), and **`scanResultStatus`** is "the result" (the 5 values above). When the status is `COMPLETED`, the result is either `THREATS_FOUND` or `NO_THREATS_FOUND`. **"`SKIPPED` / `FAILED` / `UNSUPPORTED` / `ACCESS_DENIED` do not mean 'safe'"**: the object simply couldn't be scanned. The iron rule of design is **"explicitly allow (allowlist) only `NO_THREATS_FOUND`, and send everything else to the isolation side."** This becomes the core of the next section's plumbing.
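
The allowlist rule fits in a few lines. This sketch is a slightly stricter variant that also requires `scanStatus` to be `COMPLETED`; the function name is hypothetical, and the field names are those of the event excerpts in 4.2.

```python
from __future__ import annotations

CLEAN_STATUS = "NO_THREATS_FOUND"


def disposition(detail: dict) -> str:
    """Return 'promote' only for a completed scan with NO_THREATS_FOUND.

    Missing fields, null sections, and unknown values all fall through to 'quarantine'.
    """
    result_details = detail.get("scanResultDetails") or {}
    if (
        detail.get("scanStatus") == "COMPLETED"
        and result_details.get("scanResultStatus") == CLEAN_STATUS
    ):
        return "promote"
    return "quarantine"


if __name__ == "__main__":
    clean = {"scanStatus": "COMPLETED", "scanResultDetails": {"scanResultStatus": "NO_THREATS_FOUND"}}
    skipped = {"scanStatus": "SKIPPED", "scanResultDetails": {"scanResultStatus": "UNSUPPORTED"}}
    print(disposition(clean), disposition(skipped))
```

Because the function never enumerates the "bad" values, a status value added in the future is quarantined by default.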

---

## 5. A high-value application: a secure upload pipeline (landing → clean / quarantine)

This is the climax of this article. Scanning alone isn't safe. **"Keep an object that might be contaminated in a state downstream absolutely can't read, and flow only those confirmed clean"** — let's build that plumbing.

### 5.1 The big picture: 3 buckets + event-driven promotion

```text
User
  │  Upload (presigned URL, etc.)
  ▼
landing bucket (Malware Protection for S3 enabled, tagging ON)
  │  - Downstream can't read (TBAC: DENY GetObject on objects without the clean tag)
  │  - GuardDuty scans automatically
  ▼
EventBridge (detail-type = "GuardDuty Malware Protection Object Scan Result")
  ▼
Scan-result Lambda (idempotent)
  ├─ NO_THREATS_FOUND → "promote" (copy) to the clean bucket. Downstream reads only here
  ├─ THREATS_FOUND    → isolate to the quarantine bucket + security alert (to a human)
  └─ Anything else (UNSUPPORTED/ACCESS_DENIED/FAILED/SKIPPED) → quarantine + needs-investigation alert
                                                          ("couldn't scan ≠ safe")
Downstream consumers
  └─ Read only the clean bucket (landing is physically unreadable via TBAC)
```

The crux of the design is **a 2-stage defense**:

1. **EventBridge-driven promotion** — move only clean ones to `clean` (downstream sees only `clean`).
2. **TBAC (tag-based access control)** — even if someone tries to read `landing`, **DENY the `GetObject` of objects without the `NO_THREATS_FOUND` tag with the S3 bucket policy.** Because it's sealed at the **storage-layer boundary**, not app logic, a code bug can't break it (the same idea as [making least privilege effective at the data layer](/blog/dynamodb-security-iam-fine-grained-access-control-encryption-vpc-endpoint-guide)).

### 5.2 The TBAC bucket policy: don't let it be read without the clean tag

A policy for the `landing` bucket of **"can't be read unless `NO_THREATS_FOUND`,"** following the official template. Replace the example values (the account ID `555555555555`, `IAM-role-name`, and `amzn-s3-demo-bucket`) with your own.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "NoReadUnlessClean",
      "Effect": "Deny",
      "NotPrincipal": {
        "AWS": [
          "arn:aws:sts::555555555555:assumed-role/IAM-role-name/GuardDutyMalwareProtection",
          "arn:aws:iam::555555555555:role/IAM-role-name"
        ]
      },
      "Action": ["s3:GetObject", "s3:GetObjectVersion"],
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:ExistingObjectTag/GuardDutyMalwareScanStatus": "NO_THREATS_FOUND"
        }
      }
    },
    {
      "Sid": "OnlyGuardDutyCanTagScanStatus",
      "Effect": "Deny",
      "NotPrincipal": {
        "AWS": [
          "arn:aws:sts::555555555555:assumed-role/IAM-role-name/GuardDutyMalwareProtection",
          "arn:aws:iam::555555555555:role/IAM-role-name"
        ]
      },
      "Action": "s3:PutObjectTagging",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "s3:RequestObjectTagKeys": "GuardDutyMalwareScanStatus"
        }
      }
    }
  ]
}
```

Reading this policy:

- **`NoReadUnlessClean`**: **DENY** the read of objects whose `s3:ExistingObjectTag/GuardDutyMalwareScanStatus` is **not** `NO_THREATS_FOUND`. An object that isn't tagged yet (= scan not complete) is naturally not `NO_THREATS_FOUND` either, so it **can't be read.** It guarantees, **at the storage layer**, "no one can read until the scan finishes and it's confirmed clean."
- **Exclude only the GuardDuty role with `NotPrincipal`**: the scan-execution role, together with the `.../GuardDutyMalwareProtection` role session that GuardDuty assumes, is exempted so that it can read and tag objects.
- **`OnlyGuardDutyCanTagScanStatus`**: a DENY making **only GuardDuty able to attach** the `GuardDutyMalwareScanStatus` tag. Without this, someone could **manually attach `NO_THREATS_FOUND`** to a contaminated object and slip past the gate.

> **Additional defense in an organization**: if you use AWS Organizations, enforce company-wide with an **SCP that "the `GuardDutyMalwareScanStatus` tag can't be tampered with"** (the official docs guide you to use the EC2 example replaced with `s3`). **If a tag becomes the basis of trust, preventing tag tampering is the prerequisite.** Skip this and TBAC becomes "an unlocked safe."
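
If you manage several landing buckets, generating the policy from parameters avoids hand-editing ARNs. The following is a sketch that reproduces the two statements from 5.2. The function name is hypothetical, and the optional `extra_read_exempt_arns` parameter is my own addition for a worker that must read non-clean objects (for example, to copy them to quarantine); it is not part of the official template.

```python
from __future__ import annotations

import json

SCAN_TAG_KEY = "GuardDutyMalwareScanStatus"


def build_tbac_policy(
    bucket: str,
    account_id: str,
    gd_role_name: str,
    extra_read_exempt_arns: list[str] | None = None,
) -> dict:
    """Build the landing-bucket TBAC policy with the same two Deny statements as 5.2."""
    objects_arn = f"arn:aws:s3:::{bucket}/*"
    gd_principals = [
        f"arn:aws:sts::{account_id}:assumed-role/{gd_role_name}/GuardDutyMalwareProtection",
        f"arn:aws:iam::{account_id}:role/{gd_role_name}",
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "NoReadUnlessClean",
                "Effect": "Deny",
                # Read exemptions: GuardDuty plus any explicitly listed worker principals.
                "NotPrincipal": {"AWS": gd_principals + list(extra_read_exempt_arns or [])},
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {
                        f"s3:ExistingObjectTag/{SCAN_TAG_KEY}": "NO_THREATS_FOUND"
                    }
                },
            },
            {
                "Sid": "OnlyGuardDutyCanTagScanStatus",
                "Effect": "Deny",
                # Tagging exemption stays GuardDuty-only; workers never set the scan tag.
                "NotPrincipal": {"AWS": gd_principals},
                "Action": "s3:PutObjectTagging",
                "Resource": objects_arn,
                "Condition": {
                    "ForAnyValue:StringEquals": {"s3:RequestObjectTagKeys": SCAN_TAG_KEY}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_tbac_policy("my-upload-landing", "555555555555", "IAM-role-name"), indent=2))
```

Keeping the read exemption and the tagging exemption as separate lists is the point of the design: a worker may need to read a contaminated object, but nothing other than GuardDuty should ever be able to write the tag that the gate trusts.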

### 5.3 The scan-result Lambda (Python, idempotent)

A Lambda that receives the EventBridge scan result and **promotes only `NO_THREATS_FOUND` to `clean`, and isolates everything else to `quarantine`.** Made idempotent on the premise of **at-least-once delivery.**

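```python
"""Lambda that responds to GuardDuty Malware Protection for S3 scan results.

Design principles:
  - Idempotent: EventBridge delivery is at-least-once. Receiving the same object's
    result twice produces one delivery's worth of side effects.
  - Fail closed with an allowlist: promote only on 'NO_THREATS_FOUND'. Quarantine the rest.
    (UNSUPPORTED/ACCESS_DENIED/FAILED/SKIPPED mean 'could not scan', not 'safe'.)
  - Reversible: promote/quarantine by copying, never deleting from landing.
    The original survives a misjudgment.
  - Observable: structured logs. Threat names go into the alert, but object
    contents are never read or emitted.
"""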
from __future__ import annotations

import json
import logging
import os
from typing import Any, Final
from urllib.parse import unquote_plus

import boto3
from botocore.exceptions import ClientError

logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3 = boto3.client("s3")
sns = boto3.client("sns")

CLEAN_BUCKET: Final[str] = os.environ["CLEAN_BUCKET"]
QUARANTINE_BUCKET: Final[str] = os.environ["QUARANTINE_BUCKET"]
ALERT_TOPIC_ARN: Final[str] = os.environ["ALERT_TOPIC_ARN"]

# The only result value that allows promotion. Everything else goes to quarantine (fail-closed).
CLEAN_STATUS: Final[str] = "NO_THREATS_FOUND"


def handler(event: dict[str, Any], _context: object) -> dict[str, str]:
    detail = event["detail"]
    obj = detail["s3ObjectDetails"]
    src_bucket: str = obj["bucketName"]
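    # Object keys in S3 event notifications can be URL-encoded, so decode defensively.
    # Assumption: verify against a real scan-result event. If keys arrive raw, drop this
    # call, because decoding would corrupt keys that contain '+' or '%'.
    key: str = unquote_plus(obj["objectKey"])
    version_id: str | None = obj.get("versionId")
    # `or {}` guards against scanResultDetails being present but null.
    scan_details: dict[str, Any] = detail.get("scanResultDetails") or {}
    result: str = scan_details.get("scanResultStatus") or "FAILED"
    threats = scan_details.get("threats")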

    log = {"bucket": src_bucket, "key": key, "version": version_id, "result": result}

    if result == CLEAN_STATUS:
        dest = CLEAN_BUCKET
        disposition = "promoted"
    else:
        dest = QUARANTINE_BUCKET
        disposition = "quarantined"

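    # ── Promote / quarantine (idempotent) ──
    # The destination key mirrors the source key. Idempotency comes from the
    # 'source-version-id' metadata that _idempotent_copy checks before copying,
    # so a redelivered event for the same version is a no-op.
    dest_key = key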
    moved = _idempotent_copy(src_bucket, key, version_id, dest, dest_key)

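    # Notify a human about threats and unscannable objects (fail-closed confirmation and triage).
    # Note: this runs on every delivery, so a redelivered event can alert more than once.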
    if result != CLEAN_STATUS:
        _alert(src_bucket, key, result, threats, disposition)

    logger.info(json.dumps({**log, "disposition": disposition, "copied": moved}))
    return {"disposition": disposition, "result": result}


def _idempotent_copy(
    src_bucket: str, key: str, version_id: str | None, dest_bucket: str, dest_key: str
) -> bool:
    """Copy from landing to dest (idempotent). No-op if the same version was already copied.

    Idempotency key: write the original versionId as metadata on the destination and
    skip when it matches on re-execution.
    The landing original is never deleted (leaves room to recover from a misjudgment).
    """
    # Check whether it was already copied (same source version: skip the second time).
    try:
        head = s3.head_object(Bucket=dest_bucket, Key=dest_key)
        if head.get("Metadata", {}).get("source-version-id") == (version_id or ""):
            return False  # this version was already processed: no-op
    except ClientError as exc:
        if exc.response["Error"]["Code"] not in ("404", "NoSuchKey"):
            raise  # never swallow unexpected errors

    copy_source: dict[str, str] = {"Bucket": src_bucket, "Key": key}
    if version_id:
        copy_source["VersionId"] = version_id

    s3.copy_object(
        Bucket=dest_bucket,
        Key=dest_key,
        CopySource=copy_source,
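        # Keep the source version as the idempotency key. MetadataDirective=REPLACE
        # ensures it is written. Caveats: REPLACE discards the source object's own
        # metadata (including Content-Type) unless it is re-specified here, and
        # copy_object handles sources up to 5 GB; larger objects need a multipart copy.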
        Metadata={"source-version-id": version_id or ""},
        MetadataDirective="REPLACE",
    )
    return True


def _alert(
    bucket: str, key: str, result: str, threats: list[dict[str, str]] | None, disposition: str
) -> None:
    """Notify the security team about threats and unscannable objects. Never reads or includes contents."""
    threat_names = ", ".join(t.get("name", "?") for t in (threats or [])) or "n/a"
    sns.publish(
        TopicArn=ALERT_TOPIC_ARN,
        Subject=f"[S3 Malware][{result}] {bucket}/{key}",
        Message="\n".join(
            [
                f"bucket: {bucket}",
                f"key: {key}",
                f"scanResultStatus: {result}",
                f"threats: {threat_names}",
                f"disposition: {disposition}",
                "note: anything other than 'NO_THREATS_FOUND' is treated as unsafe and has been quarantined. Triage required.",
            ]
        ),
    )
```

Let me make explicit the design decisions of this code.

- **fail-closed (fall to the safe side)**: what gets promoted is the **single one** of `NO_THREATS_FOUND`. `UNSUPPORTED`, `ACCESS_DENIED`, `FAILED`, and even if an unknown value comes, **all fall to quarantine.** "Treat what couldn't be judged as dangerous" — this is the default of security.
- **Idempotent**: the original `versionId` is written as metadata on the destination, and from the second delivery onward `head_object` detects it and the copy becomes a no-op. **Even if the same result arrives multiple times under at-least-once delivery, the copy happens once.** This is the same shape as making **the same event received twice billed once** on the payment platform.
- **Reversible**: promote / isolate by **copying without deleting** the `landing` original. Even if a misjudgment (false positive) is found later, the original remains, so you can revert.
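- **Least privilege**: this Lambda's execution role is limited to `s3:GetObject*` on `landing`, `s3:PutObject*` plus `s3:GetObject` on `clean`/`quarantine` (the `head_object` check is authorized by `s3:GetObject`), and `sns:Publish`, with `Resource` narrowed to each bucket ARN. Also grant `s3:ListBucket` on the destination buckets; without it, `head_object` on a missing key returns 403 instead of 404 and the handler raises. Because it spans buckets, also explicitly allow this role on the **bucket-policy side** of each bucket. Note that an explicit `Deny` overrides any `Allow`: for the Lambda to copy non-clean objects out of `landing`, its role must also be exempted from the `NoReadUnlessClean` statement in 5.2.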
- **Don't handle contents**: the Lambda **doesn't read the object's bytes** (the copy is S3 server-side `copy_object`). It puts the threat name in the notification, but doesn't emit the file contents to logs or the notification (reconciling observability and security).

> **Wiring the EventBridge rule and the Lambda**: create a rule that picks up `detail-type = "GuardDuty Malware Protection Object Scan Result"`, put the Lambda as the target, and attach a **retry + DLQ** (the same pattern as the `retry_policy` + `dead_letter_config` of the [automated-response article](/blog/aws-guardduty-eventbridge-automated-remediation-incident-response-guide)). A non-empty DLQ = there's an object whose result couldn't be processed = a dangerous silence, so always make it an alert target.
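
As a rough illustration of that wiring in code, the sketch below builds the event pattern and registers the rule with retry and DLQ settings through boto3. The rule name, target ID, and retry values are placeholders of my choosing, and narrowing the pattern to specific bucket names is an extra I added so that results from other protected buckets don't reach this Lambda.

```python
from __future__ import annotations

import json


def build_scan_result_pattern(bucket_names: list[str]) -> dict:
    """Event pattern matching scan results for the given landing buckets only."""
    return {
        "source": ["aws.guardduty"],
        "detail-type": ["GuardDuty Malware Protection Object Scan Result"],
        "detail": {"s3ObjectDetails": {"bucketName": bucket_names}},
    }


def create_rule(rule_name: str, lambda_arn: str, dlq_arn: str, bucket_names: list[str]) -> None:
    """Create the rule on the default bus and attach the Lambda with retry + DLQ.

    The Lambda also needs a resource-based permission allowing events.amazonaws.com
    to invoke it; that step is not shown here.
    """
    import boto3  # imported lazily so the pattern builder works without AWS access

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_scan_result_pattern(bucket_names)),
        State="ENABLED",
    )
    events.put_targets(
        Rule=rule_name,
        Targets=[
            {
                "Id": "scan-result-lambda",
                "Arn": lambda_arn,
                # Illustrative values; tune them to your own tolerance for delay.
                "RetryPolicy": {"MaximumRetryAttempts": 8, "MaximumEventAgeInSeconds": 3600},
                "DeadLetterConfig": {"Arn": dlq_arn},
            }
        ],
    )


if __name__ == "__main__":
    print(json.dumps(build_scan_result_pattern(["my-upload-landing"]), indent=2))
```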

### 5.4 Terraform: attach the Malware Protection plan to the landing bucket

With the `aws_guardduty_malware_protection_plan` resource, protect only the `uploads/` prefix of the `landing` bucket and enable result tagging.

```hcl
# Enable Malware Protection for S3 on the landing bucket.
# role = ARN of the IAM role that GuardDuty assumes to scan and tag objects.
resource "aws_guardduty_malware_protection_plan" "landing" {
  role = aws_iam_role.gd_malware_s3.arn

  protected_resource {
    s3_bucket {
      bucket_name = aws_s3_bucket.landing.id
      # Limit the scan target to the upload entry point (up to 5 prefixes).
      # Narrowing the entry point avoids scan charges for unrelated objects.
      object_prefixes = ["uploads/"]
    }
  }

  # Write the scan result to the object tag (GuardDutyMalwareScanStatus).
  # The TBAC policy in 5.2 depends on this tag, so ENABLED is required.
  actions {
    tagging {
      status = "ENABLED"
    }
  }

  tags = { ManagedBy = "terraform", Purpose = "upload-malware-scan" }
}
```

> **A note on standalone operation**: this `aws_guardduty_malware_protection_plan` can be created **even without `aws_guardduty_detector`**; this is how "standalone mode" is expressed in Terraform. If you want to run it with GuardDuty, enable `aws_guardduty_detector` separately (and the `MALWARE_PROTECTION`-family feature if you like), and detections will also appear as findings. For the IAM role passed to `role`, keep the trust policy and permissions (`s3:GetObject` / `kms:Decrypt` / `s3:PutObjectTagging`, plus those for the EventBridge managed rule) minimal by following the official template, as in 3.2.

---

## 6. Cost: usage-based, the free tier, standalone vs. with GuardDuty

### 6.1 The billing model (verify the latest officially)

Pricing for Malware Protection for S3 is usage-based and **differs** from other protection plans. Once you grasp the model, you can forecast the budget.

| Billing target | Billing unit | In the free tier? |
| --- | --- | --- |
| Scanned data volume | **Per GB** | Up to **1 GB** per month free |
| Objects evaluated | **Per request (object)** | Up to **1,000 requests** per month free |
| S3 object tagging | S3's tagging cost | **Not in the free tier** |
| The S3 APIs GuardDuty hits (GET/PUT etc.) | S3's API cost | (S3-side normal billing) |

The official free tier is **"per account, per Region, up to 1,000 requests + 1 GB of data scanned per month free."** Usage-based billing applies only to the portion above this. Note that **on-demand scanning and tagging are not in the free tier.**

> **Always confirm monetary figures against the latest official values**: in this article, I **deliberately don't state** specific unit prices (USD/GB, USD/1,000 objects, etc.). Because pricing **differs by Region and gets revised**, the right approach is to estimate with **the latest value for your Region** on the [GuardDuty pricing page](https://aws.amazon.com/guardduty/pricing/). Treat even the US East (N. Virginia) figures as **a reference that still needs verification.** Cost optimization is covered in depth in the [GuardDuty cost-optimization article](/blog/aws-guardduty-cost-optimization-pricing-finops-guide).
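
The free-tier arithmetic above can be sketched as follows. This is a rough estimator under stated assumptions: the function and class names are hypothetical, unit prices are parameters you supply from the official pricing page (none are hard-coded), and S3 tagging, S3 API, and on-demand scan charges are out of scope because they are not in the free tier.

```python
from dataclasses import dataclass

# Free tier as quoted above: per account, per Region, per month.
FREE_REQUESTS_PER_MONTH = 1_000
FREE_GB_PER_MONTH = 1.0


@dataclass(frozen=True)
class BillableUsage:
    requests: int  # objects evaluated beyond the free tier
    gb: float      # data scanned beyond the free tier, in GB


def billable_usage(requests: int, gb_scanned: float) -> BillableUsage:
    """Subtract the monthly free tier from automatic-scan usage."""
    return BillableUsage(
        requests=max(0, requests - FREE_REQUESTS_PER_MONTH),
        gb=max(0.0, gb_scanned - FREE_GB_PER_MONTH),
    )


def estimate_cost(
    usage: BillableUsage,
    price_per_1k_requests: float,
    price_per_gb: float,
) -> float:
    """Scan cost only. Both prices must come from the official pricing page
    for your Region; they are inputs here, not known constants."""
    return usage.requests / 1_000 * price_per_1k_requests + usage.gb * price_per_gb


if __name__ == "__main__":
    # Example month: 25,000 uploaded objects totalling 40 GB in one Region.
    usage = billable_usage(requests=25_000, gb_scanned=40.0)
    print(usage)
    # Placeholder prices of 1.0 are NOT real prices; replace before use.
    print(estimate_cost(usage, price_per_1k_requests=1.0, price_per_gb=1.0))
```

Because the free tier applies per account and per Region, run the calculation separately for each account/Region pair rather than on a fleet-wide total.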

### 6.2 The cost implications of standalone vs. with GuardDuty

- **Standalone mode**: you pay only **Malware Protection for S3's usage-based charges.** No cost for GuardDuty itself (foundational detection, other protection plans) is incurred. You can meet a requirement of "just quarantine uploads" at **minimal cost.**
- **With GuardDuty**: on top of Malware Protection for S3's usage-based charges, you also pay for **GuardDuty itself and any other enabled plans.** In return, malware detections **flow into your existing incident response and correlation as findings.**

> **The crux of cost design**: scoping by prefix (just `uploads/`) so that **the scan target is limited to "the entry point where users actually upload"** directly minimizes billing. If you carelessly protect the whole bucket, it also scans temporary objects generated by internal processing, which can **inflate the object-evaluation count.** Keeping billing proportional to the assets you actually need to protect is part of the design work here too.

---

## 7. Summary: a Malware Protection for S3 production cheat sheet

A quick-reference list for when you're unsure.

- **Don't mix up the 2 features**: **S3 Protection** = detects **suspicious access / exfiltration** from CloudTrail's S3 data events (GuardDuty required, feature `S3_DATA_EVENTS`). **Malware Protection for S3** = **malware-scans the contents** of new uploads. "Who did what" vs. "what came in." It is also distinct from **Malware Protection for EC2** (EBS scanning).
- **Can run standalone**: you can enable just Malware Protection for S3 without GuardDuty itself. But **no detector ID = no GuardDuty finding on malware detection.** Results appear only on the **EventBridge default bus + CloudWatch + (optional) tags.** An alert that assumes a finding never fires in standalone mode.
- **Mechanism**: a per-bucket **Malware Protection plan.** Objects are scanned automatically on upload (`PutObject`, etc.). **Own account, same Region only**; even a delegated administrator can't protect a member account's bucket. **Scope by up to 5 prefixes**; KMS is supported (**SSE-C is not**). Scanning existing objects or re-scanning uses **`SendObjectMalwareScan`** (on-demand, not in the free tier).
- **Limits (verify the latest)**: max object **100 GB**, extracted files **100,000**, nesting **100**, protected buckets **25/account/Region**. Objects that exceed the limits come back as `UNSUPPORTED`.
- **Result**: the tag `GuardDutyMalwareScanStatus` = `NO_THREATS_FOUND` / `THREATS_FOUND` / `UNSUPPORTED` / `ACCESS_DENIED` / `FAILED`. Tagging must be **enabled before upload.** The EventBridge event is `detail-type="GuardDuty Malware Protection Object Scan Result"`, delivered **at-least-once → idempotency required.** `scanStatus` (status of the scan) and `scanResultStatus` (the verdict) are different things.
- **The safe pipeline**: landing (protection + tags) → EventBridge → **an idempotent Lambda promotes only `NO_THREATS_FOUND` to clean and quarantines everything else (fail-closed).** Downstream reads only from clean. **Use a TBAC bucket policy to "DENY `GetObject` on objects without the clean tag,"** and forbid tag tampering by anyone but GuardDuty (plus an SCP if you're in an organization).
- **Cost (verify the latest)**: usage-based on **scanned GB + objects evaluated.** **1,000 requests + 1 GB** per month free (on-demand scans and tagging not included). Scoping by prefix is itself the saving. Standalone mode incurs zero GuardDuty cost.

Malware Protection for S3 isn't something you switch on and become safe by default; its value is decided by **"whether you can turn the scan result (especially anything other than `NO_THREATS_FOUND`) into plumbing that's idempotent, reversible, and fail-closed."** The greatest leverage lies less in the detection itself than in the **design of the boundary (EventBridge + TBAC) that physically cuts contamination off from downstream and lets only clean objects through.**

On a multi-account [serverless payment platform](/case-studies/payment-platform-reliability), I **implemented IAM, observability, and DR across a platform handling real money, carbon credits, and regional currencies**, and ensured "correctness" through **the structure of the code and idempotency** rather than operational vigilance. The idea of making **the same event received twice produce a single side effect** in an at-least-once world carries over directly to processing scan results. **I do not claim to have operated Malware Protection for S3 in a specific client project.** But building on that real experience, I can **design, implement, and deliver** this "secure upload pipeline (standalone operation, TBAC gating, idempotent promotion / quarantine)."

**"How do I build malware-scanning quarantine into my company's upload feature: start standalone or put it on GuardDuty itself, how to protect downstream with TBAC, how to minimize cost?" From organizing the requirements through implementing Terraform / Lambda / bucket policies, I can work alongside you quickly and safely as one person × generative AI (Claude Code).** Feel free to get in touch.

---

### Reference (official documentation)

- [GuardDuty Malware Protection for S3](https://docs.aws.amazon.com/guardduty/latest/ug/gdu-malware-protection-s3.html) — the feature overview, the two enablement approaches (with GuardDuty / standalone), the reason no finding is generated in standalone mode, the own-account & same-Region constraint
- [How does Malware Protection for S3 work?](https://docs.aws.amazon.com/guardduty/latest/ug/how-malware-protection-for-s3-gdu-works.html) — the Malware Protection plan, the IAM role, prefixes, KMS decryption, the tag key `GuardDutyMalwareScanStatus`, at-least-once delivery
- [Monitoring S3 object scans in Malware Protection for S3](https://docs.aws.amazon.com/guardduty/latest/ug/monitoring-malware-protection-s3-scans-gdu.html) — the exact values of scanStatus and scanResultStatus, the list of `statusReasons`
- [Monitoring S3 object scans with Amazon EventBridge](https://docs.aws.amazon.com/guardduty/latest/ug/monitor-with-eventbridge-s3-malware-protection.html) — the complete JSON schema of `detail-type="GuardDuty Malware Protection Object Scan Result"`
- [Using tag-based access control (TBAC) with Malware Protection for S3](https://docs.aws.amazon.com/guardduty/latest/ug/tag-based-access-s3-malware-protection.html) — the official template of the S3 bucket policy that DENYs unless `NO_THREATS_FOUND`, tag-tampering prevention
- [On-demand S3 malware scan in GuardDuty](https://docs.aws.amazon.com/guardduty/latest/ug/malware-protection-s3-on-demand.html) — the `SendObjectMalwareScan` API, existing/re-scan, not in the free tier
- [Quotas in Malware Protection for S3](https://docs.aws.amazon.com/guardduty/latest/ug/malware-protection-s3-quotas-guardduty.html) — max object 100 GB, extracted files 100,000, nesting 100, protected buckets 25
- [Pricing and usage cost for Malware Protection for S3](https://docs.aws.amazon.com/guardduty/latest/ug/pricing-malware-protection-for-s3-guardduty.html) — the free tier (1,000 requests + 1 GB/month), tagging / on-demand not in it
- [GuardDuty S3 Protection](https://docs.aws.amazon.com/guardduty/latest/ug/s3-protection.html) — CloudTrail S3 data-event monitoring (different from Malware Protection for S3)
- [Terraform: aws_guardduty_malware_protection_plan](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/guardduty_malware_protection_plan) — `role` / `protected_resource { s3_bucket { bucket_name, object_prefixes } }` / `actions { tagging { status } }`
