# The Complete AWS CloudTrail Guide (2026 Edition): Designing API Activity Auditing, Trails, CloudTrail Lake, Athena Analysis, and Real-Time Detection at Production Quality

> AWS CloudTrail explained faithfully to the official docs. From the four event types (management/data/Insights/network activity) and the difference between event history vs. a trail, to the Terraform initial setup of a multi-region trail, SSE-KMS encryption and log-integrity validation, real-time detection and long-term investigation with EventBridge/CloudWatch/Athena, the current state of CloudTrail Lake (Trino SQL), the pricing pitfalls and cost optimization, and the 13 official security best practices — all with real code.

- Published: 2026-06-27
- Author: 友田 陽大
- Tags: AWS, CloudTrail, Security, Audit Logs, Terraform, Architecture Design
- URL: https://tomodahinata.com/en/blog/aws-cloudtrail-audit-logging-governance-security-guide
- Category: AWS CloudTrail audit & governance

## Key points

- CloudTrail is a ledger that records 'who called which AWS API, when, and from where.' Event history is the past 90 days, management events only, and free — but permanent retention and capturing all event types require a trail
- There are four event types (management / data / Insights / network activity). The first copy of management events is free per region, while data events are billed from the first copy — this is the dividing line of cost and design
- The production initial setup is 'multi-region trail + SSE-KMS encryption + log-integrity validation + least-privilege bucket policy.' Pin it declaratively with Terraform
- Detection is near-real-time via EventBridge→Lambda/SNS, long-term investigation is via Athena (cut scan volume with partition projection), and tamper detection is guaranteed by log-integrity validation (SHA-256/digest)
- CloudTrail Lake stopped accepting new customers as of May 31, 2026 (existing customers can keep using it). For new builds, Athena + S3 is the realistic analysis foundation

---

"**This production setting — who changed it, and when?**" — whether you can answer this question, the first to fly on the night of an incident, instantly changes the time to recovery.

For a system where "money moves," like a payment platform, this is not a matter of good intentions. Whether you can retain, in a tamper-evident form, **when, who, with what permission, from where, and which AWS API was called** is the starting point for everything: compliance audits, fraud investigation, and root-cause analysis of failures. I designed and led the [reliability layer of a serverless (Lambda + DynamoDB) payment platform](/case-studies/payment-platform-reliability) and maintained **zero double-charges in production**; underlying that is the discipline of building, from the start, a state where "correctness can be proven with code and an audit trail." At its core is **AWS CloudTrail**.

This article is an implementation guide for designing and operating CloudTrail at **production quality**. Without ending at "just enable it," we'll assemble it end-to-end — **multi-region trail, encryption, integrity validation, real-time detection, long-term investigation, and cost optimization** — with real Terraform / TypeScript / SQL code.

> **The rules of this article**: specs, pricing, conditions, and feature statuses are all cross-checked against the **AWS official docs (as of June 2026)**. Pricing and managed features in particular are revised fast, so always check the [official pricing page](https://aws.amazon.com/cloudtrail/pricing/) and the latest docs before going to production. Account IDs (`111122223333`), bucket names, and regions are illustrative.

---

## 0. Mental Model: CloudTrail Is the "API Ledger of an AWS Account"

Before starting design, let's fix in one line what CloudTrail is and is not.

> **CloudTrail = a service for governance, compliance, operational auditing, and risk auditing that records operations performed within an AWS account as "events."** Operations via the console, CLI, SDK, or API — wherever they come through — are recorded.

The official definition is exactly this.

> Actions taken by a user, role, or an AWS service are recorded as events in CloudTrail. Events include actions taken in the AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs.

From this come three consequences that matter in the field.

1. **CloudTrail is not application observability.** Where OpenTelemetry looks at "what happened inside the app (traces, metrics, logs)," CloudTrail looks at "**who hit which API on the AWS management and data planes**." The two are complementary, and they're different things (for the app side's three pillars, go to [Observability with OpenTelemetry](/blog/aws-observability-opentelemetry-sre-ecs)).
2. **CloudTrail does not deliver events in order.** As the docs state explicitly, log files are not an ordered stack trace of API calls, and events do not appear in any particular order. You sort by `eventTime` yourself to reconstruct the timeline.
3. **"Enabled" and "usable as evidence" are different.** As shown later, the default **event history is 90 days, management events only**. Permanent retention, tamper detection, and capturing all event types require you to **create a trail yourself**.
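
Consequence 2 is easy to act on in code. The following is a minimal TypeScript sketch, assuming the records have already been parsed out of a log file; the `TrailRecord` type and the `buildTimeline` function are illustrative names of mine, not part of any AWS SDK.

```typescript
// Hypothetical minimal shape: only the fields this sketch needs.
type TrailRecord = {
  eventID: string;
  eventTime: string; // ISO 8601 UTC, e.g. "2026-06-27T02:14:51Z"
  eventName: string;
};

// Returns a new array ordered by eventTime, oldest first.
// eventID is used as a tie-breaker only to keep the output deterministic.
function buildTimeline(records: TrailRecord[]): TrailRecord[] {
  return records.slice().sort((a, b) => {
    const diff = Date.parse(a.eventTime) - Date.parse(b.eventTime);
    return diff !== 0 ? diff : a.eventID.localeCompare(b.eventID);
  });
}

// Records arrive in arbitrary order, as they do in real log files.
const timeline = buildTimeline([
  { eventID: "b", eventTime: "2026-06-27T02:15:10Z", eventName: "StopLogging" },
  { eventID: "a", eventTime: "2026-06-27T02:14:51Z", eventName: "ConsoleLogin" },
]);
```

`eventTime` has one-second resolution, so several events can share a timestamp. The `eventID` tie-breaker makes the output stable but says nothing about the true order within that second.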

---

## 1. The Overall Map: The Four Event Types and "Event History vs. Trail"

The shortest route to understanding CloudTrail is to grasp two axes separately: **"what gets recorded (event type)"** and **"where it accumulates (event history / trail / Lake)."**

### 1-1. The Four Event Types That Get Recorded

The docs define four kinds. **By default, only management events are recorded; data / Insights / network activity are not.**

| Event type | What it records | Default | Billing |
| --- | --- | --- | --- |
| **Management** | Control-plane operations (`RunInstances`, `CreateUser`, `ConsoleLogin`, etc.). Read/write selectable separately | **Recording ON** | The first copy is free per region |
| **Data** | Data-plane operations (S3 object `GetObject`, Lambda `Invoke`, DynamoDB `PutItem`, etc.). **High volume** | OFF | Billed from the first copy |
| **Insights** | **Anomaly detection** of API call rate / error rate. Continuously analyzes both management and data | OFF | Billed per analysis target |
| **Network activity** | API activity via **VPC endpoints**. Helps detect access attempts that use credentials from outside the organization | OFF | Billed |

> Network activity events are a relatively new feature that **went GA in February 2025** (starting with 5 services — S3, EC2, KMS, Secrets Manager, CloudTrail — with the supported services continually expanding). Because you can see "who is using a VPC endpoint, from where," it's effective for detective controls of a data perimeter.

### 1-2. Where It Accumulates: Event History, Trail, CloudTrail Lake

**This is where the most common misconception lives.** "CloudTrail is on by default" is half right and half dangerous.

- **Event history** — automatically on and free from account creation. But the constraints are strong.

  > The Event history provides a viewable, searchable, downloadable, and immutable record of the **past 90 days of management events** in an AWS Region.

  That is, "**the past 90 days, management events only, a single region**." Data events are not included, and it vanishes after 90 days. **It is not a permanent record for use as evidence.**

- **Trail** — the setting that "**continuously delivers events to S3**" (optionally to CloudWatch Logs / EventBridge too). Retention beyond 90 days, data events, integrity validation, and all-region aggregation all presuppose a trail. **This is what to do first in production.**

- **CloudTrail Lake** — a managed audit data lake you can query with **Trino SQL**. As discussed later, it **stopped accepting new customers as of May 31, 2026** (existing customers can keep using it). For a new build, **Athena + S3** becomes the realistic option.

> **The starting point of design**: don't rely on event history. **Create one multi-region trail** and deliver it permanently to S3. This is the foundation. In the next chapter we'll pin down an "unbreakable initial setup" with Terraform.

---

## 2. The First Step: A Production-Quality Initial Setup of a Multi-Region Trail with Terraform

Creating a trail takes moments, but to make it **production quality** you need to satisfy the following five requirements from the start. These map directly to the official security best practices (the full list is organized in §8).

1. **Multi-region** (don't drop events from any region)
2. **SSE-KMS encryption** (confidentiality at rest)
3. **Log-file integrity validation** (detect tampering / deletion)
4. **Least-privilege S3 bucket policy** (restrict to the trail with `aws:SourceArn`)
5. **CloudWatch Logs integration** (the foundation of real-time monitoring)

> There's a trap where **the default differs between the console and CLI/API**. The docs state plainly "**All trails created using the CloudTrail console are multi-Region trails**," while **creating via CLI/API or Terraform defaults to single-region**. So in IaC you must **explicitly** set `is_multi_region_trail = true`.
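
Most of the five requirements can be checked mechanically against what the `DescribeTrails` API returns. The sketch below is a minimal illustration under that assumption: the field names follow the `DescribeTrails` response, while the `TrailDescription` subset type and the `findTrailGaps` function are hypothetical helpers of mine. The bucket policy (requirement 4) is not visible in this response and needs a separate check.

```typescript
// Subset of the fields returned for each trail by the DescribeTrails API.
type TrailDescription = {
  Name: string;
  IsMultiRegionTrail?: boolean;
  LogFileValidationEnabled?: boolean;
  KmsKeyId?: string;
  CloudWatchLogsLogGroupArn?: string;
};

// Returns one finding per requirement that is not met.
function findTrailGaps(trail: TrailDescription): string[] {
  const gaps: string[] = [];
  if (trail.IsMultiRegionTrail !== true) gaps.push("not multi-region");
  if (!trail.KmsKeyId) gaps.push("no SSE-KMS key");
  if (trail.LogFileValidationEnabled !== true) gaps.push("log-file validation off");
  if (!trail.CloudWatchLogsLogGroupArn) gaps.push("no CloudWatch Logs integration");
  return gaps;
}

// A trail created via CLI/API with defaults: single-region, nothing else configured.
const defaultCliTrail: TrailDescription = { Name: "quick-test", IsMultiRegionTrail: false };
const gaps = findTrailGaps(defaultCliTrail);

// A trail shaped like the one built in this chapter (ARNs are illustrative).
const hardenedTrail: TrailDescription = {
  Name: "org-audit-trail",
  IsMultiRegionTrail: true,
  LogFileValidationEnabled: true,
  KmsKeyId: "arn:aws:kms:us-east-1:111122223333:key/example",
  CloudWatchLogsLogGroupArn: "arn:aws:logs:us-east-1:111122223333:log-group:/aws/cloudtrail/org-audit:*",
};
const noGaps = findTrailGaps(hardenedTrail);
```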

### 2-1. The Delivery S3 Bucket (Versioning, Public Blocking, Encryption)

```hcl
# Dedicated audit-log bucket. Ideally it is isolated in a separate log-archive account (§8).
resource "aws_s3_bucket" "trail" {
  bucket = "prod-audit-trail-111122223333"
}

# Versioning is mandatory to guard against tampering and accidental deletion
resource "aws_s3_bucket_versioning" "trail" {
  bucket = aws_s3_bucket.trail.id
  versioning_configuration { status = "Enabled" }
}

# Audit logs must never become public
resource "aws_s3_bucket_public_access_block" "trail" {
  bucket                  = aws_s3_bucket.trail.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

# Default encryption on the S3 side (CloudTrail also encrypts with KMS; this is defense in depth)
resource "aws_s3_bucket_server_side_encryption_configuration" "trail" {
  bucket = aws_s3_bucket.trail.id
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = "aws:kms"
      kms_master_key_id = aws_kms_key.trail.arn
    }
    bucket_key_enabled = true # S3 Bucket Key reduces KMS request volume and cost
  }
}
```

### 2-2. The Least-Privilege Bucket Policy (Constrain the Trail with `aws:SourceArn`)

For CloudTrail to write to the bucket, it needs two permissions: an ACL check and the write. The best practice is to **restrict to "only writes from this trail" with the `aws:SourceArn` condition** (preventing the confused-deputy problem).

```hcl
data "aws_caller_identity" "current" {}

locals {
  trail_arn = "arn:aws:cloudtrail:us-east-1:${data.aws_caller_identity.current.account_id}:trail/org-audit-trail"
}

data "aws_iam_policy_document" "trail_bucket" {
  # ① Allow CloudTrail to check the bucket ACL
  statement {
    sid       = "AWSCloudTrailAclCheck"
    actions   = ["s3:GetBucketAcl"]
    resources = [aws_s3_bucket.trail.arn]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "aws:SourceArn"
      values   = [local.trail_arn]
    }
  }

  # ② Allow log object writes (bucket-owner-full-control is required)
  statement {
    sid       = "AWSCloudTrailWrite"
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.trail.arn}/AWSLogs/${data.aws_caller_identity.current.account_id}/*"]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "s3:x-amz-acl"
      values   = ["bucket-owner-full-control"]
    }
    condition {
      test     = "StringEquals"
      variable = "aws:SourceArn"
      values   = [local.trail_arn]
    }
  }
}

resource "aws_s3_bucket_policy" "trail" {
  bucket = aws_s3_bucket.trail.id
  policy = data.aws_iam_policy_document.trail_bucket.json
}
```

> When making it an organization (AWS Organizations) trail, `②`'s resource path becomes the **organization ID path** (`AWSLogs/o-xxxxxxxxxx/<account>/...`) rather than the account ID. This is easy to forget when using `is_organization_trail = true`.

### 2-3. The KMS Key (Allow CloudTrail to Encrypt)

For SSE-KMS, **allow `cloudtrail.amazonaws.com` to encrypt in the key policy**. With the double condition of `kms:EncryptionContext` and `aws:SourceArn`, don't let unrelated services use the key.

```hcl
data "aws_iam_policy_document" "trail_kms" {
  # Account administrators (key management permissions)
  statement {
    sid       = "EnableRoot"
    actions   = ["kms:*"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:root"]
    }
  }

  # Allow CloudTrail to generate data keys (restricted by trail ARN and encryption context)
  statement {
    sid       = "AllowCloudTrailEncrypt"
    actions   = ["kms:GenerateDataKey*"]
    resources = ["*"]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "aws:SourceArn"
      values   = [local.trail_arn]
    }
    condition {
      test     = "StringLike"
      variable = "kms:EncryptionContext:aws:cloudtrail:arn"
      values   = ["arn:aws:cloudtrail:*:${data.aws_caller_identity.current.account_id}:trail/*"]
    }
  }

  # Let CloudTrail read the key's properties (the CloudTrail KMS key policy docs list this as required)
  statement {
    sid       = "AllowCloudTrailDescribeKey"
    actions   = ["kms:DescribeKey"]
    resources = ["*"]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
  }

  # Let log readers decrypt with KMS (least privilege)
  statement {
    sid       = "AllowDecryptForReaders"
    actions   = ["kms:Decrypt", "kms:DescribeKey"]
    resources = ["*"]
    principals {
      type        = "AWS"
      identifiers = ["arn:aws:iam::${data.aws_caller_identity.current.account_id}:role/SecurityAuditor"]
    }
  }
}

resource "aws_kms_key" "trail" {
  description             = "CloudTrail log encryption key"
  enable_key_rotation     = true # automatic annual rotation
  deletion_window_in_days = 30
  policy                  = data.aws_iam_policy_document.trail_kms.json
}
```

### 2-4. The Trail Itself (Multi-Region, Integrity Validation, CloudWatch Logs Integration)

```hcl
resource "aws_cloudwatch_log_group" "trail" {
  name              = "/aws/cloudtrail/org-audit"
  retention_in_days = 365 # Retention on the CloudWatch Logs side (managed separately from S3)
}

# Role CloudTrail uses to write to CloudWatch Logs (least privilege; shown in abbreviated form)
resource "aws_iam_role" "cloudtrail_cw" {
  name = "CloudTrail_CloudWatchLogs_Role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "cloudtrail.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "cloudtrail_cw" {
  role = aws_iam_role.cloudtrail_cw.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["logs:CreateLogStream", "logs:PutLogEvents"]
      Resource = "${aws_cloudwatch_log_group.trail.arn}:*"
    }]
  })
}

resource "aws_cloudtrail" "org_audit" {
  name           = "org-audit-trail"
  s3_bucket_name = aws_s3_bucket.trail.id
  kms_key_id     = aws_kms_key.trail.arn

  is_multi_region_trail         = true # CLI/Terraform default is single-region. Always set this explicitly!
  include_global_service_events = true # Also capture events from global services such as IAM
  enable_log_file_validation    = true # Turn on integrity validation (digest generation)
  enable_logging                = true

  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.trail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.cloudtrail_cw.arn

  # is_organization_trail = true # Apply to all member accounts from the AWS Organizations management account

  # Explicit dependency: trail creation fails unless the bucket policy already exists
  depends_on = [aws_s3_bucket_policy.trail]
}
```

With this, we have the foundation to "**deliver management events from all regions permanently to S3 and CloudWatch Logs, with encryption + tamper detection**." Per the docs, delivery from the trail to S3 takes **about 5 minutes on average** (not a guaranteed value).
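
Once delivery starts, it helps to recognize what lands in the bucket. The sketch below parses a log-file object key into its parts. It assumes no custom S3 key prefix and a non-organization trail; `LOG_KEY`, `LogKeyInfo`, and `parseLogKey` are illustrative names of mine, and the sample key (including its trailing unique string) is made up.

```typescript
// Layout assumed here (no custom key prefix, not an organization trail):
// AWSLogs/<account>/CloudTrail/<region>/YYYY/MM/DD/
//   <account>_CloudTrail_<region>_YYYYMMDDTHHmmZ_<unique>.json.gz
const LOG_KEY =
  /^AWSLogs\/(\d{12})\/CloudTrail\/([a-z0-9-]+)\/(\d{4})\/(\d{2})\/(\d{2})\/\d{12}_CloudTrail_[a-z0-9-]+_(\d{8}T\d{4}Z)_[A-Za-z0-9]+\.json\.gz$/;

type LogKeyInfo = { account: string; region: string; date: string; fileTime: string };

// Returns the parsed parts, or null when the key is not a CloudTrail log file
// (for example a digest file, which lives under CloudTrail-Digest/).
function parseLogKey(key: string): LogKeyInfo | null {
  const m = LOG_KEY.exec(key);
  if (m === null) return null;
  return { account: m[1], region: m[2], date: `${m[3]}/${m[4]}/${m[5]}`, fileTime: m[6] };
}

const info = parseLogKey(
  "AWSLogs/111122223333/CloudTrail/us-east-1/2026/06/27/" +
    "111122223333_CloudTrail_us-east-1_20260627T0215Z_AbCdEf123456.json.gz",
);
const notALogFile = parseLogKey("AWSLogs/111122223333/CloudTrail-Digest/us-east-1/2026/06/27/x.json.gz");
```

The `date` value uses the same `yyyy/MM/dd` shape as the S3 path, which is also the format the Athena partition projection in §4.3 relies on.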

---

## 3. How to Read a Log: The Record JSON and `userIdentity` (the Starting Point of Forensics)

Both detection and investigation come down, in the end, to **whether you can correctly read a single event's JSON**. Below is an example of an event that should never happen — "**the root user logged into the console without MFA, from an unfamiliar IP**" (the current format is `eventVersion` 1.11).

```json
{
  "eventVersion": "1.11",
  "userIdentity": {
    "type": "Root",
    "principalId": "111122223333",
    "arn": "arn:aws:iam::111122223333:root",
    "accountId": "111122223333"
  },
  "eventTime": "2026-06-27T02:14:51Z",
  "eventSource": "signin.amazonaws.com",
  "eventName": "ConsoleLogin",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.42",
  "userAgent": "Mozilla/5.0 ...",
  "requestParameters": null,
  "responseElements": { "ConsoleLogin": "Success" },
  "additionalEventData": { "MFAUsed": "No" },
  "eventID": "8a9b0c1d-2e3f-4a5b-6c7d-8e9f0a1b2c3d",
  "eventType": "AwsConsoleSignIn",
  "recipientAccountId": "111122223333"
}
```

Nail down the fields to read, along with the official definitions.

| Field | Meaning (per official) | Use in investigation |
| --- | --- | --- |
| `userIdentity` | "**Who**" called it. IAM identity information | The most important field; broken down in the next table |
| `eventSource` / `eventName` | "**Which service's, which operation**" (`iam.amazonaws.com` / `CreateUser`, etc.) | Identifying the operation; the axis for filters |
| `eventTime` | Request completion time (UTC) | Reconstructing the timeline (sort key) |
| `sourceIPAddress` | Request source IP (AWS-internal shows `AWS Internal`) | Identifying suspicious sources |
| `errorCode` / `errorMessage` | The code and description on error (`AccessDenied`, etc.) | Signs of attack / insufficient permission |
| `readOnly` | Whether it's a read-only operation (true/false) | Extract only "change-type" operations |
| `eventCategory` | `Management` / `Data` / `NetworkActivity` / `Insight` | Sorting by type |
| `recipientAccountId` | The account that received this event | Detecting cross-account operations |
| `tlsDetails` | TLS version, cipher suite, FQDN | Inventorying old TLS connections |
| `sessionCredentialFromConsole` | Whether it's from a console session (shown only when true) | Distinguishing human vs. automation |
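
To make the field table concrete, here is a minimal sketch that flags the sample event above: a successful root console login without MFA. The `SignInRecord` type and `isRootLoginWithoutMfa` function are hypothetical names; the field paths and values are the ones shown in the sample JSON.

```typescript
// Only the fields this check reads; everything else in the record is ignored.
type SignInRecord = {
  eventName: string;
  userIdentity: { type: string };
  responseElements?: { ConsoleLogin?: string } | null;
  additionalEventData?: { MFAUsed?: string } | null;
};

// True only for a successful root ConsoleLogin where MFA was not used.
function isRootLoginWithoutMfa(r: SignInRecord): boolean {
  return (
    r.eventName === "ConsoleLogin" &&
    r.userIdentity.type === "Root" &&
    r.responseElements?.ConsoleLogin === "Success" &&
    r.additionalEventData?.MFAUsed === "No"
  );
}

const sample: SignInRecord = {
  eventName: "ConsoleLogin",
  userIdentity: { type: "Root" },
  responseElements: { ConsoleLogin: "Success" },
  additionalEventData: { MFAUsed: "No" },
};
const flagged = isRootLoginWithoutMfa(sample);
const withMfa = isRootLoginWithoutMfa({ ...sample, additionalEventData: { MFAUsed: "Yes" } });
```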

### `userIdentity.type`: Memorize the Correct Spelling

The value of `type`, which represents "who," is the key for investigation queries. The current official values, spelled exactly:

| type | What it is |
| --- | --- |
| `Root` | The root user. **Must not appear in normal operation** |
| `IAMUser` | An IAM user |
| `AssumedRole` | A session that assumed a role (carries `sessionContext`) |
| `Role` | A service role, etc. |
| `FederatedUser` | STS federation |
| `AWSService` | An AWS service acting on behalf |
| `AWSAccount` | Another account |
| `IdentityCenterUser` | An IAM Identity Center user (**not `IAMIdentityCenter`**) |
| `SAMLUser` / `WebIdentityUser` | SAML / Web identity federation |

When it's `AssumedRole`, `userIdentity.sessionContext.sessionIssuer` (which role issued the session) and `sessionContext.attributes.mfaAuthenticated` (whether MFA was used) are decisive. This is where you find out **whose role was assumed, and whether the session was MFA-authenticated**.
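
A small helper makes that lookup explicit. This is a sketch under the assumption that `mfaAuthenticated` arrives as the string `"true"` or `"false"`; `UserIdentity`, `Actor`, and `describeActor` are illustrative names, and the ARNs in the example are made up.

```typescript
// Subset of userIdentity that this sketch reads.
type UserIdentity = {
  type: string;
  arn?: string;
  sessionContext?: {
    sessionIssuer?: { arn?: string };
    attributes?: { mfaAuthenticated?: string }; // "true" / "false" as strings
  };
};

type Actor = { who: string; viaRole?: string; mfa: boolean | "unknown" };

// For AssumedRole, report the issuing role and the MFA flag; otherwise fall back to the ARN.
function describeActor(id: UserIdentity): Actor {
  if (id.type !== "AssumedRole") {
    return { who: id.arn ?? id.type, mfa: "unknown" };
  }
  const flag = id.sessionContext?.attributes?.mfaAuthenticated;
  return {
    who: id.arn ?? "unknown-session",
    viaRole: id.sessionContext?.sessionIssuer?.arn,
    mfa: flag === undefined ? "unknown" : flag === "true",
  };
}

const actor = describeActor({
  type: "AssumedRole",
  arn: "arn:aws:sts::111122223333:assumed-role/Admin/alice",
  sessionContext: {
    sessionIssuer: { arn: "arn:aws:iam::111122223333:role/Admin" },
    attributes: { mfaAuthenticated: "false" },
  },
});
```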

---

## 4. Battle by Scenario: Detection, Monitoring, Investigation, Data Events

Once the foundation is built, turn CloudTrail from "**just sitting there**" into "**a working audit foundation**." We implement four patterns, one per use case.

### 4.1 Real-Time Detection: EventBridge → Lambda/SNS

**The highest-value detection is the moment an attacker tries to stop the trail itself.** Disabling the trail to destroy evidence is often one of the first moves after an intrusion. Detect it immediately by watching for `StopLogging` / `DeleteTrail` / `UpdateTrail` / `PutEventSelectors`.

EventBridge can react in **near real time** to API calls CloudTrail recorded (the detail-type is `AWS API Call via CloudTrail`; console sign-in is `AWS Console Sign In via CloudTrail`).

```hcl
resource "aws_cloudwatch_event_rule" "trail_tampering" {
  name        = "detect-cloudtrail-tampering"
  description = "Immediately detect stopping, deletion, or modification of a CloudTrail trail"
  event_pattern = jsonencode({
    "detail-type" = ["AWS API Call via CloudTrail"]
    detail = {
      eventSource = ["cloudtrail.amazonaws.com"]
      eventName   = ["StopLogging", "DeleteTrail", "UpdateTrail", "PutEventSelectors"]
    }
  })
}

resource "aws_cloudwatch_event_target" "to_lambda" {
  rule = aws_cloudwatch_event_rule.trail_tampering.name
  arn  = aws_lambda_function.audit_alert.arn
}

resource "aws_lambda_permission" "allow_eventbridge" {
  statement_id  = "AllowExecutionFromEventBridge"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.audit_alert.function_name
  principal     = "events.amazonaws.com"
  source_arn    = aws_cloudwatch_event_rule.trail_tampering.arn
}
```

If you want to detect root login, just swap the pattern.

```json
{
  "detail-type": ["AWS Console Sign In via CloudTrail"],
  "detail": { "userIdentity": { "type": ["Root"] }, "eventName": ["ConsoleLogin"] }
}
```
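
If the pattern syntax feels opaque, the matching rule behind both patterns above can be modeled in a few lines. This is a deliberately simplified sketch of exact-value matching only (values in an array are alternatives, every listed field must match); it is not EventBridge's implementation and ignores operators such as prefix or numeric matching. `Pattern` and `matches` are names of mine.

```typescript
// Simplified pattern: leaf arrays list accepted values, nested objects descend.
type Pattern = { [key: string]: Pattern | Array<string | number | boolean> };

function matches(pattern: Pattern, event: unknown): boolean {
  if (typeof event !== "object" || event === null) return false;
  const obj = event as Record<string, unknown>;
  return Object.keys(pattern).every((key) => {
    const expected = pattern[key];
    const actual = obj[key];
    if (Array.isArray(expected)) {
      // OR within a field: any listed value may match.
      return expected.some((v) => v === actual);
    }
    // AND across fields: every nested key must match as well.
    return matches(expected, actual);
  });
}

const rootLoginPattern: Pattern = {
  "detail-type": ["AWS Console Sign In via CloudTrail"],
  detail: { userIdentity: { type: ["Root"] }, eventName: ["ConsoleLogin"] },
};

const rootEvent = {
  "detail-type": "AWS Console Sign In via CloudTrail",
  detail: { eventName: "ConsoleLogin", userIdentity: { type: "Root" }, awsRegion: "us-east-1" },
};
const iamUserEvent = {
  "detail-type": "AWS Console Sign In via CloudTrail",
  detail: { eventName: "ConsoleLogin", userIdentity: { type: "IAMUser" } },
};

const rootMatches = matches(rootLoginPattern, rootEvent);
const iamUserMatches = matches(rootLoginPattern, iamUserEvent);
```

Fields that the pattern does not mention (such as `awsRegion` here) are ignored, which is why a narrow pattern still matches a full CloudTrail record.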

The receiving Lambda applies the principle "**don't trust even AWS-originated events at the boundary**": it strictly validates only the fields it uses with Zod, then formats and sends the notification. It uses `eventID` (a GUID unique per event) as the dedup key to be resilient to re-delivery.

```ts
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";
import { z } from "zod";
import type { EventBridgeEvent } from "aws-lambda";

const sns = new SNSClient({});
const TOPIC_ARN = process.env.ALERT_TOPIC_ARN;
if (!TOPIC_ARN) throw new Error("ALERT_TOPIC_ARN is not set"); // fail fast at cold start
// FIFO topic names end in ".fifo"; only FIFO topics accept the dedup parameters below
const IS_FIFO = TOPIC_ARN.endsWith(".fifo");

// Validate at the boundary only the CloudTrail record fields used for the notification
const CloudTrailDetail = z.object({
  eventID: z.string().uuid(),
  eventName: z.string(),
  eventSource: z.string(),
  awsRegion: z.string(),
  sourceIPAddress: z.string().optional(),
  errorCode: z.string().optional(),
  userIdentity: z.object({
    type: z.string(),
    arn: z.string().optional(),
  }),
});

export const handler = async (
  event: EventBridgeEvent<"AWS API Call via CloudTrail", unknown>,
): Promise<void> => {
  const detail = CloudTrailDetail.parse(event.detail); // throws immediately on an unexpected shape

  const actor = detail.userIdentity.arn ?? detail.userIdentity.type;
  const subject = `🚨 [AUDIT] ${detail.eventName} by ${detail.userIdentity.type}`;
  const message = [
    `Operation: ${detail.eventName} (${detail.eventSource})`,
    `Actor: ${actor}`,
    `Region: ${detail.awsRegion}`,
    `Source IP: ${detail.sourceIPAddress ?? "unknown"}`,
    detail.errorCode ? `Result: failed (${detail.errorCode})` : "Result: success",
    `eventID: ${detail.eventID}`,
  ].join("\n");

  await sns.send(
    new PublishCommand({
      TopicArn: TOPIC_ARN,
      Subject: subject.slice(0, 99), // SNS requires Subject to be under 100 characters
      Message: message,
      // FIFO only: a standard topic rejects these parameters, and a FIFO topic
      // also requires MessageGroupId ("cloudtrail-alerts" is an arbitrary group name).
      ...(IS_FIFO
        ? { MessageGroupId: "cloudtrail-alerts", MessageDeduplicationId: detail.eventID }
        : {}),
    }),
  );
};
```

> **The premise of detection**: to catch this reliably with EventBridge, you need **a trail logging in that region** (detecting data events in particular requires a trail). That's why we created the "multi-region trail" in §2 first. The official EventBridge tutorial also begins Step 1 with creating a trail.

### 4.2 CloudWatch Logs Metric Filters + Alarms

If you want to alert not on individual events but on "**how many times something happened in a given period**," CloudWatch Logs metric filters + alarms are the right fit. This presupposes that the trail is delivered to CloudWatch Logs (done in §2-4).

The official docs **explicitly give three filters as examples**: "security group change," "console sign-in failure," and "IAM policy change." Below is the IAM-policy-change example (metric name `IAMPolicyEventCount`; the alarm fires on even a single occurrence within 5 minutes).

```hcl
resource "aws_cloudwatch_log_metric_filter" "iam_policy_changes" {
  name           = "IAMPolicyChanges"
  log_group_name = aws_cloudwatch_log_group.trail.name
  pattern        = "{ ($.eventName = DeleteGroupPolicy) || ($.eventName = DeleteRolePolicy) || ($.eventName = DeleteUserPolicy) || ($.eventName = PutGroupPolicy) || ($.eventName = PutRolePolicy) || ($.eventName = PutUserPolicy) || ($.eventName = CreatePolicy) || ($.eventName = DeletePolicy) || ($.eventName = CreatePolicyVersion) || ($.eventName = DeletePolicyVersion) || ($.eventName = AttachRolePolicy) || ($.eventName = DetachRolePolicy) || ($.eventName = AttachUserPolicy) || ($.eventName = DetachUserPolicy) || ($.eventName = AttachGroupPolicy) || ($.eventName = DetachGroupPolicy) }"

  metric_transformation {
    name      = "IAMPolicyEventCount"
    namespace = "CloudTrailMetrics"
    value     = "1"
  }
}

resource "aws_cloudwatch_metric_alarm" "iam_policy_changes" {
  alarm_name          = "IAMPolicyChanges"
  namespace           = "CloudTrailMetrics"
  metric_name         = "IAMPolicyEventCount"
  comparison_operator = "GreaterThanOrEqualToThreshold"
  threshold           = 1
  evaluation_periods  = 1
  period              = 300
  statistic           = "Sum"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```
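
To see what `period = 300`, `statistic = "Sum"`, and `threshold = 1` mean together, here is a rough model of the evaluation: count matching events per fixed 5-minute window and report the windows at or above the threshold. It is a sketch for intuition only. Real alarm evaluation has more nuance (missing-data handling, evaluation timing), and `breachingPeriods` is a name of mine.

```typescript
// Buckets event timestamps into fixed periods and returns the start time (ISO 8601)
// of every period whose event count is >= threshold.
function breachingPeriods(eventTimes: string[], periodSec = 300, threshold = 1): string[] {
  const counts: Record<number, number> = {};
  for (const t of eventTimes) {
    const bucket = Math.floor(Date.parse(t) / 1000 / periodSec) * periodSec;
    counts[bucket] = (counts[bucket] ?? 0) + 1;
  }
  return Object.keys(counts)
    .map(Number)
    .filter((b) => counts[b] >= threshold)
    .sort((a, b) => a - b)
    .map((b) => new Date(b * 1000).toISOString());
}

const policyChangeTimes = [
  "2026-06-27T02:11:00Z",
  "2026-06-27T02:14:51Z",
  "2026-06-27T02:31:00Z",
];

// With the settings above (threshold 1), every window containing an event breaches.
const hits = breachingPeriods(policyChangeTimes);
// A stricter, hypothetical threshold of 2 leaves only the window with two events.
const strictHits = breachingPeriods(policyChangeTimes, 300, 2);
```

Raising the threshold is the lever for noisy signals such as failed sign-ins, where a single occurrence is normal but a burst is not.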

> **An honest note**: the staple filter sets you often see online ("root account usage," "unauthorized API calls," "sign-in without MFA," "NACL changes," etc.) are **not on CloudTrail's official page for this topic**. Their source is the **CIS AWS Foundations Benchmark** or Security Hub controls. They're well worth implementing (I add them too), but keeping "the three the official docs exemplify" separate from "those you assemble yourself from a benchmark" is part of being a trustworthy designer. Combine CloudTrail with GuardDuty / Security Hub and many of these can be detected as managed findings (§8).

### 4.3 Investigate Beyond 90 Days with Athena

The standard route for incident investigation is to "**query the raw logs accumulated in S3 whenever you need them**." You can auto-create an Athena table from the CloudTrail console, but in production, apply **partition projection** to cut scan volume, and with it both cost and execution time.

```sql
CREATE EXTERNAL TABLE cloudtrail_logs (
  eventVersion STRING,
  userIdentity STRUCT<
    type: STRING, principalId: STRING, arn: STRING, accountId: STRING, userName: STRING,
    sessionContext: STRUCT<attributes: STRUCT<mfaAuthenticated: STRING, creationDate: STRING>>
  >,
  eventTime STRING, eventSource STRING, eventName STRING, awsRegion STRING,
  sourceIPAddress STRING, userAgent STRING, errorCode STRING, errorMessage STRING,
  requestParameters STRING, responseElements STRING, eventID STRING,
  readOnly BOOLEAN, eventType STRING, recipientAccountId STRING
)
PARTITIONED BY (`account` STRING, `region` STRING, `date` STRING)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://prod-audit-trail-111122223333/AWSLogs/111122223333/CloudTrail/'
TBLPROPERTIES (
  'projection.enabled' = 'true',
  'projection.account.type' = 'enum',
  'projection.account.values' = '111122223333',
  'projection.region.type' = 'enum',
  'projection.region.values' = 'us-east-1,ap-northeast-1',
  'projection.date.type' = 'date',
  'projection.date.range' = '2024/01/01,NOW',
  'projection.date.format' = 'yyyy/MM/dd',
  'projection.date.interval' = '1',
  'projection.date.interval.unit' = 'DAYS',
  'storage.location.template' =
    's3://prod-audit-trail-111122223333/AWSLogs/${account}/CloudTrail/${region}/${date}'
);
```

Three queries often used in investigation:

```sql
-- ① "Who opened (or closed) this security group?" (last week, narrowed by partition)
SELECT eventTime, userIdentity.arn AS who, sourceIPAddress, eventName, requestParameters
FROM cloudtrail_logs
WHERE region = 'ap-northeast-1' AND date >= '2026/06/20'
  AND eventSource = 'ec2.amazonaws.com'
  AND eventName IN ('AuthorizeSecurityGroupIngress', 'RevokeSecurityGroupIngress')
ORDER BY eventTime DESC;

-- ② Principals with the most AccessDenied errors (a sign of an attack or a permission-design mistake)
SELECT userIdentity.arn AS who, count(*) AS denied
FROM cloudtrail_logs
WHERE date >= '2026/06/01' AND errorCode = 'AccessDenied'
GROUP BY 1 ORDER BY denied DESC LIMIT 20;

-- ③ Root account usage (should be zero in production as a rule)
SELECT eventTime, eventName, sourceIPAddress
FROM cloudtrail_logs
WHERE date >= '2026/06/01' AND userIdentity.type = 'Root'
ORDER BY eventTime DESC;
```

> **The key to cost is partitions**. Athena is **billed by the amount of data scanned**. Always put the projection columns `account` / `region` / `date` in the `WHERE` clause so the query doesn't read irrelevant partitions. On a large log bucket, this alone can shrink a scan from tens of GB to hundreds of MB.
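
One way to make the partition filter hard to forget is to generate it. The helper below is a hypothetical sketch that builds the predicate for the table defined above. It relies on the `yyyy/MM/dd` partition format, which is zero-padded and therefore compares correctly as a string; `partitionDate` and `partitionPredicate` are names of mine.

```typescript
// Formats a Date as the projected partition value in UTC, e.g. "2026/06/20".
function partitionDate(d: Date): string {
  const y = d.getUTCFullYear();
  const m = String(d.getUTCMonth() + 1).padStart(2, "0");
  const day = String(d.getUTCDate()).padStart(2, "0");
  return `${y}/${m}/${day}`;
}

// Builds the partition part of a WHERE clause. Inputs are validated because
// they are interpolated into SQL text.
function partitionPredicate(account: string, region: string, since: Date): string {
  if (!/^\d{12}$/.test(account)) throw new Error("account must be 12 digits");
  if (!/^[a-z0-9-]+$/.test(region)) throw new Error("unexpected region format");
  return `account = '${account}' AND region = '${region}' AND date >= '${partitionDate(since)}'`;
}

const predicate = partitionPredicate(
  "111122223333",
  "ap-northeast-1",
  new Date(Date.UTC(2026, 5, 20)), // months are 0-based: 5 = June
);
```

The resulting string is meant to be combined with the event-level conditions (`eventSource`, `eventName`, and so on) from the queries above.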

### 4.4 Narrow Data Events Surgically with "Advanced Event Selectors"

Data events (S3 objects, Lambda Invoke, DynamoDB item operations) are **high volume and billed from the first copy**, so turning everything on is a billing accident waiting to happen. With **advanced event selectors**, narrow recording surgically to **only the audited resources, and only writes**.

```hcl
resource "aws_cloudtrail" "data_events" {
  name           = "payments-evidence-data-trail"
  s3_bucket_name = aws_s3_bucket.trail.id
  kms_key_id     = aws_kms_key.trail.arn
  is_multi_region_trail      = true
  enable_log_file_validation = true

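  # Using advanced selectors overrides the default "log management events" behavior.
  # To keep management events, add a Management selector explicitly (an important trap).
  # Caveat: if another trail (such as org-audit-trail from §2) already logs management
  # events, this selector produces a billed second copy. In that case drop it, or attach
  # the data-event selector to the existing trail instead.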
  advanced_event_selector {
    name = "Log all management events"
    field_selector {
      field  = "eventCategory"
      equals = ["Management"]
    }
  }

  # Record only write-type object operations on the evidence bucket
  advanced_event_selector {
    name = "Audit writes on the payment evidence bucket only"
    field_selector {
      field  = "eventCategory"
      equals = ["Data"]
    }
    field_selector {
      field  = "resources.type"
      equals = ["AWS::S3::Object"]
    }
    field_selector {
      field       = "resources.ARN"
      starts_with = ["arn:aws:s3:::prod-payments-evidence/"]
    }
    field_selector {
      field  = "readOnly"
      equals = ["false"] # Exclude reads such as GetObject to keep cost down
    }
  }

  depends_on = [aws_s3_bucket_policy.trail]
}
```

> Basic event selectors cover only three resource kinds: S3 objects, Lambda functions, and DynamoDB tables. **The many other resource types (RDS, SQS, SNS, Bedrock, etc., continually expanding) and field-level narrowing are exclusive to advanced selectors.** CloudTrail Lake event data stores can also use only advanced selectors.
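
> To reason about what a selector will and will not record before you pay for it, the matching rule can be modeled locally: every field selector must match (AND), and within one field selector any listed value may match (OR). The sketch below is an illustrative model only. CloudTrail performs the real evaluation, and `FieldSelector`, `CandidateEvent`, and `selectorMatches` are hypothetical names with a flattened event shape.

```typescript
type FieldName = "eventCategory" | "resources.type" | "resources.ARN" | "readOnly";

type FieldSelector = { field: FieldName; equals?: string[]; startsWith?: string[] };

// Flattened stand-in for the event attributes the selector looks at.
type CandidateEvent = {
  eventCategory: string;
  readOnly: boolean;
  resourceType: string;
  resourceArn: string;
};

function fieldValue(e: CandidateEvent, field: FieldName): string {
  switch (field) {
    case "eventCategory": return e.eventCategory;
    case "resources.type": return e.resourceType;
    case "resources.ARN": return e.resourceArn;
    case "readOnly": return String(e.readOnly);
  }
}

// AND across field selectors, OR across the values inside each one.
function selectorMatches(selectors: FieldSelector[], e: CandidateEvent): boolean {
  return selectors.every((s) => {
    const v = fieldValue(e, s.field);
    if (s.equals && !s.equals.includes(v)) return false;
    if (s.startsWith && !s.startsWith.some((p) => v.startsWith(p))) return false;
    return true;
  });
}

// Mirrors the "payment evidence bucket, writes only" selector from the Terraform above.
const evidenceWrites: FieldSelector[] = [
  { field: "eventCategory", equals: ["Data"] },
  { field: "resources.type", equals: ["AWS::S3::Object"] },
  { field: "resources.ARN", startsWith: ["arn:aws:s3:::prod-payments-evidence/"] },
  { field: "readOnly", equals: ["false"] },
];

const putOnEvidence: CandidateEvent = {
  eventCategory: "Data",
  readOnly: false,
  resourceType: "AWS::S3::Object",
  resourceArn: "arn:aws:s3:::prod-payments-evidence/2026/receipt.pdf",
};
const recorded = selectorMatches(evidenceWrites, putOnEvidence);
const readSkipped = selectorMatches(evidenceWrites, { ...putOnEvidence, readOnly: true });
```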

---

## 5. CloudTrail Lake: Honestly About the Current State (New Customers Cut Off May 31, 2026)

CloudTrail Lake is a managed audit data lake you can query with **Trino SQL**. Where a trail "puts files in S3," Lake provides "**an immutable data store you can cross-analyze with SQL**."

But **since this concerns the article's trustworthiness, I'll state it plainly**. The official docs, as of June 2026, state clearly:

> AWS CloudTrail Lake will no longer be open to new customers starting May 31, 2026. If you would like to use CloudTrail Lake, sign up prior to that date. Existing customers can continue to use the service as normal.

That is, **new sign-ups ended as of May 31, 2026**. **Existing customers can keep using it as before**, but if you're newly building an audit-analysis foundation, the realistic choice is the **Athena + S3 setup of §4.3**. Articles that unconditionally recommend "use CloudTrail Lake" haven't accounted for this change.

For existing customers, let me nail down only the key points of Lake.

- **Event Data Store (EDS)** — an **immutable** collection of events chosen with advanced selectors. Encrypted by CloudTrail by default.
- **Retention period** — with "one-year extendable," **default 366 days, max 3,653 days (about 10 years)**; with "seven-year," **about 2,557 days (about 7 years)**. Supports long-term compliance retention.
- **SQL** — fully leverage Trino's `SELECT` syntax and functions. `JOIN` across multiple EDSs is also possible.
- **Natural-language query via generative AI (query generator)** — generates immediately usable SQL from an English prompt (**GA**). Meanwhile, the query-result summarization feature is in **preview** — these two differ in status, so don't conflate them.

```sql
-- Example: search across Lake (Trino) for the moment a trail was stopped
SELECT eventTime, userIdentity.arn, eventName, sourceIPAddress
FROM <event_data_store_id>
WHERE eventName IN ('StopLogging', 'DeleteTrail')
  AND eventTime > '2026-06-01 00:00:00'
ORDER BY eventTime DESC;
```

---

## 6. Log-File Integrity Validation: Guaranteeing Non-Repudiation

It is not enough for an audit log to simply **exist**; you must be able to prove that it was **neither tampered with nor deleted**. **Log-file integrity validation** provides that guarantee (already enabled in §2-4 with `enable_log_file_validation = true`).

The mechanism is solid.

- CloudTrail computes a **hash** of each delivered log file and, **every hour**, generates and delivers a **digest file** referencing that hour's logs.
- The digest file is **signed with CloudTrail's private key** and contains **the signature of the previous digest** — this forms a **chain** that can also detect the deletion of the digest file itself.
- The algorithms used are, per the docs, **hash = SHA-256, signature = SHA-256 with RSA**. With this, "**altering, deleting, or forging logs without being detected is computationally infeasible**."

Validation is a one-shot CLI command.

```bash
aws cloudtrail validate-logs \
  --trail-arn arn:aws:cloudtrail:us-east-1:111122223333:trail/org-audit-trail \
  --start-time 2026-06-25T00:00:00Z \
  --region us-east-1
```
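
For intuition about what `validate-logs` checks, the log-hash half of the mechanism can be sketched in a few lines of Python. This is a minimal illustration, not AWS's implementation: the function names are hypothetical, the data is synthetic, and it assumes the documented digest layout in which each `logFiles` entry carries an `s3Object` key and a hex `hashValue` computed over the uncompressed log content. It does not verify the RSA signature or the digest chain, so keep using `validate-logs` for real audits.

```python
import gzip
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def check_log_against_digest(digest: dict, s3_object: str, gz_log_bytes: bytes) -> bool:
    """Return True if the log's hash matches the digest entry for s3_object.

    Assumption: hashValue covers the *uncompressed* log content.
    """
    expected = {e["s3Object"]: e["hashValue"] for e in digest.get("logFiles", [])}
    if s3_object not in expected:
        # A log the digest never listed cannot be vouched for.
        return False
    actual = sha256_hex(gzip.decompress(gz_log_bytes))
    return actual == expected[s3_object].lower()


# Synthetic log and digest (hypothetical object key, not real CloudTrail output).
raw_log = json.dumps({"Records": [{"eventName": "StopLogging"}]}).encode()
key = "AWSLogs/111122223333/CloudTrail/us-east-1/2026/06/25/example.json.gz"
digest = {"logFiles": [{"s3Object": key, "hashValue": sha256_hex(raw_log)}]}

tampered = raw_log.replace(b"StopLogging", b"DescribeTrails")
print(check_log_against_digest(digest, key, gzip.compress(raw_log)))   # intact log
print(check_log_against_digest(digest, key, gzip.compress(tampered)))  # edited log
```

The intact log matches its digest entry and the edited one does not, which is the property the hourly digest gives you for every delivered file.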

> **Why it matters**. As the docs say, a validated log can **affirmatively assert** "that the log file was not altered" and "that a particular credential performed a particular API activity" — the heart of forensics and non-repudiation. When you say "we have audit logs" for a system handling payments or personal information, what truly has meaning is "we have **verifiable** audit logs."

---

## 7. The Reality of Pricing and Cost Optimization

CloudTrail can be either "nearly free" or "expensive before you notice." Pin down the boundaries precisely (prices are for **us-east-1 as of 2026** and are subject to revision, so confirm on the [official pricing page](https://aws.amazon.com/cloudtrail/pricing/)).

| Item | Price (USD; 100K = 100,000 events) | Default |
| --- | --- | --- |
| Management events | **The first copy is free per region** / from the 2nd copy **$2.00 / 100K** | Recording ON |
| Data events | **$0.10 / 100K (billed from the first copy)** / aggregation +$0.03 / 100K | OFF |
| Insights events | management **$0.35 / 100K**, data **$0.03 / 100K** (per insight type, per analysis target) | OFF |
| Network activity | **$0.10 / 100K** | OFF |
| CloudTrail Lake ingestion | one-year extendable **$0.75/GB** (CloudTrail events) / seven-year is tiered (**$2.50/GB** for the first 5 TB, **$1.00/GB** for the next 20 TB, **$0.50/GB** beyond 25 TB) | n/a |
| CloudTrail Lake query | **$0.005 / GB scanned** | n/a |
| S3 / CloudWatch Logs / KMS / SNS / Athena | **separately metered** by each service | n/a |

The official wording, precisely.

> The first copy of management events within each region is delivered free of charge. ... For data events, all deliveries incur CloudTrail costs, including the first.

From this, three cost rules that matter in the field:

1. **A single management-event trail (single-region or multi-region) is close to free.** The only charges are S3 storage (usually a few cents to a few dollars a month) and, if you use KMS, a small amount in KMS request fees. **So cost is no reason to put off creating that first trail.**
2. **The "2nd copy" trap.** As in the official example, **if a multi-region trail already exists and you add a single-region trail that records the same management events, the duplicate copy in that region is billed**. Overlap between an organization trail and member-account trails works the same way. Add one trail "for auditing" and another "for developers," and the 2nd-copy charges pile up before you notice.
3. **Beware the explosion of KMS events.** The docs warn about this too: heavy use of SSE-KMS on S3 generates **a large volume of KMS management events** in CloudTrail and pushes up cost. You can drop the noise with **"Exclude AWS KMS events" / "Exclude Amazon RDS Data API events"** at trail creation (advanced selectors can filter both management and data events).
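
The first two rules reduce to simple arithmetic. The sketch below, assuming the us-east-1 rates from the table above and hypothetical trail names and event counts, estimates the monthly CloudTrail charge for a set of trails. It deliberately ignores S3, KMS, Insights, and Lake charges, and it treats whichever trail is listed first as the holder of each region's free copy, which is a simplification of how billing attributes copies.

```python
from dataclasses import dataclass, field

# Rates taken from the pricing table above (us-east-1, USD per 100,000 events).
# Confirm against the official pricing page before relying on them.
MGMT_ADDITIONAL_COPY_RATE = 2.00
DATA_EVENT_RATE = 0.10


@dataclass
class Trail:
    """Hypothetical monthly usage for one trail."""
    name: str
    mgmt_events_by_region: dict = field(default_factory=dict)  # region -> count
    data_events: int = 0


def estimate_monthly_cost(trails):
    """Return the estimated CloudTrail event charge in USD for one month."""
    regions_with_free_copy = set()
    total = 0.0
    for trail in trails:
        for region, count in trail.mgmt_events_by_region.items():
            if region in regions_with_free_copy:
                # A 2nd (or later) copy of management events in this region.
                total += count / 100_000 * MGMT_ADDITIONAL_COPY_RATE
            else:
                # The first copy per region is free.
                regions_with_free_copy.add(region)
        # Data events are billed from the first copy.
        total += trail.data_events / 100_000 * DATA_EVENT_RATE
    return round(total, 2)


org_trail = Trail("org-audit-trail", {"us-east-1": 1_000_000, "us-west-2": 500_000})
dev_trail = Trail("dev-debug-trail", {"us-east-1": 1_000_000}, data_events=20_000_000)

print(estimate_monthly_cost([org_trail]))             # first copies only
print(estimate_monthly_cost([org_trail, dev_trail]))  # adds a 2nd copy and data events
```

With these made-up numbers, the single trail incurs no CloudTrail event charge, while the overlapping trail adds $20 for the duplicate management events in us-east-1 plus $20 for its data events.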

> Weigh the detection value of data events and Insights against their cost, and enable them **surgically as in §4.4**. The moment you log everything across all resources, reads included, the bill grows by an order of magnitude.

---

## 8. The Security Best-Practices Checklist (Official)

The official "Security best practices in AWS CloudTrail" is organized into two categories, **Detective** and **Preventative**. You can use it as-is as the final pre-production checklist.

**Detective**

1. **Create a trail** — event history (90 days, management events only) is not a permanent record. A trail is the premise.
2. **Make it a multi-region trail** — capture all regions + global service events. Continuously monitor with the AWS Config rule `multi-region-cloud-trail-enabled`.
3. **Enable log-file integrity validation** — detect tampering, deletion, and delivery gaps with SHA-256 / SHA-256 with RSA (§6).
4. **Integrate with CloudWatch Logs** — monitoring and alerting for specific events (§4.2). Monitor with `cloud-trail-cloud-watch-logs-enabled`.
5. **Use GuardDuty** — ML-based threat detection. Continuously analyzes multiple logs including CloudTrail.
6. **Use Security Hub (CSPM)** — evaluate configuration with detective controls.

**Preventative**

7. **Aggregate into a dedicated, centralized S3 bucket** — a log-archive-dedicated account + a centralized bucket. With Organizations, an organization trail.
8. **Encrypt with SSE-KMS** so that you control the key: log files are encrypted with SSE-S3 by default, so specify a customer managed KMS key (§2-3). Monitor with `cloud-trail-encryption-enabled`.
9. **Add condition keys to the SNS topic policy** — add `aws:SourceArn` (optionally `aws:SourceAccount`) to prevent unauthorized access.
10. **Least privilege on the log-storage bucket** — review the bucket policy and restrict with the `aws:SourceArn` condition (§2-2).
11. **Enable S3 MFA Delete** — additional authentication for version deletion and versioning changes (not usable together with lifecycle).
12. **Object lifecycle management** — implement retention policies with lifecycle rules (e.g., move to an archive tier after one year).
13. **Restrict the grant of `AWSCloudTrail_FullAccess`** — holders of this policy can disable or reconfigure auditing. Limit to the minimum number of administrators.
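
Several of these items can be checked mechanically. As a minimal sketch (the function name and finding messages are hypothetical), the following Python inspects one entry of the `trailList` array returned by `aws cloudtrail describe-trails` and reports which of items 2, 3, 4, and 8 are not satisfied. The remaining items concern bucket, SNS, and IAM policies and are out of its scope, and the AWS Config rules named above remain the continuous control.

```python
def audit_trail(trail: dict) -> list:
    """Return checklist findings for one describe-trails entry (empty = pass)."""
    checks = [
        ("IsMultiRegionTrail", "item 2: trail is not multi-region"),
        ("LogFileValidationEnabled", "item 3: log-file integrity validation is off"),
        ("CloudWatchLogsLogGroupArn", "item 4: no CloudWatch Logs integration"),
        ("KmsKeyId", "item 8: no SSE-KMS key configured"),
    ]
    # A missing key, False, or an empty string all count as "not satisfied".
    return [message for key, message in checks if not trail.get(key)]


# Synthetic example shaped like a describe-trails entry (values are made up).
sample = {
    "Name": "org-audit-trail",
    "IsMultiRegionTrail": True,
    "LogFileValidationEnabled": True,
    "KmsKeyId": "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE",
}
for finding in audit_trail(sample):
    print(finding)
```

To run it against a real account, load the JSON from `aws cloudtrail describe-trails --output json` and call `audit_trail` on each element of `trailList`.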

> CloudTrail stands on the same philosophy as [defense-in-depth with WAF](/blog/waf-defense-in-depth-aws-waf-cloud-armor-owasp-guide), [IAM least privilege for DynamoDB](/blog/dynamodb-security-iam-fine-grained-access-control-encryption-vpc-endpoint-guide), and [keyless CI/CD with OIDC](/blog/github-actions-oidc-keyless-cicd-aws-gcp-guide) — "**don't trust the client / defend at an unbreakable layer**." Within that, CloudTrail is the last bastion of detective controls that "**proves, non-repudiably and after the fact, what happened**."

---

## 9. Summary: A CloudTrail Design Cheat Sheet

| Question | Conclusion |
| --- | --- |
| What to do first? | **One multi-region trail.** Permanent S3 delivery + SSE-KMS + integrity validation + a least-privilege bucket policy |
| Is event history enough? | No. It covers 90 days, management events only, in a single region. **Not a permanent record** |
| What's the cost boundary? | The first copy of management events is free per region / **data events are billed from the first copy**. Beware 2nd-copy overlap and KMS-event explosion |
| Real-time detection? | **EventBridge → Lambda/SNS.** Top priority is detecting "trail stop (evidence destruction)" |
| Alerting on aggregated counts? | **CloudWatch Logs metric filters + alarms** (official examples: SG change, sign-in failure, IAM policy change) |
| Investigating beyond 90 days? | **Athena + partition projection.** Cut scan volume (= cost) with projection columns in the `WHERE` |
| Proving it's untampered? | **Log-file integrity validation** (digest / SHA-256 / `validate-logs`). The heart of non-repudiation |
| What about CloudTrail Lake? | **Closed to new customers as of May 31, 2026** (existing customers can continue). For new builds, Athena + S3 is the realistic answer |
| How to add data events? | **Surgically with advanced selectors.** Narrow to target resources and writes only |

CloudTrail is not an "enable it and you're done" service. Only by assembling **trail design, encryption, integrity, detection, investigation, and cost** into a single audit foundation can you answer "who did what, when" instantly, even on the night of an incident, and withstand a compliance audit.

On a serverless payment platform, I designed the system from the start so that "**correctness can be proven with code and an audit trail**," and it has maintained zero double-charges in production. Working as **one person × generative AI (Claude Code)**, I build production-quality AWS audit and security foundations like these quickly, safely, and in a verifiable form. If you're struggling with AWS audit and governance design, feel free to reach out via [Contact](/contact).
