# The Difference Between AWS CloudTrail, CloudWatch, and AWS Config and How to Use Them (2026 Edition): Recording Who, What, and How It's Running with the Right Service

> This guide explains, faithful to the official documentation, how the roles of CloudTrail (who called which API: audit), CloudWatch (metrics, logs, alarms: performance and operations), and AWS Config (resource configuration and compliance state) differ. It covers the common misconceptions, how to combine the three, their billing models, and implementation examples, and it makes the distinction stick by following a single change across all three services.

- Published: 2026-06-27
- Author: 友田 陽大
- Tags: AWS, CloudTrail, CloudWatch, AWS Config, observability, architecture design
- URL: https://tomodahinata.com/en/blog/aws-cloudtrail-vs-cloudwatch-config-difference-when-to-use-guide
- Category: AWS CloudTrail audit & governance
- Pillar guide: https://tomodahinata.com/en/blog/aws-cloudtrail-audit-logging-governance-security-guide

## Key points

- CloudTrail = who called what API, when, from where (the audit log of account activity). It doesn't measure performance or metrics
- CloudWatch = how the system is running (monitoring and observability via metrics, logs, alarms, dashboards)
- AWS Config = what configuration a resource is in now, how it changed in the past, and whether it's compliant (the history and evaluation of configuration items)
- The three are complementary, not competing. You can follow a single change from three angles: CloudTrail (who), Config (into what configuration), and CloudWatch (how you notice)
- When unsure, choose by purpose: audit and "who" → CloudTrail / performance, logs, alarms → CloudWatch / configuration compliance and drift → Config

---

Overnight, someone opened a production security group to `0.0.0.0/0`. When you notice it the next morning, which service's console do you open first?

If you want to know "who did it," open CloudTrail. If you want to know "what is the setting now, and does it violate company rules," open AWS Config. If you're asking "how could I have noticed the moment it changed," that's CloudWatch (and EventBridge). The same incident leaves evidence from a different angle in each of the three services.

If you "just turn everything on" without telling these apart, you land in the worst pattern: costs balloon, yet the record you need is nowhere to be found when it matters. Assign the roles correctly, and audit, observability, and configuration compliance divide the work cleanly, so both investigation and remediation get faster.

I designed and led the reliability layer of a serverless payment platform (Lambda + DynamoDB) and kept double charges at zero in production. From that experience, I can say that the division-of-labor design, deciding which service records what, is exactly what determines the speed of incident response and governance. This article distills the differences among the three and how to use each, staying faithful to the official documentation, to a level where you won't get lost in the field.

> The overall picture of CloudTrail itself (Terraform setup, event types, how to read JSON records, integrity verification, security best practices) is consolidated in the pillar article [CloudTrail Audit Logging, Governance, and Security Complete Guide](/blog/aws-cloudtrail-audit-logging-governance-security-guide). This article specializes in "the usage distinction among the three," so refer there for the basic setup.

## 0. First, fix the three in one line each (mental model)

90% of the confusion comes from starting design work while these 3 lines are still fuzzy. Memorize them first.

- **CloudTrail = who, when, from where, called what API** (the audit log of account activity)
- **CloudWatch = how the system is running now** (monitoring / observability via metrics, logs, alarms, dashboards)
- **AWS Config = what configuration a resource is in now, how it changed in the past, and whether it's compliant** (snapshots and history of configuration items, rule evaluation)

The official definitions are as follows.

- AWS Config: "AWS Config provides a detailed view of the configuration of AWS resources in your AWS account. This includes how the resources are related to one another and how they were configured in the past so that you can see how the configurations and relationships change over time." ([What Is AWS Config?](https://docs.aws.amazon.com/config/latest/developerguide/WhatIsConfig.html))
- Amazon CloudWatch: "Amazon CloudWatch monitors your Amazon Web Services (AWS) resources and the applications you run on AWS in real time, and offers many tools to give you system-wide observability of your application performance, operational health, and resource utilization." ([What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html))
- AWS CloudTrail: records API and non-API activity in the account as "events," used for operational audit, risk audit, governance, and compliance. It keeps a record of who called which API, when, and from where ([What Is AWS CloudTrail?](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-user-guide.html)).

### An instant-answer table to "which monitoring service should I use?"

The table below maps what you want to know to the service you should open.

| The question you want to answer | The service to use |
| --- | --- |
| Who changed/deleted this resource | CloudTrail |
| When and from where (IP/region) was it operated | CloudTrail |
| Did this API call succeed, or was it denied | CloudTrail |
| How are CPU/memory/latency/error rate trending | CloudWatch (metrics) |
| I want to search/aggregate app or service logs | CloudWatch Logs (Logs Insights) |
| I want to notify / auto-respond when a threshold is exceeded | CloudWatch alarms |
| What configuration is this resource in now | AWS Config |
| How did this setting change compared to 1 month ago | AWS Config (configuration history timeline) |
| Is it compliant with company rules / compliance | AWS Config Rules / conformance packs |
| I want to know the dependencies between resources (SG↔EC2↔EBS) | AWS Config (relationships) |
| I want to trace a request in a distributed system | AWS X-Ray |
| I want to see the VPC's network flows | VPC Flow Logs |
| I want to detect threats / suspicious behavior | Amazon GuardDuty (takes CloudTrail etc. as input) |

If this table sinks in, you've understood 80% of this article. The rest works through why the three get confused, how to combine them, and what they cost.

The adjacent services in the lower rows of the table complete the map once you position them relative to the three. **X-Ray** is distributed tracing that follows requests through an application end to end; it extends CloudWatch's observability into the app layer. **VPC Flow Logs** records L3/L4 network flows (which IP talked to which port), a layer separate from CloudTrail's "API operations." **GuardDuty** is a threat-detection service that takes CloudTrail logs, VPC Flow Logs, and DNS logs as **input** to find suspicious behavior. In other words, GuardDuty is analysis that sits on top of CloudTrail, not a replacement for it. Keep this sense of layers in mind, and you can think in terms of "which layer to thicken" rather than "which one to install."

Abstracting one more level: CloudTrail mainly watches the control plane (management operations that create, change, and delete resources; with data events enabled, it also covers data-plane access), CloudWatch watches the runtime behavior of apps and infrastructure, and Config watches the static state, that is, a resource's configuration. "Operation," "behavior," "state": remembering the three by these words is the hardest to forget.

## 1. The difference in roles in one table

Line up the three on the same axes, and it's clear they're complementary, not competing.

| Viewpoint | CloudTrail | CloudWatch | AWS Config |
| --- | --- | --- | --- |
| The true nature in one phrase | The audit log of account activity | A monitoring/observability platform | The recording and evaluation of resource configuration |
| The question it answers | Who did what API | How the system is running | What configuration it's in now and is it compliant |
| The unit of recording | An event (API/non-API activity) | Metrics, log events | A configuration item |
| The time axis | A record of the moment the operation happened | A continuous time series, log streams | Snapshot history per change |
| Main users | Security, audit, governance | SRE, development, operations | Governance, compliance, operations |
| Typical query target | "Who opened this SG" | "What caused the latency to rise" | "Is this SG a rule violation, and when did it change" |
| Does it measure performance (CPU/latency) | Doesn't measure | Measures (a core feature) | Doesn't measure |
| Does it hold the "subject (who)" of a change | Holds (userIdentity) | Basically doesn't hold | Holds the configuration but not "who" |
| Default retention/delivery | Event history: 90 days, management events only, free. Continuous storage to S3 requires a Trail | Logs are stored (and billed) for the retention period you configure; metrics are retained on a fixed schedule | Delivers configuration history to S3 and retains the timeline |
| Compliance evaluation feature | None (devotes itself to recording) | None (devotes itself to monitoring) | Has it (Config Rules / conformance packs) |

The catch is that **the words "audit" and "governance" appear around all three, each in a different sense.** CloudTrail's audit is the audit of operations, Config's audit is the audit of configuration, and CloudWatch is monitoring rather than audit. Because the same word points to different things, it's more reliable to distinguish by the unit of recording than by the word: CloudTrail records events, CloudWatch records metrics and logs, Config records configuration items. Remember just this and you'll rarely go wrong.

> The difference between CloudTrail's event history (free, 90 days, management events only, not permanent) and continuous delivery to S3 via a Trail is detailed in the pillar article. In this article, it's enough to grasp just "event history is a short-term, free viewing window; continuous storage and analysis presume a Trail."

### Grasp the outline by "what it doesn't do"

When feature overlap confuses you, checking **what each service doesn't do** sharpens the outline. Use it as a reverse lookup so you can say "that's not that service's job" in a design review.

| Service | What it doesn't do (don't expect this of it) |
| --- | --- |
| CloudTrail | Measuring performance, latency, error rate. Listing a resource's "current configuration." Evaluating compliant/noncompliant. Collecting an app's standard-output logs |
| CloudWatch | Identifying "who operated" (it doesn't hold subject info). Managing resource-configuration history. Compliance evaluation |
| AWS Config | Identifying "who made that change" (the operation subject is on the CloudTrail side). Measuring performance metrics. Storing app logs |

For example, "I enabled Config, so I'll know who changed it" is a misconception. Config records "what changed" in a configuration item but doesn't hold the operation subject (who). To know "who," you need CloudTrail's `userIdentity`. Conversely, "I enabled CloudTrail, so I can take inventory of how many holes are open in the account right now" is also a misconception. That's the job of Config, which can run configuration queries across resources. Staying conscious of this "boundary of the information held" naturally shifts you to using the three additively.

## 2. Why the three get confused

There are 3 clear reasons the three still get mixed up even when they're separated logically. Put these into words, and team discussions line up quickly.

### Reason 1: the word "log" points to 3 different things

When someone says "I want to see the logs," it can point to at least 3 different things.

- **Operation logs** (who called what API) → CloudTrail
- **App/system logs** (standard output, access logs, Lambda execution logs) → CloudWatch Logs
- **Configuration-change logs** (how a resource's settings changed) → AWS Config's configuration history

The moment "log" becomes the topic, the conversation stops connecting, because each speaker is picturing a different one of these 3. Pinned down with concrete examples, it looks like this.

| What "log" points to | Concrete example | The correct place |
| --- | --- | --- |
| Operation log | "Who called DeleteBucket" | CloudTrail |
| App/system log | "Lambda spat a NullPointerException" | CloudWatch Logs |
| Configuration-change log | "When did this RDS's encryption setting change" | AWS Config configuration history |

In a design meeting, always distinguish by saying "operation log / app log / configuration history." Just this eliminates the gap in perception and dramatically reduces the back-and-forth of "where should I look."

### Reason 2: you can flow CloudTrail into CloudWatch Logs

This is the biggest source of confusion. As a delivery destination, a CloudTrail Trail can choose not only S3 but also a **CloudWatch Logs log group.** So it's perfectly normal for "CloudTrail's logs to appear on the CloudWatch screen." This is where the misconception "then aren't CloudTrail and CloudWatch the same thing?" comes from.

The correct understanding is this. **CloudTrail is the producer of the record, and CloudWatch Logs is merely one of its delivery destinations and analysis platforms.** You send CloudTrail to CloudWatch Logs so that metric filters and alarms can notify you immediately when a specific operation happens; CloudWatch does not take over CloudTrail's role. The two are separate services that share a contact point. The implementation of this contact point is shown concretely in §4.

> Note: CloudWatch's old "CloudWatch Events" is now provided as Amazon EventBridge. When this article touches on event-driven detection, it refers to EventBridge.

### Reason 3: all three are introduced as "usable for security"

Read the official documentation and CloudTrail, Config, and CloudWatch are each written as "useful for security analysis." Config's official page also has a "Security Analysis" section, explicitly stating you can retroactively investigate past IAM policies and SG port-opening states. This "they all work for security" explanation generates the question "then what's the difference?" for a beginner.

To organize, even with the same security purpose, the angle differs. CloudTrail is "was there an illegitimate **operation**," Config is "is it in a dangerous **configuration** / what configuration was it in the past," and CloudWatch is "make it possible to **notice** an anomaly." The three are climbing the mountain called security from different trailheads; they share the summit (safe operation). So "which to put in for security" is not an either-or but, correctly, layering per purpose layer.

## 3. A real example of following one change across the three

Enough abstraction. Let's look concretely at how each of the three records the incident from the introduction: "someone opened a production security group to `0.0.0.0/0`." This is the core of this article.

Suppose the operation in question was the Terraform diff below. It could just as well have been opened by hand from the console; from CloudTrail's viewpoint, it's the same `AuthorizeSecurityGroupIngress` API either way.

```hcl
resource "aws_security_group_rule" "ssh_from_anywhere" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"] # should have been the internal CIDR only
  security_group_id = aws_security_group.app.id
}
```

### 3-1. CloudTrail: who, when, from where (AuthorizeSecurityGroupIngress)

This operation is recorded in CloudTrail as the EC2 management event `AuthorizeSecurityGroupIngress`. Look at `userIdentity`, and you can identify exactly "who."

```json
{
  "eventVersion": "1.09",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAEXAMPLEID:alice",
    "arn": "arn:aws:sts::123456789012:assumed-role/Developer/alice",
    "accountId": "123456789012",
    "sessionContext": {
      "attributes": { "mfaAuthenticated": "false" }
    }
  },
  "eventTime": "2026-06-26T18:42:11Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "AuthorizeSecurityGroupIngress",
  "awsRegion": "ap-northeast-1",
  "sourceIPAddress": "203.0.113.42",
  "requestParameters": {
    "groupId": "sg-0abc123def4567890",
    "ipPermissions": {
      "items": [
        {
          "ipProtocol": "tcp",
          "fromPort": 22,
          "toPort": 22,
          "ipRanges": { "items": [ { "cidrIp": "0.0.0.0/0" } ] }
        }
      ]
    }
  },
  "responseElements": { "_return": true }
}
```

What you can read here is that `alice`, having assumed the `Developer` role, opened the SSH port to the world from IP `203.0.113.42` at 2026-06-26 18:42 UTC, and did so **without MFA.** `userIdentity.type` is `AssumedRole` (mind the spelling: valid values include `Root`, `IAMUser`, `AssumedRole`, `Role`, `FederatedUser`, `AWSService`, and `IdentityCenterUser`; there is no `IAMIdentityCenter` value).

CloudTrail answers this much and no more: **who, when, from where, which API.** It doesn't answer "so what state is the SG in now?" or "is that a rule violation?" That's Config's territory.
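
As a minimal sketch of reading those fields programmatically, the snippet below pulls who / when / from where / which API out of a trimmed copy of the record above. The `summarize` helper and its output keys are hypothetical names chosen for illustration; only the CloudTrail field names come from the record itself.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD = json.loads("""
{
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::123456789012:assumed-role/Developer/alice",
    "sessionContext": {"attributes": {"mfaAuthenticated": "false"}}
  },
  "eventTime": "2026-06-26T18:42:11Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "AuthorizeSecurityGroupIngress",
  "sourceIPAddress": "203.0.113.42",
  "requestParameters": {"groupId": "sg-0abc123def4567890"}
}
""")


def summarize(record: dict) -> dict:
    """Pull the who / when / from where / which API fields out of one record."""
    identity = record.get("userIdentity", {})
    attributes = identity.get("sessionContext", {}).get("attributes", {})
    return {
        "who": identity.get("arn"),
        "identity_type": identity.get("type"),
        # In the record above this flag is the string "false", not a boolean.
        "mfa": attributes.get("mfaAuthenticated") == "true",
        "when": record.get("eventTime"),
        "from_ip": record.get("sourceIPAddress"),
        "api": record.get("eventName"),
        "target": (record.get("requestParameters") or {}).get("groupId"),
    }


summary = summarize(RECORD)
print(summary)
```

Note what is absent from the summary: nothing in it says what the security group looks like now, or whether that state is allowed.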

### 3-2. AWS Config: what configuration it became, and is it a violation (configuration item + rule evaluation)

AWS Config records the same change as a "configuration item." Config's configuration item is officially defined as "a point-in-time view of the various attributes of a supported AWS resource," including metadata, attributes, relationships, the current configuration, and related events ([AWS Config terminology and concepts](https://docs.aws.amazon.com/config/latest/developerguide/config-concepts.html)).

In other words, Config retains **the configuration itself** ("this SG currently opens port 22 to `0.0.0.0/0`"), and from the diff against past configuration items you can follow on a timeline **when it changed to that configuration.** Where CloudTrail records "the point of the operation," Config records "the state of the configuration and how that state changed."

Here is where Config's distinctive value comes in. With Config Rules enabled, this configuration is automatically evaluated as compliant or noncompliant. A representative managed rule that rejects SSH open to the world is `restricted-ssh`. Configure it in Terraform like this.

```hcl
resource "aws_config_config_rule" "restricted_ssh" {
  name = "restricted-ssh"

  source {
    owner             = "AWS"
    source_identifier = "INCOMING_SSH_DISABLED" # identifier of the restricted-ssh managed rule
  }

  # Limit evaluation to security groups
  scope {
    compliance_resource_types = ["AWS::EC2::SecurityGroup"]
  }

  depends_on = [aws_config_configuration_recorder.main]
}
```

In this state, when someone opens an SG to `0.0.0.0/0:22`, Config marks that SG **NON_COMPLIANT** as soon as the rule evaluates the recorded change. This is the behavior the official documentation describes: "When AWS Config detects that a resource violates the conditions in one of your rules, AWS Config flags the resource as noncompliant and sends a notification."
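
To make the idea behind such a rule concrete, here is a rough sketch of the kind of check it performs. This is an illustration only, not the managed rule's actual implementation; the simplified permission shape (`ipProtocol`, `fromPort`, `toPort`, `cidrs`) and the function name `allows_world_ssh` are assumptions made for the example.

```python
from ipaddress import ip_network


def allows_world_ssh(permissions: list[dict]) -> bool:
    """Return True if any ingress permission exposes TCP port 22 to every address."""
    for perm in permissions:
        protocol = perm.get("ipProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            covers_ssh = True
        elif protocol == "tcp":
            covers_ssh = perm.get("fromPort", 0) <= 22 <= perm.get("toPort", 65535)
        else:
            covers_ssh = False
        if not covers_ssh:
            continue
        for cidr in perm.get("cidrs", []):
            # A prefix length of 0 covers the whole address space (0.0.0.0/0 or ::/0).
            if ip_network(cidr).prefixlen == 0:
                return True
    return False


opened = [{"ipProtocol": "tcp", "fromPort": 22, "toPort": 22, "cidrs": ["0.0.0.0/0"]}]
internal = [{"ipProtocol": "tcp", "fromPort": 22, "toPort": 22, "cidrs": ["10.0.0.0/8"]}]

print("NON_COMPLIANT" if allows_world_ssh(opened) else "COMPLIANT")
print("NON_COMPLIANT" if allows_world_ssh(internal) else "COMPLIANT")
```

The check looks only at the resulting configuration. Who made the change never enters into it, which is exactly the boundary between Config and CloudTrail.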

When you want to take inventory of "how many SGs in my account are open to `0.0.0.0/0` right now," you can query across resources with Config's Advanced Query (a SQL-like syntax).

```sql
SELECT
  resourceId,
  resourceName,
  configuration.ipPermissions
WHERE
  resourceType = 'AWS::EC2::SecurityGroup'
  AND configuration.ipPermissions.ipRanges = '0.0.0.0/0'
```

CloudTrail can tell you "who," but it can't produce "how many holes are open across the whole account right now." That is Config's domain alone.

Config also records the relationships between resources. As the official documentation puts it, "AWS Config discovers AWS resources in your account and then creates a map of relationships between AWS resources," so you can trace, for example, "which EC2 instances this SG is attached to." That lets you grasp the blast radius at once: which production instances the opened SG is actually exposing. This information never comes out of a single CloudTrail event.

Open the configuration-history timeline, and you can see the state transition over time: "this SG allowed only the internal CIDR until yesterday, but at 18:42 it changed to a configuration that includes `0.0.0.0/0`." CloudTrail's "point of the operation" and Config's "state of the configuration and its transition" become a complete story only when cross-checked like this: who (CloudTrail), at what time and into what configuration (Config's timeline), exposing which instances (Config's relationships).
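
The cross-check can be sketched in a few lines. The snippet assumes you have already exported CloudTrail events into flat dictionaries and know the time at which Config captured the new configuration. The capture time, the 15-minute window, the `bob` entry, and the helper names are all hypothetical values for illustration.

```python
from datetime import datetime, timedelta


def parse(ts: str) -> datetime:
    # Timestamps like the CloudTrail eventTime end in "Z"; fromisoformat wants an offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_cause(change_time, resource_id, events, window=timedelta(minutes=15)):
    """Return the latest event on resource_id at or before change_time, within window."""
    changed_at = parse(change_time)
    candidates = [
        e for e in events
        if e["groupId"] == resource_id
        and timedelta(0) <= changed_at - parse(e["eventTime"]) <= window
    ]
    return max(candidates, key=lambda e: parse(e["eventTime"]), default=None)


# Flattened events (hypothetical export format; only the alice entry is from the article).
EVENTS = [
    {"eventTime": "2026-06-26T09:05:00Z", "groupId": "sg-0abc123def4567890",
     "eventName": "RevokeSecurityGroupIngress", "who": "assumed-role/Developer/bob"},
    {"eventTime": "2026-06-26T18:42:11Z", "groupId": "sg-0abc123def4567890",
     "eventName": "AuthorizeSecurityGroupIngress", "who": "assumed-role/Developer/alice"},
    {"eventTime": "2026-06-26T18:42:30Z", "groupId": "sg-0ffffffffffffffff",
     "eventName": "AuthorizeSecurityGroupIngress", "who": "assumed-role/Developer/bob"},
]

# Hypothetical time at which Config captured the new configuration item.
cause = find_cause("2026-06-26T18:42:40Z", "sg-0abc123def4567890", EVENTS)
print(cause["who"] if cause else "no matching CloudTrail event")
```

Matching on resource ID plus a short time window is a heuristic. When several changes land on the same resource within the window, comparing the request parameters against the configuration diff is the safer tie-breaker.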

### 3-3. CloudWatch / EventBridge: how to notice in that moment

CloudTrail and Config are "evidence you can follow later," but what really counts in incident response is noticing the moment the change happens. This is where CloudWatch (and EventBridge) comes in.

There are 2 paths.

1. **CloudTrail → CloudWatch Logs → metric filter → alarm**: flow CloudTrail into CloudWatch Logs, create a metric filter that increments a count when it matches the pattern `AuthorizeSecurityGroupIngress`, and notify with an alarm when it becomes greater than 0.
2. **EventBridge rule**: with CloudTrail's API call as the event source, directly fire a notification or a Lambda when it matches a specific API name.

The implementation of the CloudWatch path is shown in complete form in §4. What to grasp here is the division of labor. **CloudWatch handles "noticing (detection, notification, visualization)," and CloudTrail/Config handle "leaving evidence."** For the same accident, Config statically shows it as "noncompliant," and CloudWatch dynamically shouts "it just happened."

To summarize so far, against the one accident of an SG opening, the three divide labor like this.

| Service | What it leaves / does against this accident |
| --- | --- |
| CloudTrail | Who (alice/Developer role, no MFA), when, from which IP, with which API opened it |
| AWS Config | The SG's new configuration, the noncompliant judgment by the restricted-ssh rule, the timeline of when it became that configuration |
| CloudWatch / EventBridge | The metric-filter firing and alarm notification at the moment it was opened, the trigger for auto-response |

Try to replace the three with "just one," and some row of this table always goes blank. That's why they're complementary.

Connect this incident into one line of investigation, and it reads like this. CloudWatch (or EventBridge) **shouts** "the SG opened at 18:42," CloudTrail **points to the culprit** "it was alice in the Developer role, without MFA, from `203.0.113.42`," and Config **reports the damage and the history** "that SG is now noncompliant with `restricted-ssh`, exposes 3 production instances, and was not in this configuration until yesterday." Only when each of the three plays to its strength does the full account of the incident come together: when, who, what, into what configuration, and with how much impact. A design that leans on one service always leaves a gap somewhere in that sentence.

## 4. Combination design (where to place detection)

Once you understand the roles, implementation is next. This section shows how to design "detecting a change" with the most common pattern: starting from CloudTrail and connecting it to CloudWatch.

### 4-1. CloudTrail → CloudWatch Logs metric filter + alarm

First, deliver the Trail to a CloudWatch Logs log group (this assumes S3 delivery is already set up separately; for the basic setup, see the [pillar article](/blog/aws-cloudtrail-audit-logging-governance-security-guide)). Then attach a metric filter and an alarm on top of it.

```hcl
# 1) Log group that receives the CloudTrail logs
resource "aws_cloudwatch_log_group" "trail" {
  name              = "/aws/cloudtrail/security-events"
  retention_in_days = 365
}

# 2) Metric filter that counts SG changes
#    Catches Authorize/Revoke of SG ingress and egress, plus Create/DeleteSecurityGroup
resource "aws_cloudwatch_log_metric_filter" "sg_changes" {
  name           = "security-group-changes"
  log_group_name = aws_cloudwatch_log_group.trail.name

  pattern = <<-PATTERN
    { ($.eventName = AuthorizeSecurityGroupIngress) ||
      ($.eventName = AuthorizeSecurityGroupEgress) ||
      ($.eventName = RevokeSecurityGroupIngress) ||
      ($.eventName = RevokeSecurityGroupEgress) ||
      ($.eventName = CreateSecurityGroup) ||
      ($.eventName = DeleteSecurityGroup) }
  PATTERN

  metric_transformation {
    name          = "SecurityGroupEventCount"
    namespace     = "CloudTrailMetrics"
    value         = "1"
    default_value = "0"
  }
}

# 3) Alarm that notifies when even one such event occurs
resource "aws_cloudwatch_metric_alarm" "sg_changes" {
  alarm_name          = "security-group-changes"
  namespace           = "CloudTrailMetrics"
  metric_name         = "SecurityGroupEventCount"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  threshold           = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```

Detecting security-group changes is one of the 3 examples AWS officially lists for CloudTrail × CloudWatch metric filters. The official examples are (1) security-group changes, (2) console sign-in failures, and (3) IAM policy changes. Note that standard rules such as "detect root-account use" come from the CIS AWS Foundations Benchmark, a different origin from this basic AWS tutorial, so don't conflate the two and claim that the official documentation illustrates 4.

With the same mechanism, you can count events where `ConsoleLogin` has the `errorMessage` "Failed authentication" to catch brute-force login attempts, and count IAM's `PutGroupPolicy` / `PutUserPolicy` / `AttachRolePolicy`, etc. to catch privilege changes, alerting on each. All three follow the same pattern, "count when a specific management event flows into the log group, then fire at a threshold," so once you've built the SG example you can extend it sideways to the rest.
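
The shared shape of the three detectors can be sketched as plain predicates over events. This only mimics the "count matches, fire at a threshold" logic locally and is not how CloudWatch evaluates filters internally. The detector names and the `firing` helper are hypothetical, and the IAM set contains only the three calls named above.

```python
SG_EVENTS = {
    "AuthorizeSecurityGroupIngress", "AuthorizeSecurityGroupEgress",
    "RevokeSecurityGroupIngress", "RevokeSecurityGroupEgress",
    "CreateSecurityGroup", "DeleteSecurityGroup",
}
# Only the calls named in the text; a real filter would list more.
IAM_POLICY_EVENTS = {"PutGroupPolicy", "PutUserPolicy", "AttachRolePolicy"}

DETECTORS = {
    "security-group-changes": lambda e: e.get("eventName") in SG_EVENTS,
    "console-signin-failures": lambda e: (
        e.get("eventName") == "ConsoleLogin"
        and e.get("errorMessage") == "Failed authentication"
    ),
    "iam-policy-changes": lambda e: e.get("eventName") in IAM_POLICY_EVENTS,
}


def count_matches(events):
    """Count, per detector, how many events in one evaluation period match."""
    return {
        name: sum(1 for e in events if match(e))
        for name, match in DETECTORS.items()
    }


def firing(events, threshold=1):
    """Names of detectors whose count reaches the threshold, sorted by name."""
    return sorted(n for n, c in count_matches(events).items() if c >= threshold)


period_events = [
    {"eventName": "AuthorizeSecurityGroupIngress"},
    {"eventName": "ConsoleLogin", "errorMessage": "Failed authentication"},
    {"eventName": "ConsoleLogin"},        # a successful sign-in
    {"eventName": "DescribeInstances"},   # read-only noise
]
print(count_matches(period_events))
print(firing(period_events))
```

Adding a fourth detector is one more entry in `DETECTORS`, which mirrors how one more metric filter and alarm pair extends the Terraform above.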

And what takes effect when "it's not enough to ring an alert, but I want to investigate later" is CloudWatch Logs Insights. Against the log group you flowed CloudTrail into, you can list "recent SG changes, with who did it" with a query like this.

```sql
fields @timestamp, userIdentity.arn, eventName, requestParameters.groupId
| filter eventSource = "ec2.amazonaws.com"
| filter eventName like /SecurityGroup/
| sort @timestamp desc
| limit 50
```

This is where CloudWatch shows its real value: more flexibly than CloudTrail's standalone event-history console, you can aggregate and analyze logs with multiple query languages including SQL/PPL (the official documentation describes Logs Insights as "interactive, fast queries on your log data"). If CloudTrail "produces evidence," CloudWatch Logs Insights is the tool that "processes the evidence."

### 4-2. CloudTrail → EventBridge (lean more toward event-driven)

A metric filter is "aggregate then judge by threshold," so it has, at minimum, a delay of the aggregation cycle. If you want "to immediately trigger on a specific API," an EventBridge rule is more straightforward.

```json
{
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["ec2.amazonaws.com"],
    "eventName": ["AuthorizeSecurityGroupIngress"]
  }
}
```

Put SNS or Lambda on this rule's target, and you can send a Slack notification the moment an SG opens, or immediately fire a Lambda that revokes the offending rule. EventBridge, the successor to CloudWatch Events, is best positioned as the piece that handles event-driven detection and automated response.
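
To see why the pattern above selects only this API call, here is a deliberately simplified matcher. It handles just the two constructs the pattern uses, lists of exact values and nested objects; real EventBridge patterns support more operators (prefix matching, `anything-but`, numeric ranges, and others) that this sketch ignores. The function name `matches` and the trimmed sample event are assumptions for illustration.

```python
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["AuthorizeSecurityGroupIngress"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matching: every pattern key must exist in the event and agree."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the same key of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # A list in the pattern means "any one of these exact values".
            return False
    return True


event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "AuthorizeSecurityGroupIngress",
        "sourceIPAddress": "203.0.113.42",  # extra fields are ignored by the pattern
    },
}
print(matches(PATTERN, event))
```

Fields the pattern doesn't mention are ignored, so the rule fires regardless of who made the call. Narrowing to specific principals or CIDRs means adding those keys under `detail`.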

### 4-3. Config Rule + auto-remediation

The last detection option is Config's auto-remediation. Config Rules can not only judge "noncompliant" but, combined with SSM Automation, also **auto-remediate** (the official Remediation feature). For example, you can automatically revoke the offending rule when an SG opens to `0.0.0.0/0:22`.

Let me organize the 3 detection paths.

| Path | Reaction speed | Suited use |
| --- | --- | --- |
| CloudTrail → CloudWatch Logs metric filter + alarm | A delay of the aggregation cycle | Count-based alerts, dashboarding, when you want to integrate with existing CloudWatch operation |
| CloudTrail → EventBridge rule | Fast (event-driven) | When you want to immediately trigger a notification/auto-response on a specific API |
| AWS Config Rule + remediation | Depends on the evaluation timing | When you want to capture it as a "compliant/noncompliant" state, auto-remediate, or use it for compliance reporting |

There's no single correct answer to "which to use for detection." **If you want immediacy, EventBridge; if you want to speak in state and compliance, Config; if you want to ride existing CloudWatch dashboard/alarm operation, the metric filter.** In a case like a payment platform where "you want to notice suspicious operations in seconds, and also prove the configuration's compliance state later," using both immediate detection with EventBridge and compliance evaluation with Config was the standard play.

### 4-4. Fix the "assignment" as design

What worked in practice was fixing the roles of the three not as a verbal understanding but as a **division-of-labor table in the design document.** When I designed the reliability layer of the serverless payment platform, I decided one by one "which service is the master of record" for each monitoring, audit, and compliance concern. For example, an assignment like this.

- Execution logs, error rate, execution time of the payment API (Lambda) → **CloudWatch** (metrics + Logs. "Who" isn't needed here; what's needed is the behavior)
- IAM role changes, creation/deletion of production resources, no-MFA operations → **CloudTrail** (who, when. Immediate alerts with EventBridge)
- Compliance state of encryption-required S3/DynamoDB, SGs that must not be public → **AWS Config** (continuous evaluation with Config Rules, notify on noncompliant)

Assign it this way, and when a new monitoring requirement arrives, its home is decided just by asking "is this a behavior, an operation, or a configuration?" Conversely, if you leave the assignment vague and put "everything in CloudWatch for now," you end up forcing CloudWatch's log search to stand in for operation logs and configuration history, so investigation is slow and the bill grows for nothing. The division of labor is worth writing down at the very start of the design.
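
One lightweight way to keep that table honest is to hold it as data next to the design document. The sketch below is one possible shape, not a prescribed format; the `Concern` enum, the `assign` helper, and the shortened requirement labels are hypothetical.

```python
from enum import Enum


class Concern(Enum):
    BEHAVIOR = "behavior"            # how the system is running
    OPERATION = "operation"          # who called which API
    CONFIGURATION = "configuration"  # what state a resource is in


MASTER_OF_RECORD = {
    Concern.BEHAVIOR: "CloudWatch",
    Concern.OPERATION: "CloudTrail",
    Concern.CONFIGURATION: "AWS Config",
}

# Requirements shortened from the assignment list above, each tagged by hand.
REQUIREMENTS = [
    ("payment API error rate and execution time", Concern.BEHAVIOR),
    ("IAM role changes and no-MFA operations", Concern.OPERATION),
    ("S3/DynamoDB encryption compliance", Concern.CONFIGURATION),
]


def assign(requirements):
    """Map each requirement to its master-of-record service."""
    return {name: MASTER_OF_RECORD[concern] for name, concern in requirements}


for name, service in assign(REQUIREMENTS).items():
    print(f"{name} -> {service}")
```

The only judgment call left for a new requirement is its tag; the destination follows from the table.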

## 5. The billing models of the three and the usage-distinction judgment

Why is "everything on" dangerous? Because the three services bill on completely different axes. Get this wrong and you keep paying for something that delivers little. Understand each service's **billing axis (dimension)** accurately. Concrete unit prices vary by region and over time, so always check the latest on each Pricing page; this article shows what you're billed for, not the unit price.

### CloudTrail's billing axis

In each region, delivery of **the first copy of management events is free.** Billing starts when you deliver the same events to additional trails, record data events (S3 object-level operations, Lambda invocations, and other high-frequency activity), or use CloudTrail Lake. Storage in the destination S3 bucket is charged separately. In short, "one trail taking only management events" is nearly free, while recording data events broadly raises the bill quickly.

> CloudTrail's cost optimization (how to narrow data events, avoiding the duplicate billing of multiple trails, where to use Lake) is dug into in the sister article [CloudTrail Pricing and Cost-Optimization Guide](/blog/aws-cloudtrail-pricing-cost-optimization-guide). This article stays at a comparison of the three's billing axes.

### CloudWatch's billing axis

CloudWatch's billing axis is divided per monitoring feature ([Amazon CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/)).

- **Metrics**: the number of custom metrics (per month), API requests like `PutMetricData`/`GetMetricData`
- **Logs**: the 3 axes of ingestion volume (GB) / storage volume (GB·month) / the data volume scanned by Logs Insights queries (GB)
- **Alarms**: the number of alarm metrics (per month). High-resolution alarms are billed extra
- **Dashboards**: the number of custom dashboards (auto dashboards are free)

There's also a free tier (basic metrics, 10 custom metrics, 3 dashboards, 10 alarm metrics, 5 GB of logs, etc.). The typical way cost balloons in CloudWatch is log ingestion volume. Send every verbose application log in and ingestion charges add up quickly, so the choice of log level and retention period directly affects the bill. Note that, as seen in §4, when you deliver CloudTrail to CloudWatch Logs, those CloudTrail logs are also subject to CloudWatch-side ingestion and storage billing. If the purpose is "to store audit logs cheaply for the long term," deliver from CloudTrail to S3; if it is "to search and alert on recent activity," deliver to CloudWatch Logs. Separating delivery destinations by purpose keeps waste down.
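To see which of the three Logs axes dominates, a rough breakdown like the following can help. It is a sketch, not a pricing calculator: the class `LogsUnitPrices` and the function `logs_monthly_cost` are hypothetical names, the unit prices have no defaults on purpose (look them up on the Pricing page for your region), and the free tier and any tiered pricing are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogsUnitPrices:
    """Unit prices taken from the Pricing page for your region (no defaults on purpose)."""
    per_gb_ingested: float
    per_gb_month_stored: float
    per_gb_scanned: float


def logs_monthly_cost(ingested_gb, stored_gb_month, scanned_gb, prices):
    """Return the monthly cost broken down by the three Logs billing axes."""
    breakdown = {
        "ingestion": ingested_gb * prices.per_gb_ingested,
        "storage": stored_gb_month * prices.per_gb_month_stored,
        "query_scan": scanned_gb * prices.per_gb_scanned,
    }
    # The right-hand side is evaluated before "total" is added, so it sums the three axes only.
    breakdown["total"] = sum(breakdown.values())
    return breakdown
```

Running it once with the current log level and once with the reduced one makes the effect of log-level and retention decisions visible per axis, instead of as one opaque monthly figure.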

### AWS Config's billing axis

AWS Config is billed on 3 axes ([AWS Config Pricing](https://aws.amazon.com/config/pricing/)).

- **The number of recorded configuration items**: a configuration item is recorded each time a resource changes, billed by that count
- **The number of Config Rule evaluations**: billed each time a rule evaluates a resource
- **The number of conformance pack evaluations**: billed by the number of evaluations by rules in the pack

In addition, the S3 bucket that stores configuration history, SNS for notifications, and Lambda for custom rules are charged separately at their standard rates. The typical ways cost balloons in Config are "including frequently changing resources in the recording target" and "having many rules." The recorded resource types can be narrowed, so the iron rule is to limit them to those important for governance.
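One way to apply that iron rule is to rank resource types by how many configuration items they would generate and mark which ones governance actually needs. The sketch below assumes one configuration item per change; the function name `rank_recording_candidates` and the change counts in the usage example are illustrative, not measurements.

```python
def rank_recording_candidates(changes_per_month, governance_critical):
    """Rank resource types by recorded configuration items, flagging what to keep.

    changes_per_month: dict of resource type -> expected changes per month
                       (one configuration item is assumed per change).
    governance_critical: set of resource types that must stay recorded.
    Returns a list of (resource_type, items_per_month, keep), highest volume first.
    """
    ranked = sorted(changes_per_month.items(), key=lambda kv: kv[1], reverse=True)
    return [(rtype, items, rtype in governance_critical) for rtype, items in ranked]


# Illustrative numbers only.
candidates = rank_recording_candidates(
    {
        "AWS::EC2::SecurityGroup": 40,
        "AWS::EC2::NetworkInterface": 9000,
        "AWS::S3::Bucket": 15,
    },
    governance_critical={"AWS::EC2::SecurityGroup", "AWS::S3::Bucket"},
)
```

Entries at the top of the list with `keep` set to `False` are the first candidates for exclusion from the recording target.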

### The billing axes on one sheet

| Service | Main billing axis | Typical case where cost balloons | The free/low-cost usable range |
| --- | --- | --- | --- |
| CloudTrail | Management events (the first copy free) / data events / Lake / delivery-destination S3 | Taking high-frequency data events broadly | Taking management events with a single trail (effectively nearly free, only the S3 fee) |
| CloudWatch | The number of custom metrics / log ingestion, storage, query / the number of alarms / the number of dashboards | Mass ingestion of verbose logs | The free tier (basic metrics, a few alarms/dashboards, 5GB logs) |
| AWS Config | The number of recorded configuration items / the number of rule evaluations / the number of conformance pack evaluations | Recording frequently-changing resources, a large number of rules | Limit the recording target to important resources, narrow the rules |

The decision guideline is simple. **Keeping a record of "who did what" is nearly free with management-event CloudTrail, so it is on by principle.** Narrow CloudWatch to "the metrics you want to measure and the logs you truly search." Narrow Config to "the resources whose compliance you need to prove." Rather than turning everything on at maximum granularity, carving out only the range each purpose (audit, observability, compliance) needs is the only way to get both low cost and usefulness.

## 5-2. Instant answers to common questions

Here are brief answers, based on the content so far, to the questions people search for most often.

**Q. Is enabling CloudTrail alone enough for monitoring?**
No. CloudTrail is the audit log of "who called which API"; it does not measure performance such as CPU, latency, or error rate, and it does not collect app logs. To visualize performance and operational state, you need CloudWatch. Treat CloudTrail firmly as an "audit" service, not a "monitoring" service.

**Q. If I send CloudTrail to CloudWatch Logs, isn't Config unnecessary too?**
No, they are different things. CloudTrail delivered to CloudWatch is still "a record of operations," and it cannot tell you "what configuration a resource is in now and whether it is compliant." Compliance evaluation, configuration history, and relationships between resources are features unique to Config and cannot be replaced by CloudWatch.

**Q. Does AWS Config also tell me "who changed it"?**
It doesn't. Config records "what changed and how" as a configuration item but does not hold the actor (who). For "who," you need to cross-check CloudTrail's `userIdentity`. Only by using the two as a set does "who changed it into what configuration" come together.
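That cross-check can be sketched as a function over already-downloaded CloudTrail records. `eventTime`, `eventName`, `userIdentity`, and `requestParameters` are fields of the CloudTrail record format; everything else is an assumption for illustration: the function name `find_actors`, the 15-minute window, taking the change time from the Config configuration item as a naive UTC `datetime`, and matching the resource by searching for its ID anywhere in `requestParameters`.

```python
import json
from datetime import datetime, timedelta


def find_actors(cloudtrail_records, resource_id, changed_at, window_minutes=15):
    """Return (eventTime, eventName, actor ARN) for events that mention the resource
    within `window_minutes` before or after the configuration change time."""
    window = timedelta(minutes=window_minutes)
    matches = []
    for record in cloudtrail_records:
        event_time = datetime.strptime(record["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
        if abs(event_time - changed_at) > window:
            continue
        # Simplification: look for the resource ID anywhere in requestParameters.
        if resource_id not in json.dumps(record.get("requestParameters") or {}):
            continue
        actor = record.get("userIdentity", {}).get("arn", "unknown")
        matches.append((record["eventTime"], record["eventName"], actor))
    return matches
```

Config supplies the "when" and the "what" (the resource ID and the time of the change), and this lookup fills in the "who" from CloudTrail. In a real investigation the same join is usually done in a query tool rather than in a script, but the logic is the same.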

**Q. Cost-wise, what should I enable first, at minimum?**
Start with CloudTrail for management events (with one trail per region it is effectively nearly free, costing only the S3 fee) and CloudWatch narrowed to the metrics and alarms you truly need. Config is more cost-effective to introduce later, with the target resources narrowed, once a requirement to prove compliance arises.

**Q. CloudTrail's event history is viewable for 90 days, so why is a Trail needed?**
Event history is "management events only, 90 days, a free viewing window" and is not permanent storage. Storage beyond 90 days, recording of data events, continuous delivery to S3, integrity verification, and CloudWatch Logs/EventBridge integration all presume a trail. For details, see the pillar article.

## 6. Summary: a use-case cheat sheet

Finally, here is one sheet to open when you're unsure in the field. The trick is to look up the service from the question you are asking.

| What you want to do | The first choice | Note |
| --- | --- | --- |
| Know "who" changed/deleted it (audit) | CloudTrail | Identify the subject with userIdentity. Management events on by principle |
| Long-term store/analyze the operation trail | CloudTrail (Trail → S3 / Lake) | 90-day event history is a viewing window. Storage presumes a Trail |
| Measure performance, latency, error rate | CloudWatch (metrics) | CloudTrail/Config don't measure performance |
| Search/aggregate app/system logs | CloudWatch Logs (Logs Insights) | Operation logs are CloudTrail, configuration history is Config |
| Notify/auto-respond on a threshold breach | CloudWatch alarms | Suited to count-based detection |
| Immediately trigger a response on a specific API | EventBridge | With CloudTrail as input, low-latency auto-response |
| Know what configuration a resource is in now | AWS Config | A snapshot of the configuration item |
| Follow when and how a setting changed | AWS Config (configuration history timeline) | The state transition over time, in contrast to CloudTrail's point-in-time events |
| Judge compliance with company rules / regulations | AWS Config Rules / conformance packs | Auto-remediation of noncompliant resources is also possible |
| Grasp resource dependencies | AWS Config (relationships) | A map of SG↔EC2↔EBS, etc. |
| Trace a request in a distributed system | AWS X-Ray | App observability is a separate piece |
| See network flows | VPC Flow Logs | L3/L4 flow recording |
| Detect threats / suspicious behavior | Amazon GuardDuty | Threat detection with CloudTrail etc. as input |

Compressed into three lines, it comes down to this.

- **Audit, "who" → CloudTrail**
- **Performance, logs, alarms → CloudWatch**
- **Configuration compliance, drift → AWS Config**

A design that divides these three correctly doesn't leave you wondering "where should I look" on the night of an incident. Identify the culprit with CloudTrail, assess the damage and compliance with Config, and notice the change with CloudWatch/EventBridge: by using each service for its strength, you can reconstruct the incident from three angles. If you want to design monitoring as a whole, including app-layer observability (distributed tracing, OpenTelemetry, the SRE perspective), reading [Observability & SRE Practice with OpenTelemetry on ECS](/blog/aws-observability-opentelemetry-sre-ecs) alongside this article connects control-plane audit and app-layer observability into a single line. If you are designing defense in depth at the network boundary, [The Defense-in-Depth Guide with AWS WAF and Cloud Armor (OWASP)](/blog/waf-defense-in-depth-aws-waf-cloud-armor-owasp-guide) is also a useful reference.

---

Designing so that audit, observability, and configuration compliance are each assigned to the right service is unglamorous but effective. The speed of incident response, the effort of responding to audits, and the monthly cloud cost are all determined by the quality of this division of labor. When I designed and led the reliability layer of a serverless payment platform and maintained zero double charges in production, I worked through exactly this question of "where to record what, and where to detect it," one item at a time.

If, in your own AWS environment, you are thinking "I want to take inventory of our audit-log, observability, and governance (configuration compliance) design once" or "everything is on, the cost keeps ballooning, and yet the record I need isn't there when it matters," please get in touch via [Contact](/contact). I'll sort out the roles of the three services and design, together with you, an audit and observability architecture that is neither excessive nor deficient for its purpose and that is worth its cost.
