友田 陽大
AWS CloudTrail audit & governance
AWS
CloudTrail
CloudWatch
AWS Config
Observability
Architecture design

The Difference Between AWS CloudTrail, CloudWatch, and AWS Config and How to Use Them (2026 Edition): Recording Who, What, and How It's Running with the Right Service

A guide, faithful to the official documentation, to how the roles of CloudTrail (who called which API: audit), CloudWatch (metrics, logs, and alarms: performance and operations), and AWS Config (resource configuration and compliance state) differ. It covers the common misconceptions, how to combine the three, their billing models, and implementation examples, and makes the distinction stick by following a single change across all three services.

Reading time
27 min read
Author
友田 陽大

Overnight, someone opened a production security group to 0.0.0.0/0. When you notice it the next morning, which service's console do you open first?

If you want to know "who did it," open CloudTrail. If you want to know "what is the setting now, and does it violate company rules," open AWS Config. If you are asking "how could we have noticed the moment it changed," that is CloudWatch (and EventBridge). The same incident leaves evidence in all three services, each from a different angle.

If you "just turn everything on" without separating these roles, you land in the worst pattern: costs balloon, yet the record you need is nowhere to be found when it matters. Assign the roles correctly, and audit, observability, and configuration compliance divide the work cleanly, which makes both investigation and remediation faster.

I designed and led the reliability layer of a serverless payment platform (Lambda + DynamoDB) and kept double charges at zero in production. From that experience, I can say that deciding "which service records what" is exactly what determines the speed of incident response and the quality of governance. This article distills the differences among the three services and how to use them, staying faithful to the official documentation, to the point where you will not get lost in the field.

The overall picture of CloudTrail itself (Terraform setup, event types, how to read JSON records, integrity verification, security best practices) is covered in the pillar article, CloudTrail Audit Logging, Governance, and Security Complete Guide. This article focuses on how to tell the three services apart, so refer to that guide for the basic setup.

0. First, fix the three in one line each (mental model)

90% of the confusion comes from starting design work while these three lines are still fuzzy. Memorize them first.

  • CloudTrail = who, when, from where, called what API (the audit log of account activity)
  • CloudWatch = how the system is running now (monitoring / observability via metrics, logs, alarms, dashboards)
  • AWS Config = what configuration a resource is in now, how it changed in the past, and whether it's compliant (snapshots and history of configuration items, rule evaluation)

The official definitions are as follows.

  • AWS Config: "AWS Config provides a detailed view of the configuration of AWS resources in your AWS account. This includes how the resources are related to one another and how they were configured in the past so that you can see how the configurations and relationships change over time." (What Is AWS Config?)
  • Amazon CloudWatch: "Amazon CloudWatch monitors your Amazon Web Services (AWS) resources and the applications you run on AWS in real time, and offers many tools to give you system-wide observability of your application performance, operational health, and resource utilization." (What is Amazon CloudWatch?)
  • AWS CloudTrail: records API and non-API activity in the account as "events," used for operational audit, risk audit, governance, and compliance. The record shows who called which API, when, and from where (What Is AWS CloudTrail?).

An instant-answer table for "which monitoring service should I use?"

The table below maps what you want to know to the service to open.

| The question you want to answer | The service to use |
|---|---|
| Who changed/deleted this resource | CloudTrail |
| When and from where (IP/region) was it operated | CloudTrail |
| Did this API call succeed, or was it denied | CloudTrail |
| How are CPU/memory/latency/error rate trending | CloudWatch (metrics) |
| I want to search/aggregate app or service logs | CloudWatch Logs (Logs Insights) |
| I want to notify / auto-respond when a threshold is exceeded | CloudWatch alarms |
| What configuration is this resource in now | AWS Config |
| How did this setting change compared to 1 month ago | AWS Config (configuration history timeline) |
| Is it compliant with company rules / compliance | AWS Config Rules / conformance packs |
| I want to know the dependencies between resources (SG↔EC2↔EBS) | AWS Config (relationships) |
| I want to trace a request in a distributed system | AWS X-Ray |
| I want to see the VPC's network flows | VPC Flow Logs |
| I want to detect threats / suspicious behavior | Amazon GuardDuty (takes CloudTrail etc. as input) |

If this table sinks in, you have understood 80% of this article. The rest is working through "why they get confused," "how to combine them," and "cost."

The adjacent services in the lower rows of the table also complete the map once you position them relative to the three. X-Ray is distributed tracing that follows requests through an application end to end; it extends CloudWatch's observability down into the app layer. VPC Flow Logs records L3/L4 network flows (which IP talked to which port), a separate layer from CloudTrail's "API operations." GuardDuty is a threat-detection service that takes CloudTrail logs, VPC Flow Logs, and DNS logs as input to find suspicious behavior. In other words, GuardDuty is analysis layered on top of CloudTrail, not a replacement for it. Keep this sense of layers in mind, and you can think in terms of "which layer to thicken" rather than "which one to install."

Abstracting one level further: CloudTrail mainly looks at the control plane (management operations that create, change, and delete resources; with data events enabled, it also covers data-plane access), CloudWatch looks at the runtime behavior of apps and infrastructure, and Config looks at the static state that is a resource's configuration. "Operation," "behavior," "state": remembering the three by these three words is the hardest to forget.

1. The difference in roles in one table

Line the three up on the same axes, and it becomes clear that they complement each other rather than compete.

| Viewpoint | CloudTrail | CloudWatch | AWS Config |
|---|---|---|---|
| The true nature in one phrase | The audit log of account activity | A monitoring/observability platform | The recording and evaluation of resource configuration |
| The question it answers | Who called which API | How the system is running | What configuration it is in now, and whether it is compliant |
| The unit of recording | An event (API/non-API activity) | Metrics, log events | A configuration item |
| The time axis | A record of the moment the operation happened | A continuous time series, log streams | Snapshot history per change |
| Main users | Security, audit, governance | SRE, development, operations | Governance, compliance, operations |
| Typical query target | "Who opened this SG" | "What caused the latency to rise" | "Is this SG a rule violation, and when did it change" |
| Does it measure performance (CPU/latency) | Doesn't measure | Measures (a core feature) | Doesn't measure |
| Does it hold the "subject (who)" of a change | Holds (userIdentity) | Basically doesn't hold | Holds the configuration but not "who" |
| Default retention/delivery | Event history is 90 days, management events only, free. Continuous storage to S3 via a Trail | Metrics and logs are stored (and billed) for the configured retention period | Delivers configuration history to S3, retains the timeline |
| Compliance evaluation feature | None (devotes itself to recording) | None (devotes itself to monitoring) | Has it (Config Rules / conformance packs) |

The catch is that words like "audit" and "governance" get attached to all three, each in its own context. CloudTrail's audit is "the audit of operations," Config's audit is "the audit of configuration," and CloudWatch is not audit at all but "monitoring." Because the same word points to different things, it is more reliable to distinguish the services by their unit of recording than by the label. CloudTrail records events, CloudWatch records metrics and logs, and Config records configuration items. Remember just this and you will rarely go wrong.

The difference between CloudTrail's event history (free, 90 days, management events only, not permanent) and continuous delivery to S3 via a Trail is detailed in the pillar article. For this article, it is enough to remember that event history is a short-term, free viewing window, while continuous storage and analysis require a Trail.

Grasp the outline by "what it doesn't do"

When overlapping features confuse you, checking what each service does not do sharpens the outline. Use it as a reverse lookup so that you can say "that's not that service's job" in a design review.

| Service | What it doesn't do (don't expect this of it) |
|---|---|
| CloudTrail | Measuring performance, latency, error rate. Listing a resource's "current configuration." Evaluating compliant/noncompliant. Collecting an app's standard-output logs |
| CloudWatch | Identifying "who operated" (it doesn't hold subject info). Managing resource-configuration history. Compliance evaluation |
| AWS Config | Identifying "who made that change" (the operation subject is on the CloudTrail side). Measuring performance metrics. Storing app logs |

For example, "I enabled Config, so I'll know who changed it" is a misconception. Config records "what changed" in a configuration item but does not hold the operation subject (who). To know "who," you need CloudTrail's userIdentity. Conversely, "I enabled CloudTrail, so I can take inventory of how many holes are open in the account right now" is also a misconception. That is Config's job, because Config can run queries across resource configurations. Staying conscious of this boundary of "what information each service holds" naturally moves you toward using the three together.

2. Why the three get confused

There are 3 clear reasons the services still get mixed up even when they are logically separate. Put these into words, and team discussions line up quickly.

Reason 1: the word "log" points to 3 different things

When someone says "I want to see the logs," it can point to at least 3 different things.

  • Operation logs (who called what API) → CloudTrail
  • App/system logs (standard output, access logs, Lambda execution logs) → CloudWatch Logs
  • Configuration-change logs (how a resource's settings changed) → AWS Config's configuration history

The moment "log" becomes the subject of a sentence, the conversation falls apart, because each speaker is picturing a different one of these 3. Pinned down with concrete examples, it looks like this.

| What "log" points to | Concrete example | The correct place |
|---|---|---|
| Operation log | "Who called DeleteBucket" | CloudTrail |
| App/system log | "Lambda spat a NullPointerException" | CloudWatch Logs |
| Configuration-change log | "When did this RDS's encryption setting change" | AWS Config configuration history |

In a design meeting, always say "operation log," "app log," or "configuration history" explicitly. This alone removes the gap in understanding and sharply reduces the back-and-forth over "where should I look."

Reason 2: you can flow CloudTrail into CloudWatch Logs

This is the biggest source of confusion. A CloudTrail Trail can deliver not only to S3 but also to a CloudWatch Logs log group. So it is entirely normal for "CloudTrail's logs to appear on the CloudWatch screen," and that is where the misconception "then aren't CloudTrail and CloudWatch the same thing?" comes from.

The correct understanding is this: CloudTrail is the producer of the record, and CloudWatch Logs is merely one of its delivery destinations and analysis platforms. You deliver CloudTrail into CloudWatch Logs so that metric filters and alarms can notify you immediately when a specific operation happens. CloudWatch is not taking over CloudTrail's role. The two are separate services with a point of contact. The implementation of that contact point is shown concretely in §4.

Note: CloudWatch's old "CloudWatch Events" is now provided as Amazon EventBridge. When this article touches on event-driven detection, it refers to EventBridge.

Reason 3: all three are introduced as "usable for security"

Read the official documentation and you will find CloudTrail, Config, and CloudWatch each described as "useful for security analysis." Config's official page also has a "Security Analysis" section, which states explicitly that you can retroactively investigate past IAM policies and SG port-opening states. This "they all help with security" framing leaves a beginner asking, "then what's the difference?"

To sort it out: even when the purpose is security, the angle differs. CloudTrail asks "was there an illegitimate operation," Config asks "is the resource in a dangerous configuration, and what configuration was it in before," and CloudWatch makes it possible to notice an anomaly. The three climb the same mountain called security from different trailheads and share the summit (safe operation). So "which one to enable for security" is not an either-or choice; the right answer is to layer them by purpose.

3. A real example of following one change across the three

Enough abstraction. Let's look concretely at how each of the three records the incident from the introduction: "someone opened a production security group to 0.0.0.0/0." This is the core of this article.

Suppose the operation in question was the following Terraform diff. It could just as well have been someone opening the rule by hand from the console; from CloudTrail's viewpoint, it is the same AuthorizeSecurityGroupIngress API call.

resource "aws_security_group_rule" "ssh_from_anywhere" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = ["0.0.0.0/0"] # should have been the internal CIDR only
  security_group_id = aws_security_group.app.id
}

3-1. CloudTrail: who, when, from where (AuthorizeSecurityGroupIngress)

This operation is recorded in CloudTrail as the EC2 management event AuthorizeSecurityGroupIngress. Look at userIdentity and you can identify exactly "who" it was.

{
  "eventVersion": "1.09",
  "userIdentity": {
    "type": "AssumedRole",
    "principalId": "AROAEXAMPLEID:alice",
    "arn": "arn:aws:sts::123456789012:assumed-role/Developer/alice",
    "accountId": "123456789012",
    "sessionContext": {
      "attributes": { "mfaAuthenticated": "false" }
    }
  },
  "eventTime": "2026-06-26T18:42:11Z",
  "eventSource": "ec2.amazonaws.com",
  "eventName": "AuthorizeSecurityGroupIngress",
  "awsRegion": "ap-northeast-1",
  "sourceIPAddress": "203.0.113.42",
  "requestParameters": {
    "groupId": "sg-0abc123def4567890",
    "ipPermissions": {
      "items": [
        {
          "ipProtocol": "tcp",
          "fromPort": 22,
          "toPort": 22,
          "ipRanges": { "items": [ { "cidrIp": "0.0.0.0/0" } ] }
        }
      ]
    }
  },
  "responseElements": { "_return": true }
}

What you can read from this record is that alice, having assumed the Developer role, opened the SSH port to the world from IP 203.0.113.42 at 2026-06-26 18:42 UTC, and did so without MFA. userIdentity.type is AssumedRole (mind the spelling: documented values include Root, IAMUser, AssumedRole, Role, FederatedUser, AWSService, and IdentityCenterUser; there is no value named IAMIdentityCenter).

This is as far as CloudTrail goes: who, when, from where, which API. It does not answer "so what state is the SG in now?" or "is that a rule violation?" That is Config's territory.

3-2. AWS Config: what configuration it became, and is it a violation (configuration item + rule evaluation)

AWS Config records the same change as a "configuration item." Config's configuration item is officially defined as "a point-in-time view of the various attributes of a supported AWS resource," including metadata, attributes, relationships, the current configuration, and related events (AWS Config terminology and concepts).

In other words, Config retains the configuration itself ("this SG now opens port 22 to 0.0.0.0/0"), and by diffing against earlier configuration items you can follow on a timeline when it changed to that configuration. Where CloudTrail records "the point in time of the operation," Config records "the state of the configuration and how that state evolved."

Here is where Config's distinctive value lies. With Config Rules in effect, this configuration is automatically evaluated as "compliant" or "noncompliant." A representative managed rule that rejects SSH open to the world is restricted-ssh. Configure it in Terraform like this.

resource "aws_config_config_rule" "restricted_ssh" {
  name = "restricted-ssh"

  source {
    owner             = "AWS"
    source_identifier = "INCOMING_SSH_DISABLED" # identifier of the restricted-ssh managed rule
  }

  # Limit evaluation to security groups
  scope {
    compliance_resource_types = ["AWS::EC2::SecurityGroup"]
  }

  depends_on = [aws_config_configuration_recorder.main]
}
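
The rule above depends on aws_config_configuration_recorder.main, which is not defined in this article. A minimal sketch of what that recorder might look like follows. The IAM role (aws_iam_role.config) and the S3 bucket (aws_s3_bucket.config_history) are hypothetical names assumed to be defined elsewhere, and the resource-type list is only an illustration.

```hcl
# Sketch only: aws_iam_role.config and aws_s3_bucket.config_history are
# hypothetical resources assumed to exist elsewhere in the configuration.
resource "aws_config_configuration_recorder" "main" {
  name     = "main"
  role_arn = aws_iam_role.config.arn # role that grants Config read access to resources

  recording_group {
    # Record only the types that matter for this example instead of everything
    all_supported = false
    resource_types = [
      "AWS::EC2::SecurityGroup",
      "AWS::EC2::Instance",
    ]
  }
}

# Where Config delivers configuration history and snapshots
resource "aws_config_delivery_channel" "main" {
  name           = "main"
  s3_bucket_name = aws_s3_bucket.config_history.id

  depends_on = [aws_config_configuration_recorder.main]
}

# The recorder does not record until it is explicitly enabled
resource "aws_config_configuration_recorder_status" "main" {
  name       = aws_config_configuration_recorder.main.name
  is_enabled = true

  depends_on = [aws_config_delivery_channel.main]
}
```

Recording AWS::EC2::Instance alongside the security group is what lets Config show which instances an SG is attached to; depending on the provider version, the recording_group block may need additional arguments, so treat this as a starting point.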

In this state, when someone opens an SG to 0.0.0.0/0 on port 22, Config marks that SG NON_COMPLIANT as soon as the rule evaluates the recorded change. This is the behavior the official documentation describes: "When AWS Config detects that a resource violates the conditions in one of your rules, AWS Config flags the resource as noncompliant and sends a notification."

When you want to take inventory of "how many SGs in my account are open to the world right now," you can query across resources with Config's Advanced Query (a SQL-like syntax).

SELECT
  resourceId,
  resourceName,
  configuration.ipPermissions
WHERE
  resourceType = 'AWS::EC2::SecurityGroup'
  AND configuration.ipPermissions.ipRanges = '0.0.0.0/0'

CloudTrail can tell you "who," but it cannot tell you "how many holes are open across the whole account right now." That is Config's domain alone.

Config also records the relationships between resources. As the official documentation puts it, "AWS Config discovers AWS resources in your account and then creates a map of relationships between AWS resources," so you can trace, for example, "which EC2 instances is this SG attached to." That lets you see the blast radius at a glance: which production instances the open SG actually exposes. No single CloudTrail event gives you that.

Open the configuration-history timeline, and you can see the state transition over time: "this SG allowed only the internal CIDR until yesterday, but at 18:42 it changed to a configuration that includes 0.0.0.0/0." CloudTrail's "point in time of the operation" and Config's "state of the configuration and its evolution" become a complete story only when you cross-check them: who (CloudTrail), at what time and into what configuration (Config's timeline), exposing which instances (Config's relationships).

3-3. CloudWatch / EventBridge: how to notice in that moment

CloudTrail and Config give you "evidence you can follow later," but what really matters in incident response is noticing the moment the change happens. This is where CloudWatch (and EventBridge) comes in.

There are 2 paths.

  1. CloudTrail → CloudWatch Logs → metric filter → alarm: deliver CloudTrail into CloudWatch Logs, create a metric filter that increments a counter whenever an event matches AuthorizeSecurityGroupIngress, and have an alarm notify you when the count rises above 0.
  2. EventBridge rule: use CloudTrail's API-call events as the event source and directly trigger a notification or a Lambda function when a specific API name matches.

The full implementation of the CloudWatch path is shown in §4. What matters here is the division of labor. CloudWatch handles "noticing" (detection, notification, visualization), and CloudTrail and Config handle "leaving evidence." For the same incident, Config shows it statically as "noncompliant," while CloudWatch shouts dynamically that "it just happened."

To summarize so far, the three divide the work on this one SG-opening incident as follows.

| Service | What it leaves / does against this accident |
|---|---|
| CloudTrail | Who (alice/Developer role, no MFA), when, from which IP, with which API opened it |
| AWS Config | The SG's new configuration, the noncompliant judgment by the restricted-ssh rule, the timeline of when it became that configuration |
| CloudWatch / EventBridge | The metric-filter firing and alarm notification at the moment it was opened, the trigger for auto-response |

Try to replace the three with just one, and some row of this table always goes blank. That is why they are complementary.

Connected into a single line of investigation, the incident reads like this. CloudWatch (or EventBridge) shouts "the SG opened at 18:42." CloudTrail points to the actor: "it was alice in the Developer role, without MFA, from 203.0.113.42." Config reports the damage and the history: "that SG is now noncompliant with restricted-ssh, exposes 3 production instances, and was not in this configuration until yesterday." Only when each of the three plays to its strength does the full account of the incident come together: when, who, what, into what configuration, and with how much impact. A design that leans on a single service always leaves a gap somewhere in that sentence.

4. Combination design (where to place detection)

Once you understand the roles, implementation comes next. This section shows a "detect a change" design using the most common pattern: starting from CloudTrail and connecting it to CloudWatch.

4-1. CloudTrail → CloudWatch Logs metric filter + alarm

First, deliver the Trail to a CloudWatch Logs log group (this assumes you already have S3 delivery set up separately; for the basic setup, see the pillar article). Then attach a metric filter and an alarm on top of it.

# 1) Log group that receives the CloudTrail logs
resource "aws_cloudwatch_log_group" "trail" {
  name              = "/aws/cloudtrail/security-events"
  retention_in_days = 365
}

# 2) Metric filter that counts SG changes
#    Matches AuthorizeSecurityGroupIngress / Egress, RevokeSecurityGroup*, and Create/DeleteSecurityGroup
resource "aws_cloudwatch_log_metric_filter" "sg_changes" {
  name           = "security-group-changes"
  log_group_name = aws_cloudwatch_log_group.trail.name

  pattern = <<-PATTERN
    { ($.eventName = AuthorizeSecurityGroupIngress) ||
      ($.eventName = AuthorizeSecurityGroupEgress) ||
      ($.eventName = RevokeSecurityGroupIngress) ||
      ($.eventName = RevokeSecurityGroupEgress) ||
      ($.eventName = CreateSecurityGroup) ||
      ($.eventName = DeleteSecurityGroup) }
  PATTERN

  metric_transformation {
    name          = "SecurityGroupEventCount"
    namespace     = "CloudTrailMetrics"
    value         = "1"
    default_value = "0"
  }
}

# 3) Alarm that notifies as soon as a single event occurs
resource "aws_cloudwatch_metric_alarm" "sg_changes" {
  alarm_name          = "security-group-changes"
  namespace           = "CloudTrailMetrics"
  metric_name         = "SecurityGroupEventCount"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  threshold           = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
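
The three resources above assume that a Trail already writes into the log group and that an SNS topic named security_alerts exists. As a rough sketch of that wiring, something like the following could work. The role and trail names are illustrative, and aws_s3_bucket.trail is a hypothetical bucket assumed to carry the bucket policy CloudTrail requires.

```hcl
# Trust policy: let CloudTrail assume the delivery role
data "aws_iam_policy_document" "trail_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["cloudtrail.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "trail_to_cwl" {
  name               = "cloudtrail-to-cloudwatch-logs" # illustrative name
  assume_role_policy = data.aws_iam_policy_document.trail_assume.json
}

# Permissions: create log streams and write events into the log group only
data "aws_iam_policy_document" "trail_to_cwl" {
  statement {
    actions   = ["logs:CreateLogStream", "logs:PutLogEvents"]
    resources = ["${aws_cloudwatch_log_group.trail.arn}:*"]
  }
}

resource "aws_iam_role_policy" "trail_to_cwl" {
  name   = "write-trail-events"
  role   = aws_iam_role.trail_to_cwl.id
  policy = data.aws_iam_policy_document.trail_to_cwl.json
}

# Trail that delivers to S3 and, in addition, to CloudWatch Logs
resource "aws_cloudtrail" "main" {
  name                       = "main"
  s3_bucket_name             = aws_s3_bucket.trail.id # hypothetical bucket, defined elsewhere
  is_multi_region_trail      = true
  cloud_watch_logs_group_arn = "${aws_cloudwatch_log_group.trail.arn}:*"
  cloud_watch_logs_role_arn  = aws_iam_role.trail_to_cwl.arn
}

# Topic the alarm publishes to; subscriptions are left out of this sketch
resource "aws_sns_topic" "security_alerts" {
  name = "security-alerts"
}
```

Keeping the delivery role limited to the two logs actions on this one log group keeps the contact point between the two services narrow.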

Detecting security-group changes is one of the 3 examples AWS officially lists for CloudTrail × CloudWatch metric filters. The official examples are (1) security-group changes, (2) console sign-in failures, and (3) IAM policy changes. Note that standard rules such as "detect root-account use" come from the CIS AWS Foundations Benchmark, which is a different source from this basic AWS tutorial, so do not conflate the two and claim that the official tutorial illustrates 4.

In other words, with the same mechanism you can alert on brute-force login attempts by counting ConsoleLogin events whose errorMessage is "Failed authentication," and on privilege changes by counting IAM calls such as PutGroupPolicy, PutUserPolicy, and AttachRolePolicy. All three follow the same pattern: count when a specific management event arrives in the log group, then fire at a threshold. Build the SG example once and you can replicate the rest.
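
To illustrate how the same pattern carries over, here is a sketch of the console sign-in failure variant. It reuses the log group and SNS topic referenced earlier; the metric name and the threshold of 3 failures per 5 minutes are assumptions to tune for your environment.

```hcl
# Count failed console sign-ins recorded by CloudTrail
resource "aws_cloudwatch_log_metric_filter" "console_signin_failures" {
  name           = "console-signin-failures"
  log_group_name = aws_cloudwatch_log_group.trail.name

  pattern = "{ ($.eventName = ConsoleLogin) && ($.errorMessage = \"Failed authentication\") }"

  metric_transformation {
    name          = "ConsoleSigninFailureCount" # illustrative metric name
    namespace     = "CloudTrailMetrics"
    value         = "1"
    default_value = "0"
  }
}

# Alarm when failures accumulate within one evaluation period
resource "aws_cloudwatch_metric_alarm" "console_signin_failures" {
  alarm_name          = "console-signin-failures"
  namespace           = "CloudTrailMetrics"
  metric_name         = "ConsoleSigninFailureCount"
  statistic           = "Sum"
  period              = 300
  evaluation_periods  = 1
  threshold           = 3 # assumed threshold; a single typo should not page anyone
  comparison_operator = "GreaterThanOrEqualToThreshold"
  treat_missing_data  = "notBreaching"
  alarm_actions       = [aws_sns_topic.security_alerts.arn]
}
```

Only the pattern, the metric name, and the threshold differ from the SG example, which is what makes the approach easy to replicate.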

fields @timestamp, userIdentity.arn, eventName, requestParameters.groupId
| filter eventSource = "ec2.amazonaws.com"
| filter eventName like /SecurityGroup/
| sort @timestamp desc
| limit 50

This is where CloudWatch shows its real value: you can aggregate and analyze logs far more flexibly than in CloudTrail's event-history console, using several query languages including SQL and PPL (the official documentation describes Logs Insights as "interactive, fast queries on your log data"). If CloudTrail "produces the evidence," CloudWatch Logs Insights is the tool that "processes the evidence."

4-2. CloudTrail → EventBridge (lean more toward event-driven)

A metric filter aggregates first and then compares against a threshold, so it always carries at least the delay of the aggregation period. If you want to trigger immediately on a specific API call, an EventBridge rule is more direct.

{
  "source": ["aws.ec2"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["ec2.amazonaws.com"],
    "eventName": ["AuthorizeSecurityGroupIngress"]
  }
}

Set SNS or Lambda as the target of this rule, and you can send a Slack notification the moment an SG opens, or immediately invoke a Lambda function that revokes the rule automatically. It helps to think of EventBridge as the piece of the CloudWatch family that handles event-driven detection and automated response.
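
Expressed in Terraform, the rule and an SNS target might look like the sketch below. It assumes the aws_sns_topic.security_alerts topic referenced in §4-1 and a trail that records management events in the region, since "AWS API Call via CloudTrail" events depend on one; the rule name and target ID are illustrative.

```hcl
# EventBridge rule carrying the same pattern as the JSON above
resource "aws_cloudwatch_event_rule" "sg_ingress_opened" {
  name        = "sg-ingress-opened" # illustrative name
  description = "Fires when AuthorizeSecurityGroupIngress is called"

  event_pattern = jsonencode({
    source        = ["aws.ec2"]
    "detail-type" = ["AWS API Call via CloudTrail"]
    detail = {
      eventSource = ["ec2.amazonaws.com"]
      eventName   = ["AuthorizeSecurityGroupIngress"]
    }
  })
}

# Send matching events to the security SNS topic
resource "aws_cloudwatch_event_target" "sg_ingress_to_sns" {
  rule      = aws_cloudwatch_event_rule.sg_ingress_opened.name
  target_id = "notify-security"
  arn       = aws_sns_topic.security_alerts.arn
}

# Allow EventBridge to publish to the topic
data "aws_iam_policy_document" "events_publish" {
  statement {
    actions   = ["sns:Publish"]
    resources = [aws_sns_topic.security_alerts.arn]
    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_sns_topic_policy" "security_alerts" {
  arn    = aws_sns_topic.security_alerts.arn
  policy = data.aws_iam_policy_document.events_publish.json
}
```

Note that aws_sns_topic_policy replaces the whole topic policy, so if the topic also receives CloudWatch alarm notifications, the statement for that publisher would need to live in the same policy document.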

4-3. Config Rule + auto-remediation

The last detection option is Config's auto-remediation. Config Rules can not only judge a resource "noncompliant" but also, combined with SSM Automation, remediate it automatically (the official Remediation feature). For example, you can have the offending rule revoked automatically when an SG opens port 22 to 0.0.0.0/0.

The table below organizes the 3 detection paths.
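
As a sketch of what such a remediation could look like in Terraform: the example below attaches an SSM Automation runbook to the restricted-ssh rule from §3-2. It assumes the AWS-owned runbook AWS-DisablePublicAccessForSecurityGroup with a GroupId parameter, so verify the runbook name and its parameters in the Systems Manager console before relying on it. aws_iam_role.remediation is a hypothetical role that SSM Automation would assume, and the retry values are arbitrary.

```hcl
# Sketch only: runbook name and parameters are assumptions to verify;
# aws_iam_role.remediation is a hypothetical role defined elsewhere.
resource "aws_config_remediation_configuration" "restricted_ssh" {
  config_rule_name = aws_config_config_rule.restricted_ssh.name
  resource_type    = "AWS::EC2::SecurityGroup"
  target_type      = "SSM_DOCUMENT"
  target_id        = "AWS-DisablePublicAccessForSecurityGroup"

  # Role the automation runs as
  parameter {
    name         = "AutomationAssumeRole"
    static_value = aws_iam_role.remediation.arn
  }

  # Pass the noncompliant security group's ID to the runbook
  parameter {
    name           = "GroupId"
    resource_value = "RESOURCE_ID"
  }

  automatic                  = true # remediate without manual approval
  maximum_automatic_attempts = 3
  retry_attempt_seconds      = 60
}
```

Setting automatic to false first and triggering the remediation by hand is a reasonable way to check what the runbook actually changes before letting it act on production security groups.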

| Path | Reaction speed | Suited use |
|---|---|---|
| CloudTrail → CloudWatch Logs metric filter + alarm | A delay of the aggregation cycle | Count-based alerts, dashboarding, when you want to integrate with existing CloudWatch operation |
| CloudTrail → EventBridge rule | Fast (event-driven) | When you want to immediately trigger a notification/auto-response on a specific API |
| AWS Config Rule + remediation | Depends on the evaluation timing | When you want to capture it as a "compliant/noncompliant" state, auto-remediate, or use it for compliance reporting |

There is no single correct answer to "which one to use for detection." If you want immediacy, use EventBridge; if you want to talk in terms of state and compliance, use Config; if you want to build on existing CloudWatch dashboards and alarms, use the metric filter. In a case like a payment platform, where you want to notice suspicious operations within seconds and also prove the compliance state of the configuration later, the standard approach was to combine immediate detection with EventBridge and compliance evaluation with Config.

4-4. Fix the "assignment" as design

What worked in practice was pinning down the roles of the three not as a verbal understanding but as a division-of-labor table in the design document. When I designed the reliability layer of the serverless payment platform, I decided one by one "which service is the master of record" for each monitoring, audit, and compliance concern. The assignment looked like this, for example.

  • Execution logs, error rate, execution time of the payment API (Lambda) → CloudWatch (metrics + Logs. "Who" isn't needed here; what's needed is the behavior)
  • IAM role changes, creation/deletion of production resources, no-MFA operations → CloudTrail (who, when. Immediate alerts with EventBridge)
  • Compliance state of encryption-required S3/DynamoDB, SGs that must not be public → AWS Config (continuous evaluation with Config Rules, notify on noncompliant)

Assign things this way, and when a new monitoring requirement arrives, its place is decided simply by asking "is this a behavior, an operation, or a configuration?" Conversely, if you put "everything in CloudWatch for now" and leave the assignment vague, you end up forcing CloudWatch's log search to stand in for operation logs and configuration history. Investigation slows down and the bill grows for nothing. The division of labor is worth writing down at the very start of the design.

5. The billing models of the three and the usage-distinction judgment

Why is "everything on" dangerous? Because the three services bill along completely different axes. Get this wrong and you keep paying for something that delivers little value. Understand each service's billing dimensions precisely. Concrete unit prices vary by region and over time, so always check the latest on each Pricing page; this article shows what you are billed for, not the unit price.

CloudTrail's billing axis

The first copy of management events delivered in each region is free. Charges start when you deliver additional copies of the same events to multiple trails, record data events (S3 object-level operations, Lambda invocations, and other high-volume activity), or use CloudTrail Lake. Storage in the destination S3 bucket is billed separately. In short, recording only management events with a single trail is nearly free, while recording data events broadly drives the bill up quickly.

CloudTrail cost optimization (how to narrow data events, how to avoid duplicate billing across multiple trails, when to use Lake) is covered in depth in the sister article, CloudTrail Pricing and Cost-Optimization Guide. This article stays with a comparison of the three billing models.
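
To make "recording data events narrowly" concrete, here is a hedged sketch using advanced event selectors. The trail name is illustrative, and aws_s3_bucket.trail and aws_s3_bucket.payments are hypothetical buckets assumed to be defined elsewhere.

```hcl
# Sketch only: the bucket references are hypothetical resources.
resource "aws_cloudtrail" "scoped" {
  name           = "scoped-data-events" # illustrative name
  s3_bucket_name = aws_s3_bucket.trail.id

  # Keep management events flowing
  advanced_event_selector {
    name = "Management events"

    field_selector {
      field  = "eventCategory"
      equals = ["Management"]
    }
  }

  # Record S3 data events only for writes to one sensitive bucket
  advanced_event_selector {
    name = "Writes to the payments bucket only"

    field_selector {
      field  = "eventCategory"
      equals = ["Data"]
    }

    field_selector {
      field  = "resources.type"
      equals = ["AWS::S3::Object"]
    }

    field_selector {
      field       = "resources.ARN"
      starts_with = ["${aws_s3_bucket.payments.arn}/"]
    }

    field_selector {
      field  = "readOnly"
      equals = ["false"]
    }
  }
}
```

Filtering on bucket and on write operations keeps high-volume reads across the rest of the account out of the billable event count.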

CloudWatch's billing axis

CloudWatch's billing axis is divided per monitoring feature (Amazon CloudWatch Pricing).

  • Metrics: the number of custom metrics (per month), API requests like PutMetricData/GetMetricData
  • Logs: the 3 axes of ingestion volume (GB) / storage volume (GB·month) / the data volume scanned by Logs Insights queries (GB)
  • Alarms: the number of alarm metrics (per month). High-resolution alarms are billed extra
  • Dashboards: the number of custom dashboards (auto dashboards are free)

There is also a free tier (basic metrics, 10 custom metrics, 3 dashboards, 10 alarm metrics, 5 GB of logs, and so on). The typical way costs balloon in CloudWatch is log ingestion volume. Send every verbose app log in and ingestion charges add up, so log-level and retention-period design matters. Note that, as seen in §4, when you deliver CloudTrail into CloudWatch Logs, those CloudTrail logs are also subject to CloudWatch-side ingestion and storage charges. If the goal is to store audit logs cheaply for the long term, deliver from CloudTrail to S3; if the goal is to search and alert on recent activity, deliver to CloudWatch Logs. Separating delivery destinations by purpose makes waste less likely.
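
One way to express that split is sketched below: a short retention on a CloudWatch Logs group used for search and alerting, and an archival lifecycle on the S3 bucket that receives the Trail. The log group name, the bucket reference (aws_s3_bucket.trail), and every retention number are assumptions; your audit requirements decide the real values.

```hcl
# Short-lived copy for search and alerting (assumed 30-day window)
resource "aws_cloudwatch_log_group" "trail_recent" {
  name              = "/aws/cloudtrail/recent" # illustrative name
  retention_in_days = 30
}

# Long-term, low-cost archive of the S3 copy
resource "aws_s3_bucket_lifecycle_configuration" "trail_archive" {
  bucket = aws_s3_bucket.trail.id # hypothetical bucket, defined elsewhere

  rule {
    id     = "archive-audit-logs"
    status = "Enabled"

    filter {} # apply to every object in the bucket

    # Move objects to colder storage once they are unlikely to be queried
    transition {
      days          = 90
      storage_class = "GLACIER"
    }

    # Delete after roughly 7 years (assumed retention requirement)
    expiration {
      days = 2555
    }
  }
}
```

The point of the sketch is the asymmetry: the searchable copy stays small and short-lived, while the cheap copy carries the long retention.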

AWS Config's billing axis

AWS Config is billed on 3 axes (AWS Config Pricing).

  • The number of recorded configuration items: a configuration item is recorded each time a resource changes, billed by that count
  • The number of Config Rule evaluations: billed each time a rule evaluates a resource
  • The number of conformance pack evaluations: billed by the number of evaluations by rules in the pack

In addition, the S3 bucket that stores configuration history, the SNS topic for notifications, and any Lambda functions for custom rules are billed separately at their standard rates. The typical ways costs balloon in Config are including frequently changing resources in the recording scope and running a large number of rules. The recorded resource types can be narrowed, so the golden rule is to limit recording to the resources that matter for governance.

The billing axes on one sheet

| Service | Main billing axis | Typical case where cost balloons | The free/low-cost usable range |
|---|---|---|---|
| CloudTrail | Management events (the first copy free) / data events / Lake / delivery-destination S3 | Taking high-frequency data events broadly | Taking management events with a single trail (effectively nearly free, only the S3 fee) |
| CloudWatch | The number of custom metrics / log ingestion, storage, query / the number of alarms / the number of dashboards | Mass ingestion of verbose logs | The free tier (basic metrics, a few alarms/dashboards, 5GB logs) |
| AWS Config | The number of recorded configuration items / the number of rule evaluations / the number of conformance pack evaluations | Recording frequently-changing resources, a large number of rules | Limit the recording target to important resources, narrow the rules |

The judgment guideline is simple. Recording "who did what" is nearly free with management-event CloudTrail, so turn it on as a matter of principle. Narrow CloudWatch to the metrics you actually want to measure and the logs you actually search. Narrow Config to the resources whose compliance you need to prove. Rather than turning everything on at maximum granularity, carve out only the range each purpose (audit, observability, compliance) needs; that is the only way to get both low cost and usefulness.

5-2. Instant answers to common questions

Let me briefly answer frequently searched questions, using the content so far.

Q. Is CloudTrail alone enough for monitoring? No. CloudTrail is the audit log of "who called which API"; it doesn't measure performance such as CPU, latency, or error rate, and it doesn't collect app logs. To visualize performance and operation, you need CloudWatch. Treat CloudTrail as a service for audit, not monitoring.

Q. If I stream CloudTrail into CloudWatch Logs, isn't Config unnecessary too? No, they are different things. CloudTrail data in CloudWatch is still a record of operations, and it can't tell you what configuration a resource is in now or whether it is compliant. Compliance evaluation, configuration history, and relationships between resources are features inherent to Config and can't be substituted by CloudWatch.

Q. Does AWS Config also tell me "who changed it"? It doesn't. Config records "what changed, and how" as a configuration item but doesn't hold the actor (who). For "who," you need to cross-check userIdentity in CloudTrail. Only by using the two as a set do you get the full picture of who changed it into what configuration.
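The cross-check described in this answer can be sketched as a small join between one configuration item and a list of CloudTrail events. `find_actors` and `parse_utc` are hypothetical helpers, the records are simplified made-up samples, and matching by searching for the resource ID inside requestParameters is a crude heuristic assumed here, not an official correlation method.

```python
import json
from datetime import datetime, timedelta


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp that ends in 'Z' (UTC)."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def find_actors(config_item: dict, trail_events: list[dict],
                window: timedelta = timedelta(minutes=15)) -> list[dict]:
    """Return CloudTrail events shortly before the configuration item was
    captured whose request parameters mention the changed resource ID."""
    captured = parse_utc(config_item["configurationItemCaptureTime"])
    resource_id = config_item["resourceId"]
    actors = []
    for event in trail_events:
        occurred = parse_utc(event["eventTime"])
        if not (captured - window <= occurred <= captured):
            continue
        # Crude match: look for the resource ID anywhere in the request.
        if resource_id not in json.dumps(event.get("requestParameters") or {}):
            continue
        actors.append({
            "who": event["userIdentity"].get("arn"),
            "eventName": event["eventName"],
            "eventTime": event["eventTime"],
        })
    return actors


# Simplified, hypothetical records (real ones carry many more fields).
config_item = {
    "resourceType": "AWS::EC2::SecurityGroup",
    "resourceId": "sg-0123456789abcdef0",
    "configurationItemCaptureTime": "2026-01-10T17:05:30.000Z",
}
trail_events = [{
    "eventTime": "2026-01-10T17:03:12Z",
    "eventName": "AuthorizeSecurityGroupIngress",
    "userIdentity": {"type": "IAMUser", "arn": "arn:aws:iam::111122223333:user/example-user"},
    "requestParameters": {"groupId": "sg-0123456789abcdef0"},
}]
actors = find_actors(config_item, trail_events)
```

The time window is a judgment call: too narrow and you miss the call that caused the change, too wide and unrelated operations on the same resource show up as candidates.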

Q. Cost-wise, what should I set up first, at minimum? Management-event CloudTrail (the first copy of management events is free, so a single trail costs effectively nothing beyond the S3 fee), plus CloudWatch narrowed to the metrics and alarms you truly need. Config is more cost-effective to add, with the target resources narrowed, once the requirement to prove compliance arises.

Q. CloudTrail's event history is viewable for 90 days, so why do I need a Trail? Event history is a free viewing window that covers management events only, for 90 days; it is not permanent storage. Storage beyond 90 days, recording data events, continuous delivery to S3, integrity verification, and CloudWatch Logs/EventBridge integration all presuppose a Trail. For details, see the pillar article.
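The conditions in this answer reduce to a short checklist. The sketch below is only a restatement of them as code; the function and parameter names are hypothetical.

```python
EVENT_HISTORY_DAYS = 90  # the event history window stated above


def reasons_trail_is_needed(
    retention_days: int,
    data_events: bool = False,
    s3_delivery: bool = False,
    integrity_validation: bool = False,
    logs_or_eventbridge: bool = False,
) -> list[str]:
    """List the requirements that go beyond event history.

    An empty list means event history alone covers the stated needs.
    """
    requirements = [
        (retention_days > EVENT_HISTORY_DAYS, f"retention beyond {EVENT_HISTORY_DAYS} days"),
        (data_events, "recording data events"),
        (s3_delivery, "continuous delivery to S3"),
        (integrity_validation, "integrity verification"),
        (logs_or_eventbridge, "CloudWatch Logs/EventBridge integration"),
    ]
    return [reason for needed, reason in requirements if needed]


# Hypothetical requirement: keep one year of logs including S3 object-level access.
reasons = reasons_trail_is_needed(retention_days=365, data_events=True)
```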

6. Summary: a use-case cheat sheet

Finally, here is one sheet to open when you're lost in the field. The trick is to look up the service from the question you are asking.

What you want to do | The first choice | Note
Know "who" changed/deleted it (audit) | CloudTrail | Identify the actor with userIdentity. Management events on by principle
Store and analyze the operation trail long term | CloudTrail (Trail → S3 / Lake) | The 90-day event history is a viewing window. Storage presumes a Trail
Measure performance, latency, error rate | CloudWatch (metrics) | CloudTrail/Config don't measure performance
Search/aggregate app/system logs | CloudWatch Logs (Logs Insights) | Operation logs are CloudTrail, configuration history is Config
Notify/auto-respond on a threshold breach | CloudWatch alarms | Suited to count-based detection
Immediately trigger a response on a specific API | EventBridge | With CloudTrail as input, low-latency auto-response
Know what configuration a resource is in now | AWS Config | A snapshot of the configuration item
Follow when and how a setting changed | AWS Config (configuration history timeline) | State transitions over time, in contrast to CloudTrail's point-in-time events
Judge compliance with company rules and regulatory requirements | AWS Config Rules / conformance packs | Auto-remediation of noncompliant resources is also possible
Grasp resource dependencies | AWS Config (relationships) | A map of SG↔EC2↔EBS, etc.
Trace a request in a distributed system | AWS X-Ray | App observability is a separate piece
See network flows | VPC Flow Logs | L3/L4 flow recording
Detect threats / suspicious behavior | Amazon GuardDuty | Threat detection with CloudTrail etc. as input
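As an illustration of the EventBridge row, the sketch below writes an event pattern for the security group example from the introduction and checks it against a sample event. The pattern follows the shape EventBridge uses for API calls delivered via CloudTrail. The `matches` helper is a hypothetical local simplification that handles only exact-value matching, and the sample event is trimmed to the fields the pattern inspects.

```python
import json

# Event pattern: fire when someone adds an inbound rule to any security group.
SG_INGRESS_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["AuthorizeSecurityGroupIngress"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Exact-value subset of pattern matching: a list means "any of these
    values", a nested dict recurses into the same key of the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Hypothetical, trimmed event of the kind EventBridge receives.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "AuthorizeSecurityGroupIngress",
        "requestParameters": {"groupId": "sg-0123456789abcdef0"},
    },
}

pattern_json = json.dumps(SG_INGRESS_PATTERN)  # the form a rule definition expects
is_match = matches(SG_INGRESS_PATTERN, sample_event)
```

Checking a pattern locally against a captured event like this is a cheap way to catch typos in event names before wiring the rule to a notification or remediation target.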

Compressed all the way down to 3 lines, it comes to this.

  • Audit, "who" → CloudTrail
  • Performance, logs, alarms → CloudWatch
  • Configuration compliance, drift → AWS Config

A design that divides these 3 correctly means you don't get lost over "where should I look" on the night of an incident. CloudTrail for the culprit, Config for the extent of the damage and compliance, CloudWatch/EventBridge for noticing: with each one's strength, you can reconstruct the incident from three angles. If you want to design monitoring as a whole, including app-layer observability (distributed tracing, OpenTelemetry, the SRE perspective), reading Observability & SRE Practice with OpenTelemetry on ECS alongside this article connects control-plane audit and app-layer observability into one line. If you are designing defense in depth at the network boundary, The Defense-in-Depth Guide with AWS WAF and Cloud Armor (OWASP) is also a useful reference.


The design of "assigning audit, observability, and configuration compliance correctly to the right service" is plain but effective. The speed of incident response, the effort of audit response, and the monthly cloud cost are all decided by the quality of this division-of-labor design. While I designed and led the reliability layer of a serverless payment platform and maintained 0 double charges in production, I worked out exactly this "where to record what, and where to detect it," one by one.

If, in your own AWS environment, you're thinking "I want to take inventory of our audit-log, observability, and governance (configuration compliance) design for once" or "everything is turned on, only the cost keeps ballooning, and I suspect the record I need won't be there when it matters," please get in touch via Contact. I'll sort out the roles of the three and design, together with you, an audit/observability architecture that has no excess or deficiency for its purpose and is worth its cost.

友田 陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
