# Safe Lambda deployment: versions, aliases, canary releases (CodeDeploy), and SAM/CDK/Terraform selection

> An implementation guide to safely deploying AWS Lambda with zero downtime. With real code faithful to the AWS official specs, it covers immutable versions and aliases, weighted aliases and CodeDeploy canary/linear delivery, pre/post-traffic hooks and automatic rollback via CloudWatch alarms, waiting on the function state, selecting SAM/CDK/Terraform/Serverless Framework, and keyless CI/CD via GitHub Actions OIDC.

- Published: 2026-06-27
- Author: 友田 陽大
- Tags: AWS, Lambda, CI/CD, Serverless, IaC
- URL: https://tomodahinata.com/en/blog/aws-lambda-deployment-versions-aliases-canary-sam-cdk-terraform-guide
- Category: AWS Lambda in production
- Pillar guide: https://tomodahinata.com/en/blog/aws-lambda-production-guide

## Key points

- A published version is an immutable snapshot of code + config; an alias is a pointer to it. $LATEST is mutable, so call production via an alias.
- Canary releases use weighted aliases (traffic split by weight across at most 2 versions) plus CodeDeploy. Shift traffic gradually with a predefined configuration such as Canary10Percent5Minutes.
- Verify with pre/post-traffic hooks, and roll back automatically when a CloudWatch alarm fires. With SAM, this takes only a few lines: AutoPublishAlias + DeploymentPreference.
- IaC selection: SAM composes safe deploys with minimal config / CDK offers type safety and esbuild bundling / Terraform suits multi-cloud and existing assets / Serverless v4 is paid above $2M annual revenue.
- CI/CD is keyless with GitHub Actions OIDC (id-token:write + AssumeRoleWithWebIdentity). After each update, wait until LastUpdateStatus becomes Successful.

---

"I overwrote production with `update-function-code`, and a defect hit all users at once." Because Lambda can be deployed with a single command, it is easy to end up **shipping without a safety mechanism.** For an API that handles payment confirmation or user operations, letting a new version's defect **hit 100% of traffic immediately** is unacceptable in production.

This article is an implementation guide to **safely deploying AWS Lambda with zero downtime and automatic rollback.** Starting from the foundation of **versions and aliases**, it walks end-to-end through **canary releases**, **automatic rollback**, **IaC (SAM/CDK/Terraform) selection**, and **keyless CI/CD via OIDC.** It also draws on shipping decisions from the [serverless payment platform](/case-studies/payment-platform-reliability) (**0 double charges in production**) that I built as a core developer. The Lambda execution model itself is covered in the sister article [AWS Lambda production-operations guide](/blog/aws-lambda-production-guide); this article concentrates on a **single point: "how to ship safely."**

> **Rules for this article**: specs, parameter names, and predefined configuration names are based on the **AWS official documentation (as of June 2026).** CodeDeploy configuration names, runtimes, and each tool's specs change over time. Always confirm the latest values in the official docs (the "References" at the end) before production rollout.

---

## 0. Mental model: separate "the immutable version" from "the moving pointer"

All of safe deployment begins from separating these two.

- **Version = an immutable snapshot.** Publishing freezes the **code and config** at that point as a numbered version. The number increases monotonically and is **not reused**, even after deletion and re-creation.
- **Alias = a movable pointer to a version.** It points at a specific version by a name like `live` or `prod`. Deploying means **"safely switching the alias's target to the new version."**
- **`$LATEST` is mutable.** It's overwritten on each `update-function-code`. So **don't point production traffic directly at `$LATEST`** — always call via an alias.
- **Canary = don't move the pointer all at once; move it gradually by weight.** Flow only 10% to the new version, and if no problem, go to 100%. If there's a problem, automatically revert.

This "immutable foundation + moving pointer + gradual switch + automatic rollback" is this article's design.

---

## 1. Versions and aliases: call production via an alias

First, build the foundation. Update the code against `$LATEST`, **publish a version once stable**, and **point the alias at that version.** The client calls **the alias's ARN.**

```bash
# 1) Update the code ($LATEST changes; production is not looking at it yet)
aws lambda update-function-code --function-name orders --zip-file fileb://build.zip

# 2) Wait for the update to finish (important: later operations fail until LastUpdateStatus=Successful)
aws lambda wait function-updated-v2 --function-name orders

# 3) Publish an immutable version (it is assigned a number, e.g. 42)
VERSION=$(aws lambda publish-version --function-name orders --query Version --output text)

# 4) Point the alias live at that version. Clients call live
aws lambda update-alias --function-name orders --name live --function-version "$VERSION"
```

Three documented behaviors that bite here:

- **An alias is addressed by a "qualified ARN."** It carries a qualifier (version number or alias name), like `...:function:orders:live`. **An unqualified call implicitly runs `$LATEST`**, which is a common source of production accidents.
- **Provisioned concurrency and SnapStart can be enabled only on a published version/alias** (`$LATEST` not allowed). For latency measures ([cold-start optimization](/blog/aws-lambda-cold-start-snapstart-provisioned-concurrency-performance-guide)) to work, this foundation is a prerequisite.
- **Not every setting is part of a version.** For example, **changing reserved concurrency doesn't create a new version**, because it is a function-wide operational setting.
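
To make the qualified-versus-unqualified distinction concrete, here is a minimal sketch of calling through an alias from Python. The helper names `qualified_arn` and `invoke_via_alias` are hypothetical, and `client` is assumed to be a `boto3.client("lambda")` (or any object exposing the same `invoke` method).

```python
import json


def qualified_arn(region, account_id, function_name, qualifier):
    """Build a qualified ARN; the trailing qualifier is a version number or alias name."""
    return f"arn:aws:lambda:{region}:{account_id}:function:{function_name}:{qualifier}"


def invoke_via_alias(client, function_name, alias, payload):
    """Invoke through a named alias so that $LATEST is never hit by accident.

    `client` is assumed to be boto3.client("lambda"). This helper is a
    hypothetical guard, not part of any AWS SDK.
    """
    if not alias or alias == "$LATEST":
        raise ValueError("production calls must go through a named alias")
    return client.invoke(
        FunctionName=function_name,
        Qualifier=alias,  # omitting Qualifier would run $LATEST
        Payload=json.dumps(payload).encode("utf-8"),
    )
```

A call such as `invoke_via_alias(boto3.client("lambda"), "orders", "live", {"orderId": "123"})` would then always resolve to whatever version `live` currently points at.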

---

## 2. Canary release: flow "only 10%" with a weighted alias

An alias can **point at up to 2 published versions and distribute traffic by weight** (routing config / `AdditionalVersionWeights`). This is **the heart of a canary release.**

```bash
# Alias live: 97% to the current version, 3% to the new version (43). Raise the weight if nothing goes wrong
aws lambda update-alias --function-name orders --name live \
  --function-version 42 \
  --routing-config 'AdditionalVersionWeights={"43"=0.03}'
```
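
The same weight change can be sketched in Python. `canary_routing_config` and `shift_traffic` are hypothetical helper names, and `client` is assumed to be a `boto3.client("lambda")`; the `RoutingConfig` / `AdditionalVersionWeights` shape follows the `update_alias` API.

```python
def canary_routing_config(new_version, weight):
    """Build the RoutingConfig that sends `weight` (0.0 to 1.0) of traffic to new_version."""
    if str(new_version) == "$LATEST":
        raise ValueError("weighted routing requires a published version")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    return {"AdditionalVersionWeights": {str(new_version): weight}}


def shift_traffic(client, function_name, alias, current_version, new_version, weight):
    """Keep the alias on current_version and route `weight` of calls to new_version.

    `client` is assumed to be boto3.client("lambda").
    """
    return client.update_alias(
        FunctionName=function_name,
        Name=alias,
        FunctionVersion=str(current_version),
        RoutingConfig=canary_routing_config(new_version, weight),
    )
```

For the example above, `shift_traffic(client, "orders", "live", 42, 43, 0.03)` corresponds to the CLI call.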

Know the **constraints** AWS imposes; violating them results in an error or an accident.

- **Both versions are published** (**`$LATEST` not allowed**).
- **Both versions' execution roles are identical.**
- **DLQ configuration is identical (or both absent).**
- They are 2 versions of the same function.
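
These constraints can be checked before touching the alias. The sketch below is one possible pre-flight check, not an AWS-provided API: `alias_routing_violations` is a hypothetical name, and its inputs are assumed to be dicts shaped like `get_function_configuration` responses (`FunctionName`, `Version`, `Role`, `DeadLetterConfig`).

```python
def alias_routing_violations(primary, secondary):
    """Return the reasons two version configs cannot share a weighted alias.

    An empty list means none of the checked constraints is violated.
    """
    problems = []
    if any(cfg.get("Version") == "$LATEST" for cfg in (primary, secondary)):
        problems.append("$LATEST cannot be a routing target; publish a version first")
    if primary.get("FunctionName") != secondary.get("FunctionName"):
        problems.append("the versions belong to different functions")
    if primary.get("Role") != secondary.get("Role"):
        problems.append("execution roles differ")
    # A missing DeadLetterConfig and an empty one are both treated as "no DLQ".
    dlq_primary = (primary.get("DeadLetterConfig") or {}).get("TargetArn")
    dlq_secondary = (secondary.get("DeadLetterConfig") or {}).get("TargetArn")
    if dlq_primary != dlq_secondary:
        problems.append("dead-letter queue configurations differ")
    return problems
```

In practice the two dicts would come from `client.get_function_configuration(FunctionName="orders", Qualifier="42")` and the same call with `Qualifier="43"`.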

Raising and lowering weights by hand isn't realistic, so **automate by leaving it to CodeDeploy** (next chapter).

> **Compatibility with provisioned concurrency**: if you want to avoid cold starts during a canary, you can provision additional concurrency just while routing is active (the official documentation mentions this). For APIs with a latency SLA, combine canary and provisioned concurrency.

---

## 3. Canary with automatic rollback: CodeDeploy + SAM

CodeDeploy automatically moves the weighted alias's weight on a **predefined schedule** and **auto-rolls back if a CloudWatch alarm fires.** Manual weight adjustment becomes unnecessary.

### 3.1 Predefined deployment configurations (official names)

| Kind | Setting name (prefixed with `CodeDeployDefault.`) | Behavior |
| --- | --- | --- |
| **Canary** | `LambdaCanary10Percent5Minutes` / `10Minutes` / `15Minutes` / `30Minutes` | flow 10%, then the remaining 90% all at once after the specified minutes |
| **Linear** | `LambdaLinear10PercentEvery1Minute` / `Every2Minutes` / `Every3Minutes` / `Every10Minutes` | increase 10% at a time, gradually |
| **All at once** | `LambdaAllAtOnce` | 100% in one go (no canary) |

> Name note: only the shortest linear is `Every1Minute` (singular), the rest are plural (`Every2Minutes`). In SAM's `DeploymentPreference.Type`, use the **shortened name** with the leading `CodeDeployDefault.Lambda` **removed** (e.g., `Canary10Percent10Minutes`).
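
To see what each predefined name implies, the following sketch expands a SAM short name into its full CodeDeploy name and an idealized traffic timeline. `full_config_name` and `traffic_schedule` are hypothetical helpers; the timeline ignores hook execution time and alarm evaluation, so treat it as an approximation rather than CodeDeploy's exact behavior.

```python
import re

PREFIX = "CodeDeployDefault.Lambda"


def full_config_name(sam_type):
    """Map a SAM DeploymentPreference.Type value to the full CodeDeploy name."""
    return PREFIX + sam_type


def traffic_schedule(sam_type):
    """Return [(minute, percent_on_new_version), ...] for a predefined type."""
    if sam_type == "AllAtOnce":
        return [(0, 100)]
    canary = re.fullmatch(r"Canary(\d+)Percent(\d+)Minutes", sam_type)
    if canary:
        percent, wait_minutes = int(canary.group(1)), int(canary.group(2))
        return [(0, percent), (wait_minutes, 100)]
    linear = re.fullmatch(r"Linear(\d+)PercentEvery(\d+)Minutes?", sam_type)
    if linear:
        step, interval = int(linear.group(1)), int(linear.group(2))
        schedule, percent, minute = [], step, 0
        while percent < 100:
            schedule.append((minute, percent))
            percent += step
            minute += interval
        schedule.append((minute, 100))  # final step completes the shift
        return schedule
    raise ValueError(f"unrecognized deployment type: {sam_type}")
```

For instance, `traffic_schedule("Canary10Percent5Minutes")` yields a 10% step at minute 0 and 100% at minute 5, which matches the "10%, then the rest" description in the table.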

### 3.2 With SAM, "verify + auto-rollback" composes in a few lines

Combine SAM's `AutoPublishAlias` (detects code changes and auto-publishes a version + updates the alias) and `DeploymentPreference` (canary strategy, alarms, hooks), and **safe deploy** can be written **declaratively.**

```yaml
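# template.yaml (AWS SAM): canary + pre/post verification + automatic rollback via alarms
# OrdersLatencyP99Alarm, PreTrafficCheck, and PostTrafficCheck are assumed to be defined elsewhere in the template
Resources:
  OrdersFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs22.x
      Architectures: [arm64]              # 20% lower execution price on Arm64 (if compatible)
      AutoPublishAlias: live              # DeploymentPreference cannot be used without this
      DeploymentPreference:
        Type: Canary10Percent5Minutes     # send 10% for 5 minutes, then shift the rest if healthy
        Alarms:                           # roll back automatically if any one of these enters ALARM
          - !Ref OrdersErrorsAlarm
          - !Ref OrdersLatencyP99Alarm
        Hooks:
          PreTraffic: !Ref PreTrafficCheck   # smoke test before the shift
          PostTraffic: !Ref PostTrafficCheck # integration check after the shift

  # Monitor errors on the alias during the rollout (roll back when the alarm fires)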
  OrdersErrorsAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      Namespace: AWS/Lambda
      MetricName: Errors
      Dimensions:
        - { Name: FunctionName, Value: !Ref OrdersFunction }
        - { Name: Resource, Value: !Sub "${OrdersFunction}:live" }
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      Threshold: 1
      ComparisonOperator: GreaterThanOrEqualToThreshold
```

**Pre/post-traffic hooks** are verification Lambdas that CodeDeploy **calls before and after the traffic switch.** A hook **calls back** its result to CodeDeploy with `PutLifecycleEventHookExecutionStatus`, and **on failure the deployment is aborted and rolled back.** By convention, hook function names start with `CodeDeployHook_`.

```python
# Pre-traffic hook: smoke-test the new version before the shift and report pass/fail to CodeDeploy
import boto3
codedeploy = boto3.client("codedeploy")

def handler(event, context):
    deployment_id = event["DeploymentId"]
    hook_id = event["LifecycleEventHookExecutionId"]
    status = "Succeeded"
    try:
        run_smoke_tests()   # placeholder you implement: hit the new version directly (the alias does not point at it yet)
    except Exception:
        status = "Failed"   # reporting Failed here stops the shift and triggers a rollback
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=deployment_id, lifecycleEventHookExecutionId=hook_id, status=status,
    )
    return {"status": status}
```
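
The hook above leaves `run_smoke_tests` undefined. One possible shape is sketched below, with explicit arguments added for testability. It assumes the hook knows the new version's qualified ARN (for example from an environment variable set in the template, which is an assumption here) and that `lambda_client` is a `boto3.client("lambda")`.

```python
import json


def run_smoke_tests(lambda_client, target_arn, payload):
    """Invoke the not-yet-live version directly and raise if it reports an error.

    target_arn is assumed to be a version-qualified ARN such as
    ...:function:orders:43, so the call bypasses the alias entirely.
    """
    response = lambda_client.invoke(
        FunctionName=target_arn,
        InvocationType="RequestResponse",  # synchronous, so errors come back in the response
        Payload=json.dumps(payload).encode("utf-8"),
    )
    if response.get("FunctionError"):
        raise RuntimeError(f"smoke test failed: {response['FunctionError']}")
    if response.get("StatusCode") != 200:
        raise RuntimeError(f"unexpected status code: {response.get('StatusCode')}")
    body = response["Payload"].read()
    return json.loads(body) if body else None
```

Any exception raised here lands in the hook's `except` branch, so the hook reports `Failed` and the alias never moves.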

> **The first deploy is two-stage**: CodeDeploy needs an existing version to shift traffic away from, so deploy with only `AutoPublishAlias` the first time, then enable `DeploymentPreference` from the second deploy onward.

---

## 4. IaC selection: SAM / CDK / Terraform / Serverless Framework

"Which tool should manage Lambda?" is a question teams usually ask right before committing to one. Choose by **how easily safe deploy can be composed** and by **the team's existing assets.**

| Tool | Safe deploy | Strength | Suited team |
| --- | --- | --- | --- |
| **AWS SAM** | ◎ a few lines of `DeploymentPreference` | serverless-specialized, canary with minimal config | all-in on AWS serverless, want to ship safely fastest |
| **AWS CDK** | ◎ `LambdaDeploymentGroup` | type-safe IaC, `NodejsFunction`'s esbuild bundle | want to compose types, completion, and complex structures in code |
| **Terraform** | ○ compose it yourself | multi-cloud, existing Terraform assets | already on Terraform, also manage non-AWS |
| **Serverless Framework** | ○ plugin | the ease of YAML | small-scale, quick. But **v4 has a license note** |

Key points:

- **SAM**: with `AutoPublishAlias` + `DeploymentPreference` (`Type`/`Alarms`/`Hooks`), **canary + auto-rollback at minimal cost.** Being a CloudFormation extension, the generated resources are also traceable.
- **CDK**: equivalent safe deploy with `aws-lambda`'s `Function` / `NodejsFunction` (auto-transpile/bundle with esbuild), and `aws-codedeploy`'s `LambdaDeploymentGroup` + `LambdaDeploymentConfig.CANARY_10PERCENT_5MINUTES`. Python/Go bundling is an alpha module.
- **Terraform**: **combine yourself** `aws_lambda_function` (`publish = true`) + `aws_lambda_alias` (`routing_config.additional_version_weights`) + `aws_codedeploy_app` (`compute_platform = "Lambda"`) / `aws_codedeploy_deployment_group`. Control is there but the wiring increases.
- **Serverless Framework**: `serverless.yml`'s ease is second to none, but **v4 requires a paid subscription for "individuals/organizations with over $2M revenue in the most recent fiscal year"** (v3 is free, OSS continuing). Depending on org size, factor in the cost.

```hcl
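# Terraform: publish a version + weighted alias routing (the foundation for canary)
resource "aws_lambda_function" "orders" {
  function_name    = "orders"
  role             = aws_iam_role.orders.arn
  handler          = "index.handler"
  runtime          = "nodejs22.x"
  architectures    = ["arm64"]
  filename         = "build.zip"
  source_code_hash = filebase64sha256("build.zip") # without this, Terraform does not notice a changed zip
  publish          = true # publish an immutable version on every change
}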

resource "aws_lambda_alias" "live" {
  name             = "live"
  function_name    = aws_lambda_function.orders.function_name
  function_version = aws_lambda_function.orders.version
  routing_config {
    additional_version_weights = { } # CodeDeploy or a manual step injects weights during a canary
  }
}
```

---

## 5. CI/CD: deploy "keyless" from GitHub Actions

**Storing long-lived AWS access keys in GitHub Secrets is a standing leak risk.** The recommended approach is **OIDC (OpenID Connect)**: exchange the short-lived JWT that GitHub issues for an AWS IAM role via `AssumeRoleWithWebIdentity`, and **run on temporary credentials.** No stored keys remain.

```yaml
# .github/workflows/deploy.yml: keyless SAM deploy via OIDC (no long-lived keys in Secrets)
name: deploy
on:
  push: { branches: [main] }
permissions:
  id-token: write   # required to request the OIDC token
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v6   # recommended: obtain temporary credentials via OIDC
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: ap-northeast-1
      - uses: aws-actions/setup-sam@v2
      - run: sam build
      - run: sam deploy --no-confirm-changeset --no-fail-on-empty-changeset
```

On the IAM side, in the trust policy allow the **issuer `token.actions.githubusercontent.com` and audience `sts.amazonaws.com`**, and narrow to **"only this branch of this repository" with the `sub` claim** (least privilege). The detailed design of OIDC is in the sister article [keyless CI/CD realized with OIDC](/blog/github-actions-oidc-keyless-cicd-aws-gcp-guide).
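
The trust policy described above can be sketched as a small generator. The function name `github_oidc_trust_policy` and the repository `example-org/orders` are hypothetical; the policy structure (federated principal, `aud` and `sub` conditions) follows the GitHub and AWS documentation for OIDC.

```python
import json

PROVIDER = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, branch):
    """Build an IAM trust policy that only one branch of one repository can assume.

    `repo` is in "owner/name" form. The OIDC provider is assumed to be
    registered in the account already.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{PROVIDER}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{PROVIDER}:aud": "sts.amazonaws.com",
                        # Restrict to a single repository and branch (least privilege).
                        f"{PROVIDER}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(github_oidc_trust_policy("123456789012", "example-org/orders", "main"), indent=2))
```

Using `StringEquals` on `sub` rather than a wildcard match keeps pull requests and other branches from assuming the deploy role.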

---

## 6. The zero-downtime pitfall: wait on the function state

Finally, an unglamorous pitfall that almost everyone hits in production. **Lambda updates are asynchronous** and follow a state machine.

| State | Meaning | Operability |
| --- | --- | --- |
| **Pending** | being created/configured (VPC ENI creation, etc.) | **can't invoke.** Updates also fail |
| **Active** | running | **the only state you can invoke** |
| **Inactive** | reclaimed when idle (VPC is 14 days) | the next invocation fails once → re-created in Pending |
| **Failed** | failed | delete and recreate |

In addition there's **`LastUpdateStatus`** (`Successful`/`Failed`/`InProgress`), and **while `InProgress`, the next `UpdateFunctionCode`/`UpdateFunctionConfiguration`/`PublishVersion` fails** (`ResourceConflictException` / 409). So:

- **Keep the order: code update → `wait function-updated-v2` → config update/publish.** Running these back to back in CI is the typical way to hit a 409.
- **A VPC function takes time to reflect** (Hyperplane ENI; [details in the cold-start article](/blog/aws-lambda-cold-start-snapstart-provisioned-concurrency-performance-guide)). A test that hits right after a deploy should confirm Active first.

```bash
# Safe consecutive updates in CI: wait for completion at each step before moving on
aws lambda update-function-code --function-name orders --zip-file fileb://build.zip
aws lambda wait function-updated-v2 --function-name orders        # ← skipping this causes a 409
aws lambda update-function-configuration --function-name orders --environment "Variables={LOG_LEVEL=INFO}"
aws lambda wait function-updated-v2 --function-name orders
aws lambda publish-version --function-name orders
```
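
When the deploy step is written in Python rather than shell, the same wait can be sketched as a polling loop over `LastUpdateStatus`. `wait_for_update` is a hypothetical helper and `client` is assumed to be a `boto3.client("lambda")`; boto3 also ships a built-in waiter, `client.get_waiter("function_updated_v2")`, which is usually the simpler choice.

```python
import time


def wait_for_update(client, function_name, timeout_s=300, interval_s=2, sleep=time.sleep):
    """Poll until LastUpdateStatus is Successful; raise on Failed or timeout.

    `sleep` is injectable so the loop can be exercised without real delays.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        cfg = client.get_function_configuration(FunctionName=function_name)
        status = cfg.get("LastUpdateStatus")
        if status == "Successful":
            return cfg
        if status == "Failed":
            raise RuntimeError(cfg.get("LastUpdateStatusReason", "function update failed"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{function_name} still {status} after {timeout_s}s")
        sleep(interval_s)
```

Calling `wait_for_update(client, "orders")` between `update_function_code`, `update_function_configuration`, and `publish_version` mirrors the CLI sequence above.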

---

## 7. Conclusion: a safe-deploy cheat sheet

- **Foundation**: `$LATEST` is mutable. Make **a published version (immutable) + an alias (pointer)**, and call **production via a qualified ARN (alias).**
- **Canary**: weighted alias (up to 2 versions, weight distribution). The conditions are **both versions published, identical execution role, identical DLQ.**
- **Automation**: CodeDeploy's `Canary10Percent5Minutes`, etc. + **verify with pre/post hooks** + **auto-rollback with CloudWatch alarms.** With SAM, a few lines of `AutoPublishAlias` + `DeploymentPreference`.
- **IaC**: the fastest safe deploy is **SAM**, type safety is **CDK**, multi-cloud/existing assets is **Terraform**, easy but **Serverless v4 is paid above $2M annual revenue.**
- **CI/CD**: **keyless with OIDC** (`id-token: write` + `AssumeRoleWithWebIdentity`, narrow repo/branch with `sub`).
- **Pitfall**: updates are asynchronous. **Wait for `LastUpdateStatus=Successful`** (an update during `InProgress` is 409). A VPC function takes time to reflect.

On the payment platform, I rigorously applied the shipping discipline of "immutable version + canary + alarm-driven auto-rollback + keyless CI/CD," which supported **0 double charges in production.** **Detecting a new version's defect at 10% and reverting automatically before it hits all users** is the foundation for safely evolving a payment platform that cannot be allowed to stop.

**"I want to keep shipping my own Lambda in a form that doesn't stop, doesn't break, and can revert automatically." From designing the canary strategy to making CI/CD keyless and selecting IaC, I work alongside you at the speed of one person × generative AI (Claude Code).** Feel free to reach out, even if you only want to start with an audit of your existing deploy flow.

---

### References (official documentation)

- [Lambda function versions](https://docs.aws.amazon.com/lambda/latest/dg/configuration-versions.html) / [aliases](https://docs.aws.amazon.com/lambda/latest/dg/configuration-aliases.html) — immutable versions, qualified/unqualified ARNs
- [Implementing canary deployments using alias routing](https://docs.aws.amazon.com/lambda/latest/dg/configuring-alias-routing.html) — weighted aliases, `AdditionalVersionWeights`, constraints
- [Deployment configurations (CodeDeploy)](https://docs.aws.amazon.com/codedeploy/latest/userguide/deployment-configurations.html) — `LambdaCanary…` / `LambdaLinear…` / `LambdaAllAtOnce`
- [Gradual deployments / DeploymentPreference (SAM)](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/automating-updates-to-serverless-apps.html) — `AutoPublishAlias`, `Type`/`Alarms`/`Hooks`
- [AWS::Serverless::Function DeploymentPreference](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-property-function-deploymentpreference.html) — automatic rollback via alarms
- [AWS CDK aws-codedeploy LambdaDeploymentGroup](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_codedeploy.LambdaDeploymentGroup.html) — safe deploy in CDK
- [Terraform aws_lambda_alias](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_alias) / [aws_codedeploy_app](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/codedeploy_app) — weighted routing, `compute_platform`
- [Configuring OpenID Connect in Amazon Web Services (GitHub Docs)](https://docs.github.com/en/actions/deployment/security-hardening-your-deployments/configuring-openid-connect-in-amazon-web-services) — `id-token: write`, `AssumeRoleWithWebIdentity`
- [Lambda function states](https://docs.aws.amazon.com/lambda/latest/dg/functions-states.html) — Pending/Active/Inactive/Failed, `LastUpdateStatus`
- [Serverless Framework pricing](https://www.serverless.com/pricing) — v4's paid condition ($2M revenue)
