# Lambda testing strategy: designing unit/integration/E2E, SDK mocking, sam local, and verifying in the cloud

> An implementation guide to testing AWS Lambda at production quality. Using working code that follows the official AWS documentation, it covers the unit/integration/E2E layers AWS defines and its guidance to 'prioritize testing in the cloud,' unit tests for thin handlers and pure logic, SDK mocking with aws-sdk-client-mock and moto, where sam local helps and where it falls short, and verification of integrations and async side effects with disposable stacks.

- Published: 2026-06-28
- Author: 友田 陽大
- Tags: AWS, Lambda, Testing, Serverless, Testability
- URL: https://tomodahinata.com/en/blog/aws-lambda-testing-strategy-unit-integration-mocking-sam-local-guide
- Category: AWS Lambda in production
- Pillar guide: https://tomodahinata.com/en/blog/aws-lambda-production-guide

## Key points

- AWS's official guidance is clear: 'prioritize testing in the cloud above all.' Tests built on mocks can pass on paper yet fail in the cloud, because mocks can't reproduce IAM, quotas, service settings, or the latest API changes.
- Make the handler a thin adapter and carve the business logic out into pure functions for unit testing. Leave only event extraction and validation in the handler.
- For SDK mocking, use aws-sdk-client-mock in TypeScript (recommended by the AWS SDK for JS team) and moto in Python (`@mock_aws`, named in the official docs).
- `sam local invoke`, `start-api` (port 3000), and `start-lambda` (port 3001) provide local verification and require Docker. They can't verify permissions and the like, so use them 'sparingly' and ensure production quality in the cloud.
- Serverless testing leans heavily on integration tests. Because much of the logic lives in service settings, stand up a disposable stack per branch, and for async flows verify the side effects (SQS/DynamoDB) by polling.

---

"My Lambda tests all pass locally, but once deployed to production they fail on permission errors or differences in event shape." This is the most common disappointment in serverless testing. The cause is clear: **much of Lambda's logic lives not in the code but in the cloud's service settings (IAM, event sources, quotas).** A local mock can't reproduce that crucial part.

This article is a strategy guide to **testing AWS Lambda at production quality.** Starting from **AWS's official testing guidance**, it covers, end to end, **unit tests of thin handlers**, **SDK mocking**, **where sam local is useful and its limits**, and **integration/E2E in the cloud.** As a running example, I draw on the testing discipline behind the [serverless payment platform](/case-studies/payment-platform-reliability) I built as a core developer (**0 double charges in production**). The Lambda execution model itself is covered in the companion article [AWS Lambda production-operations guide](/blog/aws-lambda-production-guide); this piece concentrates on **one question: "how to test."**

> **Rules for this article**: the definitions, guidance, and tool names of testing are based on the **AWS official documentation (the Lambda testing guide, etc., as of June 2026)** and each tool's official page. Tool versions and APIs are revised, so always confirm in the official docs ("References" at the end) before production.

---

## 0. Mental model: prioritize "testing in the cloud" above all

Start with the most important message in AWS's official guidance, because it determines how serverless tests are designed.

- **The official guidance: prioritize testing in the cloud.** The official docs state plainly that "**cloud-based testing most accurately measures the quality of your functions and applications**" and that "only the cloud can comprehensively test security policies, service settings, quotas, and even the latest API signatures."
- **The limits of mocks**: "**tests using mocks can pass on paper but fail in the cloud.** The results may not match the latest API, and service settings and quotas can't be tested."
- **The limits of emulators**: "tests relying on an emulator can succeed locally but fail in the cloud (due to production security policies, inter-service settings, quota overruns)," so "**use emulators sparingly.**"
- **So the weighting changes**: give integration testing **more weight** than you would for an ordinary app, because much of the logic lives in service settings (IAM, event mapping). "**The managed service itself needs no testing, but the integration with it needs testing.**"

Design the three layers the official docs define in this order.

| Layer | Official definition | Example |
| --- | --- | --- |
| **unit** | A test against an isolated code block | Verifying the business logic of shipping-fee calculation |
| **integration** | The interaction of two or more components/services (usually the cloud) | Verifying that a function processes a queue event |
| **E2E** | The behavior of the whole app | An event flows between services and an order is recorded |

---

## 1. Unit test: a thin handler + pure logic

The official testability guidance comes down to one line: "**make the handler a thin adapter that receives the event and passes only the details the business logic needs.**" That way, you can pin down **the business logic with ordinary unit tests**, with no Lambda-specific concerns in the way.

```ts
// domain.ts: pure logic that imports nothing from Lambda. The fastest, cheapest place to run many cases
export interface Order { id: string; amount: number; destination: "domestic" | "international"; }

export function deliveryFee(order: Order): number {
  if (order.amount < 0) throw new RangeError("amount must be non-negative"); // validate at the boundary
  const base = order.destination === "international" ? 2000 : 500;
  return order.amount >= 10_000 ? 0 : base; // free shipping at 10,000 yen or more
}
```

```ts
// handler.ts: a thin adapter. It only extracts and validates data from the event and leaves decisions to domain
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { deliveryFee, type Order } from "./domain";

export const handler = async (event: APIGatewayProxyEventV2) => {
  const body = JSON.parse(event.body ?? "{}") as Partial<Order>;
  if (!body.id || typeof body.amount !== "number" || !body.destination) {
    return { statusCode: 422, body: JSON.stringify({ message: "invalid order" }) };
  }
  return { statusCode: 200, body: JSON.stringify({ fee: deliveryFee(body as Order) }) };
};
```
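
The extraction and validation step can itself be carved out as a pure function, which also makes it easy to cover inputs the inline check above does not handle, such as a body that is not valid JSON. The following is a minimal sketch under that assumption; `parseOrder` and `ParseResult` are hypothetical names, not part of the handler shown above.

```ts
// parse-order.ts: hypothetical helper (an assumption, not the article's code).
export interface Order { id: string; amount: number; destination: "domestic" | "international"; }

// A discriminated union lets the handler map failures to a 4xx response without try/catch.
export type ParseResult =
  | { ok: true; order: Order }
  | { ok: false; reason: string };

export function parseOrder(rawBody: string | undefined): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(rawBody ?? "{}");
  } catch {
    return { ok: false, reason: "body is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, reason: "body must be a JSON object" };
  }
  const { id, amount, destination } = data as Record<string, unknown>;
  if (typeof id !== "string" || id.length === 0) {
    return { ok: false, reason: "id is required" };
  }
  if (typeof amount !== "number" || !Number.isFinite(amount) || amount < 0) {
    return { ok: false, reason: "amount must be a non-negative number" };
  }
  if (destination !== "domestic" && destination !== "international") {
    return { ok: false, reason: "destination must be domestic or international" };
  }
  return { ok: true, order: { id, amount, destination } };
}
```

With a helper like this, the handler would return 422 when `ok` is false and call `deliveryFee(result.order)` otherwise, and the rejection cases can be tested without constructing any Lambda event.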

```ts
// domain.test.ts: a pure function needs neither the cloud nor Docker. Test the boundary values thoroughly
import { describe, it, expect } from "vitest";
import { deliveryFee } from "./domain";

describe("deliveryFee", () => {
  it("charges 500 yen for domestic and 2000 yen for international", () => {
    expect(deliveryFee({ id: "a", amount: 100, destination: "domestic" })).toBe(500);
    expect(deliveryFee({ id: "b", amount: 100, destination: "international" })).toBe(2000);
  });
  it("ships free at 10,000 yen or more", () => {
    expect(deliveryFee({ id: "c", amount: 10_000, destination: "international" })).toBe(0);
  });
  it("rejects a negative amount", () => {
    expect(() => deliveryFee({ id: "d", amount: -1, destination: "domestic" })).toThrow(RangeError);
  });
});
```

> **Hit it with realistic events**: to unit-test the handler itself, `sam local generate-event` generates **sample events close to the real thing** (API Gateway/S3/SQS, etc.). Hand-written events easily have the wrong shape, so base them on the generated output.

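One way to reuse a generated sample event across many handler test cases is to keep the generated JSON as a fixture and swap only the body per case. The sketch below assumes that approach; `withJsonBody` and `HttpEventLike` are hypothetical names, and in a real test the base object would be the JSON written by `sam local generate-event` rather than a hand-built one.

```ts
// event-fixture.ts: hypothetical helper (an assumption, not an AWS or SAM API).
interface HttpEventLike { body?: string; isBase64Encoded?: boolean; }

export function withJsonBody<E extends HttpEventLike>(base: E, payload: unknown): E {
  // Shallow copy instead of mutation, so one generated fixture can back many test cases.
  // Nested objects such as headers are still shared with the base event.
  return { ...base, body: JSON.stringify(payload), isBase64Encoded: false };
}

// Usage idea (names are illustrative):
//   const event = withJsonBody(generatedEvent, { id: "o1", amount: 100, destination: "domestic" });
//   const response = await handler(event);
```
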
---

## 2. SDK mocking: cut dependencies to keep unit tests fast

Beyond the business logic, you sometimes want to unit-test a thin layer that includes SDK calls, such as DynamoDB access. For that layer, **mock the SDK.** Because the test **never connects to real AWS**, it stays fast and deterministic.

### 2.1 TypeScript: aws-sdk-client-mock

The standard library to mock AWS SDK for JavaScript v3 clients is **`aws-sdk-client-mock`** (**recommended by the AWS SDK for JavaScript team**). Wrap with `mockClient()` and define responses with `.on(Command).resolves(...)`.

```ts
// repo.test.ts: mock DynamoDBDocumentClient and unit-test without connecting to real AWS
import { mockClient } from "aws-sdk-client-mock";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";
import { describe, it, expect, beforeEach } from "vitest";
import { getOrder } from "./repo";

const ddbMock = mockClient(DynamoDBDocumentClient);
beforeEach(() => ddbMock.reset());

it("converts a found order into a domain object and returns it", async () => {
  ddbMock.on(GetCommand).resolves({ Item: { id: "o1", amount: 1200 } });
  await expect(getOrder("o1")).resolves.toEqual({ id: "o1", amount: 1200 });
});

it("returns null when the order is not found", async () => {
  ddbMock.on(GetCommand).resolves({ Item: undefined });
  await expect(getOrder("missing")).resolves.toBeNull();
});
```
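
The `repo` module under test is not shown in this article. As a rough sketch of the part these two tests pin down, the conversion from a returned item to a domain object could be a small pure function like the one below. `toOrder` and `StoredOrder` are hypothetical names, and the shape check is an assumption about how strict the mapping should be.

```ts
// order-mapper.ts: hypothetical mapping step that a getOrder() might delegate to (an assumption).
export interface StoredOrder { id: string; amount: number; }

export function toOrder(item: Record<string, unknown> | undefined): StoredOrder | null {
  if (!item) return null; // no Item in the response means the key does not exist
  const { id, amount } = item;
  if (typeof id !== "string" || typeof amount !== "number") {
    // Surface corrupt or unexpected records instead of passing them on silently.
    throw new TypeError("stored order has an unexpected shape");
  }
  return { id, amount }; // attributes the domain does not know about are dropped
}
```

Keeping the mapping pure means the mocked-SDK tests only need to cover the wiring (which command is sent, which table and key), while shape handling is covered by plain unit tests.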

### 2.2 Python: moto

In Python, use **`moto`** (named in the official testing guide). Its `@mock_aws` decorator mocks boto3 calls in memory.

```python
# test_repo.py: mock DynamoDB with moto (in memory, no real resources needed)
import boto3
from moto import mock_aws
from repo import get_order

@mock_aws
def test_get_order_returns_none_when_missing():
    ddb = boto3.resource("dynamodb", region_name="ap-northeast-1")
    ddb.create_table(
        TableName="orders",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    assert get_order("missing") is None
```

> **Testing Powertools**: when testing functions that use structured logging, tracing, or metrics, setting `POWERTOOLS_DEV` makes the Logger pretty-print its output, disables the Tracer, and stops Metrics from being emitted to standard output. Setting `POWERTOOLS_SERVICE_NAME` and `POWERTOOLS_METRICS_NAMESPACE` before the tests run keeps them from failing on metrics schema validation.
>
> **A note on mocks**: as the official docs say, a mock **can't verify IAM permissions, quotas, or service settings.** Treat unit tests as covering "logic correctness" only, and ensure **the correctness of settings with cloud integration tests** (Section 4).

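The Powertools variables mentioned above can be set once in a test setup file. The sketch below is one possible shape, not a Powertools API: `applyPowertoolsTestEnv` is a hypothetical helper, and the default service name and namespace are placeholder values. In a Node.js test runner it would typically be called as `applyPowertoolsTestEnv(process.env)` before the modules that create the Logger, Tracer, or Metrics instances are imported.

```ts
// powertools-test-env.ts: hypothetical setup helper; default values are placeholders.
type Env = Record<string, string | undefined>;

export function applyPowertoolsTestEnv(
  env: Env,
  serviceName = "orders-test",
  namespace = "OrdersTest",
): Env {
  env.POWERTOOLS_DEV = "true";
  // Keep values that CI or the developer has already set.
  env.POWERTOOLS_SERVICE_NAME ??= serviceName;
  env.POWERTOOLS_METRICS_NAMESPACE ??= namespace;
  return env;
}
```
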
---

## 3. Local verification: where sam local is useful and its limits

When you want a quick run on your own machine, use **AWS SAM's local commands.** Learn the three by role (all **require Docker**).

| Command | What it does | Default port |
| --- | --- | --- |
| `sam local invoke` | **Run the function once locally** (inject the event with `--event`) | — |
| `sam local start-api` | Hit it as an API on a local HTTP server | 3000 |
| `sam local start-lambda` | A local endpoint emulating the Lambda service (call from CLI/SDK) | 3001 |

```bash
# Run once locally with a generated event to check input and output quickly
sam local generate-event apigateway http-api-proxy > events/http.json
sam local invoke OrdersFunction --event events/http.json
```

**Understanding the limits correctly** is the focus of this chapter. The official docs state plainly that local testing is good for rapid development and pre-deploy checks, but "**it can't verify permissions, etc., between cloud resources,**" so "**test in the cloud as much as possible.**" When a function run under `sam local` calls AWS, **the call no longer originates from the Lambda service and the function does not assume its execution role**, which means **it is not IAM verification.** Keep sam local to "a quick check of logic" and **ensure production quality in the cloud.**

---

## 4. Integration/E2E: in the cloud, with disposable stacks

In the end, serverless quality can't be ensured without **testing against real services in the cloud.** This section turns the officially recommended pattern into an implementation.

### 4.1 Stand up a disposable stack per branch/PR

The official docs make it a best practice to "**create uniquely named resources in each stack**" and "**split stacks at the boundaries of code branches.**" In CI, **deploy → test → destroy a stack with a unique prefix per PR.**

```yaml
# Integration-test CI (excerpt): stand up a disposable stack per PR, test against real services, and always tear it down
jobs:
  integration:
    runs-on: ubuntu-latest
    permissions: { id-token: write, contents: read } # keyless via OIDC (see the deployment article)
    # Block style here: "${{ ... }}" contains braces, which are not allowed in a plain value inside a YAML flow mapping
    env:
      STACK: orders-pr-${{ github.event.number }}
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v6
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: ap-northeast-1
      - uses: aws-actions/setup-sam@v2
      - run: sam build && sam deploy --stack-name "$STACK" --no-confirm-changeset --resolve-s3
      - run: npm run test:integration          # verify against the real API/DB/queue
      - if: always()                            # destroy the stack whether tests pass or fail (avoids cost and leftover resources)
        run: sam delete --stack-name "$STACK" --no-prompts
```
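
The workflow above names the stack after the PR number, which is already safe to use. If stacks are instead split per branch, the branch name has to be turned into something CloudFormation accepts: letters, digits, and hyphens only, starting with a letter, at most 128 characters. The sketch below shows one way to do that; `stackNameFor` is a hypothetical helper and the `orders` prefix is just the example used in this article.

```ts
// stack-name.ts: hypothetical helper for per-branch stack names (an assumption, not a SAM feature).
export function stackNameFor(prefix: string, branch: string): string {
  const slug = branch
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything outside letters and digits into a hyphen
    .replace(/^-+|-+$/g, "");    // trim hyphens left at the edges
  const name = slug ? `${prefix}-${slug}` : prefix;
  if (!/^[A-Za-z]/.test(name)) {
    throw new Error(`stack name must start with a letter: ${name}`);
  }
  // Enforce the 128-character limit, then drop a hyphen the cut may have exposed.
  return name.slice(0, 128).replace(/-+$/, "");
}
```

Two branches that differ only in punctuation or case map to the same name with this approach, so a team that relies on it may want to append a short hash of the original branch name.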

### 4.2 For async, verify the "side effect" by polling

Event-driven flows (SQS/EventBridge/Streams) have **no return value**, so **observe the downstream side effect** and assert on it. In the official Arrange-Act-Assert pattern, you trigger the function (Act), then **fetch the result from the destination (SQS/DynamoDB) and verify it (Assert).**

```ts
// Integration test for an async flow: inject the event, then poll until the downstream side effect appears
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

async function waitForItem(table: string, id: string, timeoutMs = 30_000) {
  const ddb = new DynamoDBClient({});
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { Item } = await ddb.send(new GetItemCommand({ TableName: table, Key: { id: { S: id } } }));
    if (Item) return Item;                        // the side effect appeared, so the function processed the event
    await new Promise((r) => setTimeout(r, 1000)); // poll at 1-second intervals
  }
  throw new Error(`item ${id} did not appear within ${timeoutMs}ms`); // fail with a clear message
}
```
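
The same loop can be written once in a generic form and reused for other destinations, such as an SQS queue or a second table. The following is a minimal sketch under that assumption; `pollUntil`, its option names, and its defaults are hypothetical. Because the probe is injected, the timing logic can be unit-tested without any AWS access.

```ts
// poll-until.ts: hypothetical generic form of the polling loop; names and defaults are illustrative.
export async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  { timeoutMs = 30_000, intervalMs = 1_000 }: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const result = await probe(); // always probes at least once
    if (result !== undefined) return result;
    if (Date.now() + intervalMs > deadline) break; // another sleep would overshoot the deadline
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`condition not met within ${timeoutMs}ms`);
}
```

A DynamoDB check would pass a probe that sends a `GetItemCommand` and returns the item or `undefined`; the timeout should stay comfortably above the expected end-to-end latency of the flow so that slow but correct runs do not fail the test.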

> **E2E doesn't use mocks** (official). Exercise API Gateway → Lambda → DynamoDB end to end and verify the flow **including each service's settings and IAM permissions.** This is the only way to eliminate "passes locally, fails in production" by construction.

---

## 5. The serverless test pyramid: rearrange the weighting

For serverless, rearrange an ordinary app's "more unit, less E2E" pyramid so that it is **heavier on integration** (the official docs say plainly to "focus on integration tests").

```text
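        ▲  E2E (few; no mocks)
       ╱ ╲   Run the key user flows end to end. Verifies settings, IAM, and inter-service wiring
      ╱   ╲
     ╱Integ╲  ← serverless is thick here
    ╱ (many)╲   Function × real services (DB/queue/API GW). Only here can you tell whether the settings are right
   ╱─────────╲
  ╱   Unit    ╲  Most numerous, fastest, cheapest. Pure logic + SDK mocks. Does not verify IAM or quotas
 ╱─────────────╲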
```

Design guidelines:

- **Unit**: business logic (pure functions) and a thin data layer with SDK mocks. **Fast, in bulk, thick on boundary values.**
- **Integration**: connect functions to real services and verify **settings, permissions, event shape.** In serverless, **this is the main battleground of quality.**
- **E2E**: major flows end-to-end, **no mocks.** Keep the number narrow.
- **Local (sam local)**: not a "layer" of the pyramid but **a means of quick feedback during development.** Don't make it a substitute for quality assurance.

---

## 6. Conclusion: Lambda testing-strategy cheat sheet

- **The top priority is testing in the cloud** (official). Mocks and emulators **can't reproduce IAM, quotas, settings, or the latest API changes**, which is the real cause of "passes on paper, fails in production."
- **Unit**: keep the handler a thin adapter, **carve the business logic into pure functions**, and test it thoroughly. Use realistic events from `sam local generate-event`.
- **SDK mock**: TypeScript = **aws-sdk-client-mock** (recommended by the AWS SDK for JS team), Python = **moto** (`@mock_aws`, named in the official docs). Logic only; mocks don't verify settings.
- **sam local**: `invoke`, `start-api` (3000), and `start-lambda` (3001) all require Docker. They **can't verify permissions**, so use them sparingly.
- **Integration/E2E**: stand up a **disposable stack per branch/PR** → verify against real services → **always destroy it.** For async flows, **poll for the side effect.** E2E uses no mocks.
- **The pyramid**: serverless is **heavier on integration.** Settings and IAM can only be protected there.

On the payment platform, I ensured **0 double charges in production** not by hoping for the best but **by tests.** I unit-tested the idempotency logic thoroughly as pure functions, verified the idempotent SQS→Lambda→DynamoDB side effects with integration tests on a disposable stack, and protected the payment flow end to end with E2E tests. This layered structure is what makes it possible to keep changing, with confidence, a payment platform that cannot be allowed to stop. For quality-gate design in AI-driven development, including tests, also see [how to make quality gates](/blog/ai-driven-development-quality-gates-ci-type-safety-test-security).

**"I want to test our serverless application in a way that reliably works in production, not just on paper." From designing the test pyramid through disposable-stack CI and async side-effect verification, I work alongside you at the speed of one person × generative AI (Claude Code).** Feel free to get in touch, even if you only want a diagnosis of your current test strategy to start with.

---

### References (official documentation)

- [Testing serverless functions and applications](https://docs.aws.amazon.com/lambda/latest/dg/testing-guide.html) — definitions of unit/integration/E2E, cloud-first, emulators sparingly, focus on integration
- [aws-samples/serverless-test-samples](https://github.com/aws-samples/serverless-test-samples) — per-language examples (official reference)
- [aws-sdk-client-mock](https://github.com/m-radzikowski/aws-sdk-client-mock) — mocking AWS SDK for JavaScript v3 (`mockClient`/`.on`/`.resolves`)
- [moto](https://pypi.org/project/moto/) — AWS mock for Python (`@mock_aws`)
- [sam local invoke](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-local-invoke.html) / [start-api](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-local-start-api.html) / [start-lambda](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/using-sam-cli-local-start-lambda.html) — the three local-verification commands and their limits
- [sam local generate-event](https://docs.aws.amazon.com/serverless-application-model/latest/developerguide/sam-cli-command-reference-sam-local-generate-event.html) — sample-event generation
- [Powertools environment variables](https://docs.aws.amazon.com/powertools/typescript/latest/environment-variables/) — `POWERTOOLS_DEV`, etc., behavior during tests
