友田 陽大

Lambda testing strategy: designing unit/integration/E2E, SDK mocking, sam local, and verifying in the cloud

An implementation guide to testing AWS Lambda at production quality. Using working code that follows AWS's official documentation, it covers the unit, integration, and E2E layers AWS defines and the guidance to prioritize testing in the cloud; unit tests for thin handlers and pure logic; SDK mocking with aws-sdk-client-mock and moto; where sam local helps and where it falls short; and integration tests and async side-effect verification on disposable stacks.

"My Lambda tests all pass locally, but deploy to production and they fail on permission errors or differences in event shape" — the most common disappointment in serverless testing. The cause is clear: much of Lambda's logic lives not in the 'code' but in the 'cloud's service settings (IAM, event sources, quotas).' A local mock can't reproduce that crucial part.

This article is a strategy guide to testing AWS Lambda at production quality. Starting from AWS's official testing guidance, it walks through unit tests for thin handlers, SDK mocking, where sam local helps and where it falls short, and integration and E2E testing in the cloud. Throughout, I draw on the testing discipline behind the serverless payment platform I built as a core developer (0 double charges in production). Lambda's execution model is covered in the sister article, AWS Lambda production-operations guide; this piece concentrates on one question: how to test.

Ground rules for this article: the testing definitions, guidance, and tool names are based on the AWS official documentation (the Lambda testing guide and others, as of June 2026) and each tool's official page. Tool versions and APIs change, so always check the official docs ("References" at the end) before relying on them in production.


0. Mental model: prioritize "testing in the cloud" above all

Start with the most important message in AWS's official guidance. It determines how serverless tests should be designed.

  • The official guidance: prioritize testing in the cloud. The official docs say so plainly: "cloud-based testing most accurately measures the quality of your functions and applications," and "only the cloud can comprehensively test security policies, service settings, quotas, and even the latest API signatures."
  • The limits of mocks: "tests using mocks can pass on paper but fail in the cloud. The results may not match the latest API, and service settings and quotas can't be tested."
  • The limits of emulators: "tests relying on an emulator can succeed locally but fail in the cloud (due to production security policies, inter-service settings, quota overruns)," so "use emulators sparingly."
  • So the weighting changes: give integration testing more weight than you would in an ordinary app, because much of the logic lives in service settings (IAM, event mappings). "The managed service itself needs no testing, but the integration with it needs testing."

Design your tests around the three layers the official docs define, in this order.

| Layer | Official definition | Example |
|---|---|---|
| Unit | A test against an isolated code block | Verifying the business logic of shipping-fee calculation |
| Integration | The interaction of two or more components/services (usually in the cloud) | Verifying that a function processes a queue event |
| E2E | The behavior of the whole app | An event flows between services and an order is recorded |

1. Unit test: a thin handler + pure logic

The official testability guidance fits in one line: "make the handler a thin adapter that receives the event and passes only the details the business logic needs." This lets you pin down the business logic with ordinary unit tests, without worrying about Lambda specifics.

// domain.ts: pure logic that imports nothing from Lambda. The fastest, cheapest place to run many cases.
export interface Order { id: string; amount: number; destination: "domestic" | "international"; }

export function deliveryFee(order: Order): number {
  if (order.amount < 0) throw new RangeError("amount must be non-negative"); // validate at the boundary
  const base = order.destination === "international" ? 2000 : 500;
  return order.amount >= 10_000 ? 0 : base; // free shipping at 10,000 yen or more
}

// handler.ts: a thin adapter. It only extracts and validates fields from the event; decisions belong to the domain.
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { deliveryFee, type Order } from "./domain";

export const handler = async (event: APIGatewayProxyEventV2) => {
  const body = JSON.parse(event.body ?? "{}") as Partial<Order>;
  if (!body.id || typeof body.amount !== "number" || !body.destination) {
    return { statusCode: 422, body: JSON.stringify({ message: "invalid order" }) };
  }
  return { statusCode: 200, body: JSON.stringify({ fee: deliveryFee(body as Order) }) };
};

// domain.test.ts: a pure function needs neither the cloud nor Docker. Test the boundary values heavily.
import { describe, it, expect } from "vitest";
import { deliveryFee } from "./domain";

describe("deliveryFee", () => {
  it("charges 500 yen domestic and 2000 yen international", () => {
    expect(deliveryFee({ id: "a", amount: 100, destination: "domestic" })).toBe(500);
    expect(deliveryFee({ id: "b", amount: 100, destination: "international" })).toBe(2000);
  });
  it("ships free at 10,000 yen or more", () => {
    expect(deliveryFee({ id: "c", amount: 10_000, destination: "international" })).toBe(0);
  });
  it("rejects a negative amount", () => {
    expect(() => deliveryFee({ id: "d", amount: -1, destination: "domestic" })).toThrow(RangeError);
  });
});
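
The comment in domain.test.ts calls for testing boundary values heavily. One way to sketch that is a case table around the two boundaries in deliveryFee (0 and 10,000). The `cases` table and the `checkCases` helper are illustrative names, not part of the original code, and `deliveryFee` is copied in only so the block runs on its own.

```typescript
// Copied from domain.ts so this sketch is self-contained.
interface Order { id: string; amount: number; destination: "domestic" | "international"; }

function deliveryFee(order: Order): number {
  if (order.amount < 0) throw new RangeError("amount must be non-negative");
  const base = order.destination === "international" ? 2000 : 500;
  return order.amount >= 10_000 ? 0 : base;
}

// Hypothetical case table: one row on each side of every boundary.
const cases: Array<{ amount: number; destination: Order["destination"]; expected: number }> = [
  { amount: 0, destination: "domestic", expected: 500 },            // lowest valid amount
  { amount: 9_999, destination: "domestic", expected: 500 },        // just below the free-shipping threshold
  { amount: 10_000, destination: "domestic", expected: 0 },         // exactly on the threshold
  { amount: 9_999, destination: "international", expected: 2000 },
  { amount: 10_000, destination: "international", expected: 0 },
];

// Returns a description of every failing row (an empty array means all rows pass).
function checkCases(): string[] {
  return cases
    .filter((c) => deliveryFee({ id: "t", amount: c.amount, destination: c.destination }) !== c.expected)
    .map((c) => `${c.destination} ${c.amount}: expected ${c.expected}`);
}
```

In a vitest suite the same table would more naturally feed `it.each(cases)`, so each row is reported as its own test.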

Use realistic events: to unit-test the handler itself, generate sample events close to the real thing (API Gateway, S3, SQS, and so on) with sam local generate-event. Hand-written events easily end up with the wrong shape, so base your fixtures on the generated output.
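
As a rough sketch of that idea, the test can start from one generated event and override only the body per case. Everything here is illustrative: `HttpEvent` is a trimmed stand-in for `APIGatewayProxyEventV2`, `baseEvent` stands in for a generated file such as events/http.json (its field values are made up), `withBody` is a hypothetical helper, and `handler` is a reduced copy of the validation branch in handler.ts so the block runs without other files.

```typescript
// Minimal subset of the HTTP API (payload v2) event used in this sketch.
interface HttpEvent {
  version: string;
  routeKey: string;
  rawPath: string;
  headers: Record<string, string>;
  isBase64Encoded: boolean;
  body?: string;
}

// Trimmed stand-in for the generated event; in practice, load the generated JSON file.
const baseEvent: HttpEvent = {
  version: "2.0",
  routeKey: "POST /orders",
  rawPath: "/orders",
  headers: { "content-type": "application/json" },
  isBase64Encoded: false,
};

// Hypothetical helper: copy the base event and replace only the body.
function withBody(base: HttpEvent, payload: unknown): HttpEvent {
  return { ...base, body: JSON.stringify(payload) };
}

// Stand-in for the handler in handler.ts, reduced to its validation branch.
async function handler(event: HttpEvent): Promise<{ statusCode: number; body: string }> {
  const body = JSON.parse(event.body ?? "{}") as { id?: string; amount?: unknown; destination?: string };
  if (!body.id || typeof body.amount !== "number" || !body.destination) {
    return { statusCode: 422, body: JSON.stringify({ message: "invalid order" }) };
  }
  return { statusCode: 200, body: JSON.stringify({ accepted: body.id }) };
}
```

Because `withBody` returns a copy, every test case starts from the same untouched base event, and only the field under test differs from the generated shape.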


2. SDK mocking: cut dependencies to keep unit tests fast

Outside the business logic, you sometimes want to unit-test a thin layer that includes SDK calls, such as DynamoDB access. Mock the SDK there: because the test never connects to real AWS, it is fast and deterministic.

2.1 TypeScript: aws-sdk-client-mock

The standard library for mocking AWS SDK for JavaScript v3 clients is aws-sdk-client-mock (recommended by the AWS SDK for JavaScript team). Wrap the client class with mockClient() and define responses with .on(Command).resolves(...).

// repo.test.ts: mock DynamoDBDocumentClient and unit-test without connecting to real AWS
import { mockClient } from "aws-sdk-client-mock";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";
import { describe, it, expect, beforeEach } from "vitest";
import { getOrder } from "./repo";

const ddbMock = mockClient(DynamoDBDocumentClient);
beforeEach(() => ddbMock.reset());

it("returns the order converted to a domain object when found", async () => {
  ddbMock.on(GetCommand).resolves({ Item: { id: "o1", amount: 1200 } });
  await expect(getOrder("o1")).resolves.toEqual({ id: "o1", amount: 1200 });
});

it("returns null when not found", async () => {
  ddbMock.on(GetCommand).resolves({ Item: undefined });
  await expect(getOrder("missing")).resolves.toBeNull();
});
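
The two tests above import getOrder from ./repo, which this article does not show. A minimal sketch of the mapping they pin down (item found becomes a domain object, item missing becomes null) might look like the following. To keep the block runnable without the SDK, the DynamoDB call is hidden behind a hypothetical `GetItemFn` parameter; in the setup the tests assume, `getOrder` would take only the ID and send a `GetCommand` through a module-level `DynamoDBDocumentClient`, which is what `mockClient` intercepts. The table name `orders` and the `StoredOrder` shape are assumptions as well.

```typescript
interface StoredOrder { id: string; amount: number; }

// Hypothetical seam standing in for DynamoDBDocumentClient.send(new GetCommand(...)).
type GetItemFn = (table: string, key: { id: string }) => Promise<{ Item?: Record<string, unknown> }>;

async function getOrder(getItem: GetItemFn, id: string): Promise<StoredOrder | null> {
  const { Item } = await getItem("orders", { id }); // "orders" is an assumed table name
  if (!Item) return null;                            // a missing item maps to null, not an exception
  // Convert the raw item into the domain shape, rejecting malformed rows.
  const { id: rawId, amount } = Item;
  if (typeof rawId !== "string" || typeof amount !== "number") {
    throw new TypeError(`malformed order item for id ${id}`);
  }
  return { id: rawId, amount };
}

// In-memory fake that plays the role of the mocked client in this sketch.
const fakeGetItem: GetItemFn = async (_table, key) =>
  key.id === "o1" ? { Item: { id: "o1", amount: 1200 } } : {};
```

Whichever seam is used, the point is the same: the unit test covers the mapping and the null case, and says nothing about whether the real table or permissions exist.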

2.2 Python: moto

In Python, use moto (the library the official testing guide names). Its @mock_aws decorator mocks boto3 calls in memory.

# test_repo.py: mock DynamoDB with moto (in memory, no real resources needed)
import boto3
from moto import mock_aws
from repo import get_order

@mock_aws
def test_get_order_returns_none_when_missing():
    ddb = boto3.resource("dynamodb", region_name="ap-northeast-1")
    ddb.create_table(
        TableName="orders",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    assert get_order("missing") is None

Testing with Powertools: when testing functions that use structured logging, tracing, or metrics, setting POWERTOOLS_DEV makes the Logger pretty-print its output, disables the Tracer, and stops Metrics from being emitted to standard output. Setting POWERTOOLS_SERVICE_NAME and POWERTOOLS_METRICS_NAMESPACE before the tests run keeps them from failing metrics schema validation.
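
One way to apply those variables consistently is a small setup helper. This is a sketch under assumptions: `applyPowertoolsTestEnv` is a hypothetical name, and the service name and namespace values are placeholders. It takes the env object as a parameter so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper for a test setup file: applies the Powertools variables
// named above to an env object (pass process.env in a real setup file).
function applyPowertoolsTestEnv(env: Record<string, string | undefined>): void {
  env.POWERTOOLS_DEV = "true";                        // development mode, as described above
  env.POWERTOOLS_SERVICE_NAME ??= "orders-test";      // assumed placeholder; kept if already set
  env.POWERTOOLS_METRICS_NAMESPACE ??= "OrdersTest";  // assumed placeholder; kept if already set
}
```

In a vitest project, one place to call it would be a module listed under `setupFiles`, as `applyPowertoolsTestEnv(process.env)`, so the variables exist before the code under test constructs its Logger or Metrics instances.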

A note on mocks: as the official docs say, a mock can't verify IAM permissions, quotas, or service settings. Treat unit tests as covering logic correctness only, and ensure the correctness of settings with integration tests in the cloud (Section 4).


3. Local verification: where sam local is useful and its limits

For a quick run on your own machine, use AWS SAM's local commands. Learn the three by role (all of them require Docker).

| Command | What it does | Default port |
|---|---|---|
| sam local invoke | Run the function once locally (inject the event with --event) | - |
| sam local start-api | Serve the function as an API on a local HTTP server | 3000 |
| sam local start-lambda | A local endpoint emulating the Lambda service (call it from the CLI or an SDK) | 3001 |

# Run once locally with a generated event to check input and output quickly
sam local generate-event apigateway http-api-proxy > events/http.json
sam local invoke OrdersFunction --event events/http.json

Understanding the limits correctly is the point of this chapter. The official docs state plainly: "local testing is good for rapid development and pre-deploy checks, but it can't verify permissions, etc., between cloud resources. Test in the cloud as much as possible." When a function run under sam local calls AWS, the invocation is not issued by the Lambda service and the function does not assume its execution role, so the run verifies nothing about IAM. Keep sam local to a quick check of logic, and ensure production quality in the cloud.


4. Integration/E2E: in the cloud, with disposable stacks

In the end, serverless quality can't be ensured without testing against real services in the cloud. This section turns the officially recommended pattern into an implementation.

4.1 Stand up a disposable stack per branch/PR

The official docs make it a best practice to "create uniquely named resources in each stack" and "split stacks at the boundaries of code branches." In CI, deploy → test → destroy a stack with a unique prefix per PR.

# Integration-test CI (excerpt): stand up a disposable stack per PR, test against real services, always tear it down
jobs:
  integration:
    runs-on: ubuntu-latest
    permissions: { id-token: write, contents: read } # keyless via OIDC (see the deployment article)
    env:
      STACK: orders-pr-${{ github.event.number }}    # block style: "${{ }}" is not valid inside a YAML flow mapping
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v6
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: ap-northeast-1
      - uses: aws-actions/setup-sam@v2
      - run: sam build && sam deploy --stack-name "$STACK" --no-confirm-changeset --resolve-s3
      - run: npm run test:integration          # verify against the real API, DB, and queue
      - if: always()                            # delete the stack whether the tests pass or fail (cost and pollution control)
        run: sam delete --stack-name "$STACK" --no-prompts
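
The workflow above keys the stack on the PR number, which is already a safe name. If stacks are split by branch instead, the branch name has to be sanitized first, because CloudFormation stack names allow only letters, digits, and hyphens, must start with a letter, and are limited to 128 characters. A minimal sketch, where `stackNameFor` is a hypothetical helper and the prefix is assumed to be valid and to start with a letter:

```typescript
// Derive a CloudFormation-safe stack name from a prefix and a git branch name.
// Assumes `prefix` itself is already valid and starts with a letter.
function stackNameFor(prefix: string, branch: string): string {
  const slug = branch
    .replace(/[^A-Za-z0-9]+/g, "-") // collapse every run of disallowed characters into one hyphen
    .replace(/^-+|-+$/g, "");       // trim hyphens left at either end
  const name = slug ? `${prefix}-${slug}` : prefix;
  return name.slice(0, 128).replace(/-+$/, ""); // enforce the length limit without a trailing hyphen
}
```

For example, `stackNameFor("orders", "feature/ABC_123")` yields `orders-feature-ABC-123`. Truncation means two very long branch names could collide, so a short hash suffix may be worth adding if that is a realistic risk in your repository.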

4.2 For async, verify the "side effect" by polling

Event-driven flows (SQS, EventBridge, Streams) return no value to the caller, so observe the downstream side effect and assert on it. In the official Arrange-Act-Assert framing: trigger the function (Act), then fetch the result from the destination (SQS or DynamoDB) and verify it (Assert).

// Integration test for an async flow: send the event, then poll until the downstream side effect appears
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

async function waitForItem(table: string, id: string, timeoutMs = 30_000) {
  const ddb = new DynamoDBClient({});
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { Item } = await ddb.send(new GetItemCommand({ TableName: table, Key: { id: { S: id } } }));
    if (Item) return Item;                        // the side effect appeared, so the function processed the event
    await new Promise((r) => setTimeout(r, 1000)); // poll at 1-second intervals
  }
  throw new Error(`item ${id} did not appear within ${timeoutMs}ms`); // fail with a clear message
}
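
The same loop tends to be needed for every destination (a queue, another table, a bucket), so it can help to separate the polling from the probe. The following is a sketch of that generalization, not part of the original code: `pollUntil` and `PollOptions` are hypothetical names, and the probe is any async function that returns `undefined` until the side effect is visible.

```typescript
interface PollOptions { timeoutMs: number; intervalMs: number; }

// Generic form of waitForItem: call probe() until it yields a value or the deadline passes.
async function pollUntil<T>(
  probe: () => Promise<T | undefined>,
  { timeoutMs, intervalMs }: PollOptions,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  do {
    attempts += 1;
    const value = await probe();
    if (value !== undefined) return value; // side effect observed
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  } while (Date.now() < deadline);
  throw new Error(`condition not met after ${attempts} attempts in ${timeoutMs}ms`);
}
```

With this shape, waitForItem reduces to a probe that sends the GetItemCommand and returns `Item`, and the polling logic itself can be unit-tested with a fake probe and a short interval, without touching AWS.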

E2E tests use no mocks (per the official guidance). Exercise API Gateway → Lambda → DynamoDB end to end, so that each service's settings and IAM permissions are verified too. This is the only structural defense against "passes locally, fails in production."


5. The serverless test pyramid: rearrange the weighting

Where an ordinary app favors "more unit, less E2E," serverless shifts the weight toward integration (the official docs say plainly to "focus on integration tests").

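        ▲        E2E (few, no mocks)
       ╱ ╲         Major user flows end to end. Verifies settings, IAM, and cross-service integration
      ╱   ╲
     ╱     ╲     Integration (many)  ← serverless is thick here
    ╱       ╲      Functions × real services (DB, queue, API GW). Only here can you tell whether settings are correct
   ╱─────────╲
  ╱           ╲  Unit (most numerous, fastest, cheapest)
 ╱─────────────╲   Pure logic + SDK mocks. Does not verify IAM or quotas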

Design guidelines:

  • Unit: business logic (pure functions) and a thin data layer with SDK mocks. Fast, in bulk, thick on boundary values.
  • Integration: connect functions to real services and verify settings, permissions, event shape. In serverless, this is the main battleground of quality.
  • E2E: major flows end-to-end, no mocks. Keep the number narrow.
  • Local (sam local): not a "layer" of the pyramid but a means of quick feedback during development. Don't make it a substitute for quality assurance.

6. Conclusion: Lambda testing-strategy cheat sheet

  • The top priority is testing in the cloud (official). Mocks and emulators can't reproduce IAM, quotas, settings, or differences from the latest API. That gap is what lies behind "passes on paper, fails in production."
  • Unit: keep the handler a thin adapter, carve the business logic into pure functions, and test it heavily. Generate realistic events with sam local generate-event.
  • SDK mock: TypeScript = aws-sdk-client-mock (recommended by the AWS SDK for JS team), Python = moto (@mock_aws, named in the official guide). Logic only; mocks don't verify settings.
  • sam local: invoke, start-api (3000), and start-lambda (3001) all require Docker. They can't verify permissions, so use them sparingly.
  • Integration/E2E: stand up a per-branch/PR disposable stack → verify with real services → always destroy. For async, poll the side effect. E2E forbids mocks.
  • The pyramid: serverless is heavier on integration. Settings and IAM can only be protected there.

On the payment platform, I ensured 0 double charges in production not by prayer but by tests. I heavily unit-tested the idempotency logic as pure functions, verified the idempotent SQS→Lambda→DynamoDB side effects with integration tests on a disposable stack, and protected the payment flow end to end with E2E tests. This layered structure is the foundation for continuing to change, with confidence, a payment platform that cannot be allowed to stop. For quality-gate design in AI-driven development, including tests, see also how to make quality gates.

"I want to test our serverless system in a way that reliably works in production, not just on paper." From designing the test pyramid through disposable-stack CI and async side-effect verification, I work alongside you at the speed of one person × generative AI (Claude Code). Feel free to get in touch, even if you only want a diagnosis of your current test strategy to start.


References (official documentation)

友田 陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
