"My Lambda tests all pass locally, but deploy to production and they fail on permission errors or differences in event shape" — the most common disappointment in serverless testing. The cause is clear: much of Lambda's logic lives not in the 'code' but in the 'cloud's service settings (IAM, event sources, quotas).' A local mock can't reproduce that crucial part.
This article is a strategy guide to testing AWS Lambda at production quality. Starting from AWS's official testing guidance, it covers, end to end, unit tests of thin handlers, SDK mocking, where sam local is useful and where it stops, and integration/E2E tests in the cloud. As a running example, I draw on the testing discipline behind the serverless payment platform I built as a core developer (0 double charges in production). Lambda's execution model itself is left to the sister article, the AWS Lambda production-operations guide; this piece concentrates on the single question of how to test.
Ground rules for this article: the definitions, guidance, and tool names are based on the official AWS documentation (the Lambda testing guide and others, as of June 2026) and each tool's official page. Tool versions and APIs change, so always confirm against the official docs (see "References" at the end) before relying on them in production.
0. Mental model: prioritize "testing in the cloud" above all
Start with the most important message in AWS's official guidance, because it determines how serverless tests should be designed.
- The official guidance: prioritize testing in the cloud. The official docs state plainly that "cloud-based testing most accurately measures the quality of your functions and applications" and that "only the cloud can comprehensively test security policies, service settings, quotas, and even the latest API signatures."
- The limits of mocks: "tests using mocks can pass on paper but fail in the cloud. The results may not match the latest API, and service settings and quotas can't be tested."
- The limits of emulators: "tests relying on an emulator can succeed locally but fail in the cloud (due to production security policies, inter-service settings, quota overruns)," so "use emulators sparingly."
- So the weighting changes: give integration tests more weight than you would in an ordinary app, because much of the logic lives in service settings (IAM, event source mappings). "The managed service itself needs no testing, but the integration with it needs testing."
Design the three layers that the official docs define, in this order.
| Layer | Official definition | Example |
|---|---|---|
| unit | A test against an isolated code block | Verifying the business logic of shipping-fee calculation |
| integration | The interaction of two or more components/services (usually the cloud) | Verifying that a function processes a queue event |
| E2E | The behavior of the whole app | An event flows between services and an order is recorded |
1. Unit test: a thin handler + pure logic
The official testability guidance fits in one line: "make the handler a thin adapter that receives the event and passes only the details the business logic needs." This way, you can pin down the business logic with ordinary unit tests, without dealing with anything Lambda-specific.
// domain.ts: pure logic that imports nothing from Lambda. The fastest, cheapest way to run many cases
export interface Order { id: string; amount: number; destination: "domestic" | "international"; }

export function deliveryFee(order: Order): number {
  if (order.amount < 0) throw new RangeError("amount must be non-negative"); // validate at the boundary
  const base = order.destination === "international" ? 2000 : 500;
  return order.amount >= 10_000 ? 0 : base; // free shipping at 10,000 yen and above
}

// handler.ts: a thin adapter. It only extracts and validates the event; decisions belong to the domain
import type { APIGatewayProxyEventV2 } from "aws-lambda";
import { deliveryFee, type Order } from "./domain";

export const handler = async (event: APIGatewayProxyEventV2) => {
  const invalid = { statusCode: 422, body: JSON.stringify({ message: "invalid order" }) };
  let body: Partial<Order>;
  try {
    body = (JSON.parse(event.body ?? "{}") ?? {}) as Partial<Order>;
  } catch {
    return invalid; // malformed JSON is a client error, not an unhandled 500
  }
  // Reject negative amounts here so deliveryFee's RangeError never escapes as a 500
  if (!body.id || typeof body.amount !== "number" || body.amount < 0 || !body.destination) {
    return invalid;
  }
  return { statusCode: 200, body: JSON.stringify({ fee: deliveryFee(body as Order) }) };
};

// domain.test.ts: pure functions need neither the cloud nor Docker. Test boundary values heavily
import { describe, it, expect } from "vitest";
import { deliveryFee } from "./domain";

describe("deliveryFee", () => {
  it("charges 500 yen domestic and 2000 yen international", () => {
    expect(deliveryFee({ id: "a", amount: 100, destination: "domestic" })).toBe(500);
    expect(deliveryFee({ id: "b", amount: 100, destination: "international" })).toBe(2000);
  });
  it("ships free at 10,000 yen and above", () => {
    expect(deliveryFee({ id: "c", amount: 10_000, destination: "international" })).toBe(0);
    expect(deliveryFee({ id: "c0", amount: 9_999, destination: "international" })).toBe(2000); // just below the threshold
  });
  it("rejects a negative amount", () => {
    expect(() => deliveryFee({ id: "d", amount: -1, destination: "domestic" })).toThrow(RangeError);
  });
});
Hit it with realistic events: to unit-test the handler itself, use sam local generate-event, which generates sample events close to the real thing (API Gateway, S3, SQS, and so on). Hand-written events easily end up with the wrong shape, so base them on the generated output.
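As an illustration of the tip above, here is a minimal sketch of unit-testing a thin handler against a trimmed HTTP API event. The `HttpEvent` type and the `makeEvent` factory are hypothetical names introduced for this example, and the domain function and handler are condensed copies inlined so the sketch runs on its own. In a real project the base event would more likely be loaded from the JSON produced by sam local generate-event and typed as APIGatewayProxyEventV2.

```typescript
// Minimal stand-in for the fields of an HTTP API (payload v2) event that the handler reads.
// Assumption: a real test would start from the full JSON emitted by `sam local generate-event`.
interface HttpEvent {
  version: "2.0";
  routeKey: string;
  rawPath: string;
  headers: Record<string, string>;
  body?: string;
  isBase64Encoded: boolean;
}

interface Order { id: string; amount: number; destination: "domestic" | "international"; }

// Condensed copies of the article's domain function and thin handler.
function deliveryFee(order: Order): number {
  if (order.amount < 0) throw new RangeError("amount must be non-negative");
  const base = order.destination === "international" ? 2000 : 500;
  return order.amount >= 10_000 ? 0 : base;
}

async function handler(event: HttpEvent): Promise<{ statusCode: number; body: string }> {
  const body = JSON.parse(event.body ?? "{}") as Partial<Order>;
  if (!body.id || typeof body.amount !== "number" || !body.destination) {
    return { statusCode: 422, body: JSON.stringify({ message: "invalid order" }) };
  }
  return { statusCode: 200, body: JSON.stringify({ fee: deliveryFee(body as Order) }) };
}

// Hypothetical factory: start from one known-good event and override only what a test cares about.
function makeEvent(payload: unknown, overrides: Partial<HttpEvent> = {}): HttpEvent {
  return {
    version: "2.0",
    routeKey: "POST /orders",
    rawPath: "/orders",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(payload),
    isBase64Encoded: false,
    ...overrides,
  };
}
```

A test would then call `await handler(makeEvent({ id: "a", amount: 100, destination: "domestic" }))` and assert on `statusCode` and the parsed `body`. Keeping the event shape in one factory means a change in the generated sample only has to be applied in one place.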
2. SDK mocking: cut dependencies to keep unit tests fast
Outside the business logic, you sometimes want to unit-test a thin layer that includes SDK calls, such as DynamoDB access. Mock the SDK here: because nothing connects to real AWS, the tests are fast and deterministic.
2.1 TypeScript: aws-sdk-client-mock
The standard library to mock AWS SDK for JavaScript v3 clients is aws-sdk-client-mock (recommended by the AWS SDK for JavaScript team). Wrap with mockClient() and define responses with .on(Command).resolves(...).
// repo.test.ts: mock DynamoDBDocumentClient and unit-test without connecting to real AWS
import { mockClient } from "aws-sdk-client-mock";
import { DynamoDBDocumentClient, GetCommand } from "@aws-sdk/lib-dynamodb";
import { it, expect, beforeEach } from "vitest";
import { getOrder } from "./repo";

const ddbMock = mockClient(DynamoDBDocumentClient);
beforeEach(() => ddbMock.reset()); // clear stubbed responses and recorded calls between tests

it("returns the order converted to a domain object when found", async () => {
  ddbMock.on(GetCommand).resolves({ Item: { id: "o1", amount: 1200 } });
  await expect(getOrder("o1")).resolves.toEqual({ id: "o1", amount: 1200 });
});

it("returns null when not found", async () => {
  ddbMock.on(GetCommand).resolves({ Item: undefined });
  await expect(getOrder("missing")).resolves.toBeNull();
});
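The tests above import getOrder from ./repo, which the article does not show. Below is a minimal sketch of what such a repository function might look like. Everything in it is assumed for illustration: the `StoredOrder` shape, the `GetItemFn` parameter, and the table name. To keep the sketch runnable without the AWS SDK installed, the DynamoDB call is passed in as a function, which is a different technique from aws-sdk-client-mock. In the article's setup the call would instead be a DynamoDBDocumentClient sending a GetCommand, which is what the mock intercepts.

```typescript
// Hypothetical domain shape returned by the repository (matches the fields used in the tests above).
interface StoredOrder { id: string; amount: number; }

// Stand-in for `docClient.send(new GetCommand({ TableName, Key }))`, which resolves to `{ Item? }`.
type GetItemFn = (input: { TableName: string; Key: Record<string, unknown> }) =>
  Promise<{ Item?: Record<string, unknown> }>;

const TABLE_NAME = "orders"; // assumption: the article does not state the table name

// Fetches one order and converts the raw item into a domain object; returns null when absent.
async function getOrder(getItem: GetItemFn, id: string): Promise<StoredOrder | null> {
  const { Item } = await getItem({ TableName: TABLE_NAME, Key: { id } });
  if (!Item) return null;
  const rawId = Item.id;
  const amount = Item.amount;
  // Validate at the boundary instead of trusting whatever shape is stored.
  if (typeof rawId !== "string" || typeof amount !== "number") {
    throw new TypeError(`malformed order item for id ${id}`);
  }
  return { id: rawId, amount };
}
```

The design point is the same in either style: the repository stays thin (one read plus a shape check), so the mock only has to stand in for a single call and the test asserts on the conversion logic rather than on DynamoDB itself.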
2.2 Python: moto
In Python, use moto, which the official testing guide names. Its @mock_aws decorator mocks boto3 calls in memory.
# test_repo.py: mock DynamoDB with moto (in memory, no real resources needed)
import boto3
from moto import mock_aws

from repo import get_order


@mock_aws
def test_get_order_returns_none_when_missing():
    # Assumes repo.get_order reads the "orders" table in this same region
    ddb = boto3.resource("dynamodb", region_name="ap-northeast-1")
    ddb.create_table(
        TableName="orders",
        KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    assert get_order("missing") is None
Testing Powertools: when testing functions that use structured logging, tracing, or metrics, setting POWERTOOLS_DEV makes the Logger format its output for readability, disables the Tracer, and stops Metrics from being emitted to standard output. Setting POWERTOOLS_SERVICE_NAME and POWERTOOLS_METRICS_NAMESPACE before the test prevents failures in metrics schema validation.
A note on mocks: as the official docs say, a mock cannot verify IAM permissions, quotas, or service settings. Treat unit tests as covering logic correctness only, and verify the correctness of settings with integration tests in the cloud (chapter 4).
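The Powertools environment variables mentioned above can be applied once, in a test setup file. The following is a minimal sketch: the helper name `applyPowertoolsTestEnv` and the values "orders-test" and "OrdersTest" are placeholders invented for this example, and only the three variable names come from the article.

```typescript
// Applies the Powertools-related environment variables used during tests.
// The default values below are placeholders, not values taken from the article.
function applyPowertoolsTestEnv(env: Record<string, string | undefined>): void {
  env.POWERTOOLS_DEV = "true";
  // Keep values that the CI environment already provides; otherwise fall back to test defaults.
  env.POWERTOOLS_SERVICE_NAME ??= "orders-test";
  env.POWERTOOLS_METRICS_NAMESPACE ??= "OrdersTest";
}
```

In a Vitest setup file this would presumably be called as `applyPowertoolsTestEnv(process.env)` before the module that constructs the Logger or Metrics instance is imported, since those utilities are typically configured when they are created.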
3. Local verification: where sam local is useful and its limits
For quick runs on your own machine, use AWS SAM's local commands. Learn the three by role (all of them require Docker).
| Command | What it does | Default port |
|---|---|---|
| sam local invoke | Run the function once locally (inject the event with --event) | none |
| sam local start-api | Serve the function as an API on a local HTTP server | 3000 |
| sam local start-lambda | A local endpoint emulating the Lambda service (call it from the CLI/SDK) | 3001 |
# Run once locally with a generated event to check input and output quickly
sam local generate-event apigateway http-api-proxy > events/http.json
sam local invoke OrdersFunction --event events/http.json
Understanding the limits correctly is the point of this chapter. The official docs state it plainly: "local testing is good for rapid development and pre-deploy checks, but it can't verify things such as permissions between cloud resources. Test in the cloud as much as possible." When a function running under sam local calls AWS, the call does not come from the Lambda service and does not assume the function's execution role; it uses the credentials of your local AWS profile, so it verifies nothing about the function's IAM permissions. Keep sam local to quick logic checks, and establish production quality in the cloud.
4. Integration/E2E: in the cloud, with disposable stacks
In the end, serverless quality cannot be established without testing against real services in the cloud. The officially recommended patterns translate into an implementation as follows.
4.1 Stand up a disposable stack per branch/PR
The official docs list as best practices to "create uniquely named resources in each stack" and to "split stacks at the boundaries of code branches." In CI, that means deploying, testing, and then destroying a uniquely named stack for each PR.
# Integration-test CI (excerpt): per PR, stand up a disposable stack, test against real services, always tear it down
jobs:
  integration:
    runs-on: ubuntu-latest
    permissions: { id-token: write, contents: read } # keyless via OIDC (see the deployment article)
    env:
      # Block style on purpose: the braces in ${{ }} break a YAML flow mapping
      STACK: orders-pr-${{ github.event.number }}
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v6
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: ap-northeast-1
      - uses: aws-actions/setup-sam@v2
      - run: sam build && sam deploy --stack-name "$STACK" --no-confirm-changeset --resolve-s3
      - run: npm run test:integration # verify against the real API/DB/queue
      - if: always() # destroy the stack whether the tests pass or fail (avoids cost and leftover state)
        run: sam delete --stack-name "$STACK" --no-prompts
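The workflow above keys the stack name on the PR number. When the key is a branch name instead, the name has to be made safe first, because CloudFormation stack names may contain only letters, digits, and hyphens, must start with a letter, and are limited to 128 characters. A minimal sketch of such a helper follows; the function name `stackNameFor` and the "orders" prefix in the usage are illustrative assumptions, not part of SAM or the article's pipeline.

```typescript
// Builds a CloudFormation-safe, per-branch stack name such as "orders-feature-add-refund".
// The helper and its parameter names are hypothetical.
function stackNameFor(prefix: string, branch: string): string {
  if (!/^[A-Za-z][A-Za-z0-9-]*$/.test(prefix)) {
    throw new Error("prefix must start with a letter and contain only letters, digits, hyphens");
  }
  const slug = branch
    .replace(/[^A-Za-z0-9]+/g, "-") // anything outside [A-Za-z0-9] becomes a single hyphen
    .replace(/^-+|-+$/g, "");       // trim hyphens left at either end
  if (slug.length === 0) throw new Error(`branch "${branch}" has no usable characters`);
  // Enforce the 128-character limit; trim again in case the cut lands on a hyphen.
  return `${prefix}-${slug}`.slice(0, 128).replace(/-+$/, "");
}
```

For example, `stackNameFor("orders", "feature/add_refund")` yields `orders-feature-add-refund`. Note that truncation can make two very long branch names collide, so a short hash suffix may be worth adding if branch names in your repository run long.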
4.2 For async, verify the "side effect" by polling
Event-driven flows (SQS/EventBridge/Streams) have no return value, so observe the downstream side effect and assert on it. In the official Arrange-Act-Assert terms: trigger the function (Act), then fetch the result from the destination (SQS/DynamoDB) and verify it (Assert).
// Integration test for an async flow: send the event, then poll until the downstream side effect appears
import { DynamoDBClient, GetItemCommand } from "@aws-sdk/client-dynamodb";

async function waitForItem(table: string, id: string, timeoutMs = 30_000) {
  const ddb = new DynamoDBClient({});
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const { Item } = await ddb.send(new GetItemCommand({ TableName: table, Key: { id: { S: id } } }));
    if (Item) return Item; // the side effect appeared, so the function processed the event
    await new Promise((r) => setTimeout(r, 1000)); // poll at 1-second intervals
  }
  throw new Error(`item ${id} did not appear within ${timeoutMs}ms`); // fail with a clear message
}
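The same wait loop can be factored into a small generic helper, so that the polling logic itself is unit-testable without AWS and reusable for other destinations such as an SQS queue. This is a minimal sketch; `pollUntil` and its option names are hypothetical and not part of any AWS library.

```typescript
interface PollOptions { timeoutMs?: number; intervalMs?: number; }

// Calls `probe` repeatedly until it returns a value other than undefined/null,
// or until the next attempt would fall past the deadline.
async function pollUntil<T>(
  probe: () => Promise<T | undefined | null>,
  { timeoutMs = 30_000, intervalMs = 1_000 }: PollOptions = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let attempts = 0;
  while (true) {
    attempts += 1;
    const result = await probe();
    if (result !== undefined && result !== null) return result; // side effect observed
    if (Date.now() + intervalMs > deadline) break;              // no time left for another attempt
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`condition not met within ${timeoutMs}ms after ${attempts} attempts`);
}
```

With such a helper, waitForItem reduces to a probe that sends the GetItemCommand and returns `Item`. Unlike the loop above, this version always probes at least once, which avoids a false timeout when the test runner is slow to reach the first check.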
E2E tests use no mocks (official guidance). Exercise API Gateway → Lambda → DynamoDB end to end, so that each service's settings and IAM permissions are verified as well. This is the only way to eliminate "passes locally, fails in production" at the structural level.
5. The serverless test pyramid: rearrange the weighting
Take the ordinary app's "many unit tests, few E2E tests" and shift the weight toward integration for serverless (the official docs state plainly: "focus on integration tests").
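        ▲              E2E (few, no mocks)
       ╱ ╲             Major user flows end to end. Verifies settings, IAM, and inter-service integration
      ╱   ╲
     ╱integ╲           ← integration: this layer is thick in serverless
    ╱  more ╲          Function × real services (DB/queue/API GW). Only here can settings be proven correct
   ╱─────────╲
  ╱   unit    ╲        Most numerous, fastest, cheapest. Pure logic + SDK mocks. Does not verify IAM or quotas
 ╱─────────────╲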
Design guidelines:
- Unit: business logic (pure functions) and a thin data layer with SDK mocks. Fast, numerous, and heavy on boundary values.
- Integration: connect functions to real services and verify settings, permissions, and event shape. In serverless, this is where quality is won or lost.
- E2E: major flows end to end, no mocks. Keep the number small.
- Local (sam local): not a layer of the pyramid but a means of quick feedback during development. Do not treat it as a substitute for quality assurance.
6. Conclusion: Lambda testing-strategy cheat sheet
- The top priority is testing in the cloud (official). Mocks and emulators cannot reproduce IAM, quotas, settings, or differences in the latest APIs. That is what really lies behind "passes on paper, fails in production."
- Unit: the handler is a thin adapter; business logic is carved out into pure functions and tested heavily. Get realistic events from sam local generate-event.
- SDK mock: TypeScript = aws-sdk-client-mock (recommended by the AWS SDK for JS team), Python = moto (@mock_aws, named in the official guide). Logic only; it does not verify settings.
- sam local: invoke / start-api (3000) / start-lambda (3001) all require Docker. Since they cannot verify permissions, use them sparingly.
- Integration/E2E: stand up a disposable stack per branch/PR → verify with real services → always destroy it. For async flows, poll for the side effect. E2E forbids mocks.
- The pyramid: serverless is heavier on integration. Settings and IAM can only be protected there.
On the payment platform, I secured 0 double charges in production with tests, not with prayer. I unit-tested the idempotency logic heavily as pure functions, verified the idempotent SQS→Lambda→DynamoDB side effects with integration tests on a disposable stack, and protected the payment flow end to end with E2E tests. This layered structure is the foundation for continuing to change, with confidence, a payment platform that must never stop. For quality-gate design in AI-driven development, including tests, see also how to make quality gates.
"I want to test our serverless system so that it reliably works in production, not just on paper." From designing the test pyramid through disposable-stack CI and async side-effect verification, I work alongside you at the speed of one person × generative AI (Claude Code). Feel free to get in touch, even if you only want to start with a diagnosis of your current test strategy.
References (official documentation)
- Testing serverless functions and applications: definitions of unit/integration/E2E, cloud-first, emulators sparingly, focus on integration
- aws-samples/serverless-test-samples: per-language examples (official reference)
- aws-sdk-client-mock: mocking AWS SDK for JavaScript v3 (mockClient / .on / .resolves)
- moto: AWS mock for Python (@mock_aws)
- sam local invoke / start-api / start-lambda: the three local-verification commands and their limits
- sam local generate-event: sample-event generation
- Powertools environment variables: POWERTOOLS_DEV and others, behavior during tests