友田 陽大
Procurement, in-house & cost
Payments
Stripe
Reliability
Contract development
Procurement
Architecture design

How to build a payment system that prevents double charges, and a procurement checklist: guaranteeing 'correctness' structurally with idempotency and atomicity

This article explains how to prevent double charges and balance inconsistencies in payment and billing systems through the structure of the code rather than through operational care. It covers the core of payment reliability (idempotency keys, atomic transactions, and webhook deduplication) and a checklist of what buyers should demand of vendors, drawn from a real payments platform that has had zero double charges in production.

Reading time: 9 min read
Author: 友田 陽大

First, the most important point about payment systems: double charges and balance inconsistencies cannot be prevented by "operating carefully." You prevent them only by using the structure of the code, namely idempotency and atomicity, to make the incident impossible. Networks drop, requests get duplicated, and webhooks arrive more than once and out of order. Only a system designed on that premise keeps running correctly in production without halting payments.

This article is based on a serverless payments platform for which I designed and led the payment-reliability layer, and which has had zero double charges and zero balance inconsistencies in production. It summarizes (1) why double charges happen, (2) how to prevent them structurally, and (3) what buyers should demand of vendors. It is meant both for those commissioning a system that handles payments and for those implementing one.


1. Why double charges happen

Double charges are not caused by developer carelessness. They follow from the nature of distributed systems.

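[Phone] ──payment request──> [Server]
   │                            │ charge processed (success)
   │  <────── timeout ───────   │ ← the connection drops before the response returns
   │
   │  assumes "it may have failed" and resends
   │ ──payment request (retry)─> [Server]
   │                            │ charges again ← double charge!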

A mobile-line timeout, an automatic retry by API Gateway or Lambda, a user tapping again because nothing came back: all of these are normal behavior. The problem is that a server that does not expect the same payment request to arrive more than once will charge every time.

In other words, you can't (and shouldn't) stop the retries themselves. What you should do is design so that "no matter how many times it arrives, the charge converges to once." This is idempotency.
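To make "converges to once" concrete, here is a minimal in-memory sketch. The `NaiveServer` and `IdempotentServer` classes and the `send_with_retry` helper are hypothetical names introduced only for illustration, and a real system would keep the set of seen keys in a durable store rather than in process memory.

```python
class NaiveServer:
    """Charges every time a request arrives."""

    def __init__(self) -> None:
        self.charges: list[int] = []

    def pay(self, idempotency_key: str, amount: int) -> str:
        self.charges.append(amount)  # no duplicate check at all
        return "charged"


class IdempotentServer:
    """Charges only the first time a given key is seen."""

    def __init__(self) -> None:
        self.charges: list[int] = []
        self.seen: dict[str, str] = {}

    def pay(self, idempotency_key: str, amount: int) -> str:
        if idempotency_key in self.seen:
            return self.seen[idempotency_key]  # replay the stored result
        self.charges.append(amount)
        self.seen[idempotency_key] = "charged"
        return "charged"


def send_with_retry(server, key: str, amount: int, attempts: int) -> None:
    # Simulates a client whose responses are lost, so it resends the same request.
    for _ in range(attempts):
        server.pay(key, amount)


naive, safe = NaiveServer(), IdempotentServer()
send_with_retry(naive, "order-123", 1000, attempts=3)
send_with_retry(safe, "order-123", 1000, attempts=3)
print(len(naive.charges), len(safe.charges))  # prints "3 1"
```

The client behaves identically in both runs. Only the server's handling of the repeated key changes the number of charges.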


2. The first pillar: converge to "once" with an idempotency key

Idempotency is the property that "no matter how many times you execute the same operation, the result is the same as executing it once." In payments, you achieve it with a unique idempotency key issued by the client.

The idea is simple. Check "is this key's processing the first time?" with a conditional write, and process only the first time.

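import time

import boto3
from botocore.exceptions import ClientError

NINETY_DAYS = 90 * 24 * 60 * 60  # TTL in seconds
table = boto3.resource("dynamodb").Table("idempotency_keys")  # table name is an assumption

def already_processed(idempotency_key: str) -> bool:
    try:
        # Write only when the key does not exist yet (conditional insert)
        table.put_item(
            Item={"idem_key": idempotency_key, "ttl": int(time.time()) + NINETY_DAYS},
            ConditionExpression="attribute_not_exists(idem_key)",
        )
        return False  # first time: go on to process
    except ClientError as e:
        if e.response["Error"]["Code"] != "ConditionalCheckFailedException":
            raise     # any other error is a real failure
        return True   # already processed: skip (prevents a double charge)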

The function above shows the conditional write in isolation. The point is to execute this duplicate check and the actual charge atomically within the same transaction. In the payments platform I worked on, I bundled the idempotency-key insertion, the balance update, and the transaction-history record into a single TransactWriteItems call (DynamoDB's atomic transaction). With this, half-finished states such as "the duplicate check passed but the charge failed" or "the charge happened but no record remained" cannot occur in principle.

The crux of the design: idempotency is not a "feature added later" but the skeleton of payment processing. If deduplication and the main processing are split across separate transactions, an incident happens in that gap. Bundling them into the same transaction was the core of 0 double charges in production.
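As a sketch of what such a bundled transaction could look like, the function below builds the three items as plain dictionaries in the shape DynamoDB's TransactWriteItems expects. The table names (`idempotency_keys`, `accounts`, `transaction_history`), the attribute names, and the function itself are assumptions for illustration, not the platform's actual schema.

```python
def build_charge_transaction(idempotency_key: str, account_id: str,
                             amount: int, now_epoch: int) -> list[dict]:
    """Return TransactItems for one charge: marker + balance update + history."""
    ninety_days = 90 * 24 * 60 * 60
    return [
        {   # 1) idempotency marker: fails if this key was already written
            "Put": {
                "TableName": "idempotency_keys",
                "Item": {
                    "idem_key": {"S": idempotency_key},
                    "ttl": {"N": str(now_epoch + ninety_days)},
                },
                "ConditionExpression": "attribute_not_exists(idem_key)",
            }
        },
        {   # 2) balance update: atomic ADD, rejected when funds are insufficient
            "Update": {
                "TableName": "accounts",
                "Key": {"account_id": {"S": account_id}},
                "UpdateExpression": "ADD balance :delta",
                "ConditionExpression": "balance >= :amount",
                "ExpressionAttributeValues": {
                    ":delta": {"N": str(-amount)},
                    ":amount": {"N": str(amount)},
                },
            }
        },
        {   # 3) history record, keyed by the same idempotency key
            "Put": {
                "TableName": "transaction_history",
                "Item": {
                    "tx_id": {"S": idempotency_key},
                    "account_id": {"S": account_id},
                    "amount": {"N": str(amount)},
                    "created_at": {"N": str(now_epoch)},
                },
            }
        },
    ]


items = build_charge_transaction("order-123", "acct-1", 800, now_epoch=1_700_000_000)
# With boto3 this list would be passed as:
#   boto3.client("dynamodb").transact_write_items(TransactItems=items)
```

Because all three items succeed or fail together, a retry with the same key fails on the first condition and leaves the balance and history untouched.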


3. The second pillar: win against "races" with atomic balance updates

The incident that goes hand in hand with double charges is balance inconsistency. When a payment, a top-up, and a refund run simultaneously against the same account, a typical "read → compute → write back" process causes a race.

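Time t1: Process A reads the balance (1000 yen)
Time t2: Process B reads the balance (1000 yen)  ← both see 1000 yen
Time t3: Process A subtracts 800 yen and writes (200 yen)
Time t4: Process B subtracts 800 yen and writes (200 yen)  ← 1600 yen was spent, so it should have been -600 yen!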

To prevent this, design it not to "read then write" but to conditionally increment/decrement with the database's atomic operation.

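import boto3

client = boto3.client("dynamodb")

def debit(account_id: str, amount: int) -> None:
    # Replace read-modify-write with an atomic ADD plus a condition expression
    client.transact_write_items(TransactItems=[
        {
            "Update": {
                "TableName": "accounts",  # table name is an assumption
                "Key": {"account_id": {"S": account_id}},
                "UpdateExpression": "ADD balance :delta",      # atomic increment/decrement
                "ConditionExpression": "balance >= :amount",   # fails on insufficient balance
                "ExpressionAttributeValues": {
                    ":delta": {"N": str(-amount)},
                    ":amount": {"N": str(amount)},
                },
            }
        },
        # ...include the idempotency marker and the history record in the same transaction
    ])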

By expressing a balance floor (balance ≥ amount) or a top-up ceiling (balance + amount ≤ limit) as a condition expression, you structurally rule out, at design time, a balance that goes negative or is deducted twice under a race.
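The effect of the floor and ceiling conditions can be seen without a database. In the sketch below a `threading.Lock` stands in for the database's atomic conditional update, and the `Account` class and its method names are hypothetical. Two concurrent 800-yen debits run against a 1000-yen balance, as in the timeline above.

```python
import threading


class InsufficientBalance(Exception):
    pass


class Account:
    """In-memory stand-in for a store that supports conditional atomic updates."""

    def __init__(self, balance: int, limit: int) -> None:
        self.balance = balance
        self.limit = limit
        self._lock = threading.Lock()

    def debit(self, amount: int) -> None:
        # Check and update happen as one indivisible step (balance >= amount).
        with self._lock:
            if self.balance < amount:
                raise InsufficientBalance(f"balance {self.balance} < {amount}")
            self.balance -= amount

    def top_up(self, amount: int) -> None:
        # Ceiling condition: balance + amount <= limit.
        with self._lock:
            if self.balance + amount > self.limit:
                raise ValueError("top-up would exceed the limit")
            self.balance += amount


account = Account(balance=1000, limit=5000)
results: list[str] = []


def worker() -> None:
    try:
        account.debit(800)
        results.append("ok")
    except InsufficientBalance:
        results.append("rejected")


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results), account.balance)  # prints "['ok', 'rejected'] 200"
```

Whichever thread runs first wins. The other is rejected by the condition instead of overwriting the balance.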

How you retry also matters. Retry only transient conflicts (such as optimistic-lock conflicts) with exponential backoff, and do not retry semantic failures such as insufficient balance, where the result is the same no matter how many times you try. Without this distinction you either load the system with pointless retries or misdiagnose the cause of an error.
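One way to encode that distinction is to give the two kinds of failure different exception types and catch only the transient one. This is a minimal sketch: `TransientConflict`, `InsufficientBalance`, and `run_with_retry` are hypothetical names, and the delay values are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientConflict(Exception):
    """Another writer touched the same item; trying again may succeed."""


class InsufficientBalance(Exception):
    """A semantic failure; trying again gives the same answer."""


def run_with_retry(operation: Callable[[], T], max_attempts: int = 5,
                   base_delay: float = 0.05,
                   sleep: Callable[[float], None] = time.sleep) -> T:
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientConflict:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            # Exponential backoff with full jitter: 0 to base_delay * 2^attempt seconds.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
        # InsufficientBalance is deliberately not caught: it propagates at once.
    raise AssertionError("unreachable")


calls = {"flaky": 0, "broke": 0}


def flaky_debit() -> str:
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise TransientConflict()
    return "charged"


def broke_debit() -> str:
    calls["broke"] += 1
    raise InsufficientBalance()


no_sleep = lambda seconds: None  # skip real waiting in this demonstration
print(run_with_retry(flaky_debit, sleep=no_sleep), calls["flaky"])  # prints "charged 3"
```

Calling `run_with_retry(broke_debit)` raises `InsufficientBalance` after a single attempt, so a semantic failure reaches the caller immediately and is not masked by retries.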


4. The third pillar: recompute the amount server-side, and deduplicate webhooks

Don't receive the amount from the client

A surprisingly common vulnerability is using the amount sent from the client as-is for the payment. This is a textbook amount-tampering vulnerability where, if the user tampers with the request, they can "buy a 1000-yen product for 1 yen."

Always recompute the amount server-side from the order contents. This is the iron rule. In my own project, an early security audit detected exactly this tampering with a client-specified amount, and I changed the design so that the amount is resolved server-side.
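A minimal sketch of server-side amount resolution follows. The `CATALOG` price list, the `OrderLine` type, and `resolve_amount` are hypothetical; in practice the prices would come from a database lookup rather than a constant.

```python
from dataclasses import dataclass

# Hypothetical server-side price list, in yen. In practice this is a database lookup.
CATALOG = {"sku-tea": 1000, "sku-cup": 2500}


@dataclass(frozen=True)
class OrderLine:
    sku: str
    quantity: int


def resolve_amount(lines: list[OrderLine]) -> int:
    """Compute the amount to charge from trusted prices only."""
    if not lines:
        raise ValueError("empty order")
    total = 0
    for line in lines:
        if line.quantity <= 0:
            raise ValueError(f"invalid quantity for {line.sku}")
        if line.sku not in CATALOG:
            raise ValueError(f"unknown sku {line.sku}")
        total += CATALOG[line.sku] * line.quantity
    return total


# The client claims 1 yen; the claimed amount is never read.
request = {"lines": [{"sku": "sku-tea", "quantity": 1}], "amount": 1}
amount = resolve_amount([OrderLine(**line) for line in request["lines"]])
print(amount)  # prints "1000"
```

The request may still carry an amount for display purposes, but the value that is charged comes only from data the server controls.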

Webhooks arrive "at least once, out of order"

Webhooks from payment providers like Stripe can be delivered "at least once" and arrive out of order. That is, the same event can come multiple times, and an old event can arrive after a new one.

Against this:

  • Deduplicate resends with a unique constraint on the event ID (process the same event only once)
  • Compare the event's occurrence time to prevent the subscription state from rolling back when an old event arrives late
  • Mask the PII (card info, email, etc.) included in the raw webhook payload before saving

In my subscription-billing platform, I make webhook handling idempotent with "event-ID unique constraint + ordering guarantee by occurrence time + PII redaction," which deals with resends, order reversal, and PII retention at the same time.
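The three measures can be sketched together in memory. Everything here is illustrative: `SubscriptionStore`, `handle_event`, the event shape, and the `PII_FIELDS` names are assumptions, a Python set stands in for a database unique constraint, and in a real system the deduplication marker and the state update would be written in one transaction.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionStore:
    seen_event_ids: set[str] = field(default_factory=set)
    status: dict[str, str] = field(default_factory=dict)         # subscription -> status
    last_event_at: dict[str, int] = field(default_factory=dict)  # subscription -> epoch seconds
    raw_log: list[dict] = field(default_factory=list)


PII_FIELDS = {"email", "card_last4", "name"}  # hypothetical field names


def redact(payload: dict) -> dict:
    return {k: ("[REDACTED]" if k in PII_FIELDS else v) for k, v in payload.items()}


def handle_event(store: SubscriptionStore, event: dict) -> str:
    # 1) Deduplicate: the set stands in for a unique constraint on the event ID.
    if event["id"] in store.seen_event_ids:
        return "duplicate"
    store.seen_event_ids.add(event["id"])
    store.raw_log.append(redact(event["data"]))  # 3) store only the masked payload

    # 2) Ordering: ignore events older than the newest one already applied.
    sub = event["data"]["subscription_id"]
    if event["created"] < store.last_event_at.get(sub, -1):
        return "stale"
    store.status[sub] = event["data"]["status"]
    store.last_event_at[sub] = event["created"]
    return "applied"


store = SubscriptionStore()
canceled = {"id": "evt_2", "created": 200,
            "data": {"subscription_id": "sub_1", "status": "canceled", "email": "a@example.com"}}
active = {"id": "evt_1", "created": 100,
          "data": {"subscription_id": "sub_1", "status": "active", "email": "a@example.com"}}

# The newer event arrives first, is resent, and then the older event arrives late.
results = [handle_event(store, e) for e in (canceled, canceled, active)]
print(results, store.status["sub_1"])  # prints "['applied', 'duplicate', 'stale'] canceled"
```

Events with identical timestamps are applied in arrival order in this sketch. A production design would need an explicit tiebreaker for that case.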


5. A "payment-system procurement checklist" for buyers

The points above can be distilled into questions that buyers can use to evaluate vendors. When you commission a system that handles payments, the way a vendor answers the following questions reveals how reliable they are.

Procurement checklist

  • "How do you prevent double charges?" Can they answer in terms of structure, for example "with an idempotency key and a conditional write, the charge converges to once even on retry"? "We'll be careful" or "we'll test it" is a danger sign.
  • "How do you guarantee balance consistency?" Can they explain atomic transactions and race control with condition expressions?
  • "Can you prevent amount tampering?" Can they answer immediately that the amount is recomputed server-side?
  • "Do you handle webhook duplication and order reversal?" Have they designed event-ID deduplication and ordering guarantees?
  • "How does it behave on failure?" Can they explain where payments stop and where the system falls back to eventual consistency (the fail-closed versus fail-open judgment)?
  • "How do you test and audit?" Are there unit tests of the payment logic (written as pure functions that can be verified without a DB), and is there a security audit?
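As an illustration of the last item, payment rules written as a pure function can be tested with plain assertions and no database. The `decide_charge` function and its types are hypothetical. A sketch like this mirrors the rules for testing purposes, while the database condition expressions remain the actual enforcement point.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    CHARGE = "charge"
    SKIP_DUPLICATE = "skip_duplicate"
    REJECT_INSUFFICIENT = "reject_insufficient"
    REJECT_INVALID = "reject_invalid"


@dataclass(frozen=True)
class Decision:
    outcome: Outcome
    new_balance: int


def decide_charge(balance: int, amount: int, already_processed: bool) -> Decision:
    """Pure decision logic: no I/O, so it can be unit-tested without a database."""
    if amount <= 0:
        return Decision(Outcome.REJECT_INVALID, balance)
    if already_processed:
        return Decision(Outcome.SKIP_DUPLICATE, balance)
    if balance < amount:
        return Decision(Outcome.REJECT_INSUFFICIENT, balance)
    return Decision(Outcome.CHARGE, balance - amount)


def test_decide_charge() -> None:
    assert decide_charge(1000, 800, False) == Decision(Outcome.CHARGE, 200)
    assert decide_charge(1000, 800, True) == Decision(Outcome.SKIP_DUPLICATE, 1000)
    assert decide_charge(200, 800, False) == Decision(Outcome.REJECT_INSUFFICIENT, 200)
    assert decide_charge(1000, 0, False) == Decision(Outcome.REJECT_INVALID, 1000)


test_decide_charge()
```

A vendor who structures the logic this way can show you the test cases directly, which is a more concrete answer than "we test it."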

In payments, "it works" is not enough. Guaranteeing correctness through the structure of the code, and being able to articulate and explain that structure, is what marks a counterpart you can entrust a payment system to.

Why go this far: once an incident occurs, a payment system loses not only money but trust itself. That is exactly why I guarantee payment "correctness" through the structure of idempotency, atomicity, and server-side amount resolution rather than through operational rules or careful review, and why I also build in observability (alerts), resilience (backup, DR), and least-privilege IAM. Zero double charges in production is the result of this consistent design.


FAQ

Q. How do you prevent double charges?

Using a unique "idempotency key" issued by the client, judge "is this processing the first time" with a conditional write and charge only the first time. What's important is to execute this duplicate check and the actual charge atomically in the same transaction. With this, even if the same payment arrives multiple times due to network drops or retries, the charge converges to once. The point is to prevent it with the code structure, not operational care.

Q. Does using Stripe automatically prevent double charges?

Stripe itself has an idempotency-key mechanism, but unless you use it correctly and also design your own side's processing (balance update, webhook handling, inventory allocation, etc.) to be idempotent, double charges and inconsistencies can occur. Stripe is a powerful tool, but "use Stripe and it's automatically safe" is false. You need to design your own payment-reliability layer.
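Using Stripe's mechanism correctly means, at a minimum, sending the same `Idempotency-Key` header on every retry of the same operation. The sketch below derives a stable key from an order ID. The derivation scheme and the helper names are assumptions, not a Stripe recommendation, and a randomly generated key stored alongside the order would serve the same purpose.

```python
import hashlib


def stripe_idempotency_key(order_id: str, operation: str) -> str:
    """Derive a stable key so every retry of the same operation reuses it."""
    digest = hashlib.sha256(f"{operation}:{order_id}".encode()).hexdigest()
    return f"{operation}-{digest[:32]}"


def build_headers(api_key: str, order_id: str) -> dict[str, str]:
    # Headers for a POST to the Stripe API; the HTTP call itself is omitted here.
    return {
        "Authorization": f"Bearer {api_key}",
        "Idempotency-Key": stripe_idempotency_key(order_id, "create-payment"),
    }


first = build_headers("sk_test_placeholder", "order-123")
retry = build_headers("sk_test_placeholder", "order-123")
other = build_headers("sk_test_placeholder", "order-124")
print(first["Idempotency-Key"] == retry["Idempotency-Key"])  # prints "True"
print(first["Idempotency-Key"] == other["Idempotency-Key"])  # prints "False"
```

This covers only the call to Stripe. Your own balance update and webhook handling still need their own idempotency, as described above.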

Q. When commissioning payment-system development, what should I confirm?

Ask "how do you prevent double charges," "how do you guarantee balance consistency," "can you prevent amount tampering," and "do you handle webhook duplication and order reversal." A counterpart who can answer in terms of structure ("idempotency key," "atomic transaction," "recompute the amount server-side") is trustworthy. A counterpart who can only say "we'll be careful" or "we'll cover it with tests" warrants caution.

Q. My existing payment system is causing double charges. Can it be fixed?

In most cases, yes. First isolate the cause (the retry path, webhook handling, balance-update races, and so on), then introduce an idempotency key and atomic transactions to prevent recurrence structurally. It is also possible to migrate incrementally without stopping production (a zero-downtime migration via dual writes). Changing a system while payments are running is exactly where careful design is needed.


Summary: guarantee payment "correctness" with structure

To avoid incidents in payment and billing systems, here are the points to remember.

  1. Double charges and balance inconsistencies can't be prevented by "operational care." Prevent them with the structure of idempotency and atomicity.
  2. With an idempotency key + conditional write, converge the charge to once no matter how many times the request arrives.
  3. With atomic transactions + condition expressions, keep the balance consistent even under a race.
  4. Recompute the amount server-side, and deduplicate and order webhooks by event ID and occurrence time.
  5. Buyers should ask "how do you prevent it" and judge reliability by whether the vendor can answer in terms of structure.

Whether it is new development of a payment or billing platform, fixing double charges or inconsistencies in an existing system, or a design review for introducing Stripe, I take it on from requirements definition through security and operations, at the same level that supported zero double charges in production. If you're anxious about payment reliability, start by isolating the current issues.


友田 陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
