Subscription revenue does not leak only through the cancel button. It also leaks quietly, every month, through expired cards and temporary payment failures. The user never meant to leave, but the card expired, billing stopped, and access was lost before they noticed: this is involuntary churn. The troublesome part is that it stems not from dissatisfaction with the service but from a gap in operational design. In other words, most of this revenue can be recovered.
This article is an implementation guide to stopping this quiet revenue leak with Stripe. Every behavior described is backed by the official Stripe documentation; what I add is more depth on the decision axes (when, how, why) than the documentation provides. I implemented Stripe Webhook idempotency, ordering guarantees, PII redaction, and a state machine for bank-transfer subscriptions in a Next.js 16 monorepo for a subscription learning platform for financial-literacy education, and I handled B2B subscriptions via Stripe Connect on a METI-award-winning forestry-DX SaaS. Here I generalize the approach to handling payment failures that proved effective in those implementations.
Scope: I don't re-explain the basics of Webhook signature verification or idempotency. Those are covered in the sister article, The Complete Guide to Implementing Stripe Payments at Production Quality (Idempotency-Key, raw-body signature verification, the 2-layer idempotency model). The overall picture of subscription components, usage-based billing, proration, and the customer portal is in the Stripe Billing Implementation Guide, and the monorepo dissection of a real product is in Dissecting the Architecture of a Subscription Learning Platform. This article avoids overlap and concentrates on a single point: revenue recovery from payment failures, dunning, and suppressing involuntary churn.
Baseline version: the current series of the Stripe Node SDK (the `stripe` package). The SDK pins itself to the API version current at its release, and its TypeScript types match that API version (official: API versioning). The samples assume Next.js 16 (App Router) + TypeScript strict.
0. The overall picture: voluntary churn vs involuntary churn
Churn (cancellation, departure) comes in 2 kinds with different natures. Confuse them and you will make the wrong move.
| Kind | Trigger | Main cause | The correct move |
|---|---|---|---|
| Voluntary churn | The user decides to "quit" | Insufficient value, price, switching | Product improvement, collecting cancellation reasons (Customer Portal's cancellation_details) |
| Involuntary churn | The user wants to continue but billing stops | Card expiry, insufficient balance, temporary payment failure, incomplete SCA | This article's theme: retries, dunning, automatic card update, grace period |
Voluntary churn is a product problem, and improving it takes time. Involuntary churn, on the other hand, can mostly be recovered by retrying the payment and notifying the user. The implementation is finite, and once built it keeps working, which makes it the most cost-effective place to invest. Stripe systematizes this domain as Revenue Recovery. This article is a map for getting the most out of that official feature set in combination with your own app-side implementation.
Stripe's official documentation lists the following 5 components of revenue recovery.
- Smart Retries — auto-retry judging with machine learning "when retrying is more likely to succeed"
- Dunning emails (customer emails) — auto-sent to the user with an update link on payment failure / card expiry
- Automatic card update (Card Account Updater) — Stripe auto-updates the number on card reissue
- Revenue-recovery analytics — visualization of the failure rate, recovery rate, and recovered amount
- No-code automation (Automations) — segment-specific dunning flows
The crux of the design is drawing the line between what needs code and what doesn't. Delegate everything you can to the Stripe side (Dashboard), and keep only the access-right (entitlement) decision in the app. This is the design policy that runs through this article.
1. The payment-failure state machine: active → past_due → unpaid/canceled
Everything starts from knowing exactly how a subscription's status transitions on payment failure. The access decision is built on top of this status (official: Subscription statuses).
          ┌──────────── payment succeeds ───────────┐
          ▼                                         │
(new) incomplete ──unresolved within 23h──▶ incomplete_expired (activation failed, not charged)
          │
     first success
          ▼
  ┌──────────▶ active ◀──────────┐
  │              │               │ payment succeeds during retries and it recovers
cancel reserved  billing fails   │
  │ (period end) ▼               │
  │           past_due ──────────┘
  │              │ (Smart Retries retries during this)
  │     retries exhausted → branches by Dashboard setting
  │       ┌──────┼──────────┐
  ▼       ▼      ▼          ▼
canceled canceled unpaid   stays past_due
The most important distinction here is between `past_due` and `unpaid`.
- `past_due`: the most recent finalized invoice's payment failed or wasn't attempted. The subscription is alive, and Smart Retries keeps retrying while it is in this state. This is the phase where recovery is still possible.
- `unpaid`: still unpaid even after retries are exhausted. Stripe no longer attempts payment. The official documentation clearly says to revoke access to the product on `unpaid`, because payment attempts and retries were already made during the `past_due` stage.
First-charge failure follows a separate path. If the first payment fails when a new subscription is created, the subscription becomes `incomplete`; if that isn't resolved within 23 hours, it becomes `incomplete_expired` (a terminal state in which no charge occurs). Handle this separately from the ongoing `active` → `past_due` path.
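To keep the two paths apart in a handler, one option is to look at the invoice's `billing_reason`. The following is a minimal sketch, not the article's production code: `FailedInvoiceView` and `classifyFailurePath` are hypothetical names, and the sketch assumes that `subscription_create` marks the first invoice of a new subscription and `subscription_cycle` marks a renewal.

```typescript
// Hypothetical narrowed view of a failed invoice. `billing_reason` is a field on
// Stripe invoices; the type and function names here are illustrative only.
type FailedInvoiceView = {
  billing_reason: string | null;
};

type FailurePath = "first_charge" | "renewal" | "other";

// Separates the first-charge path (incomplete → incomplete_expired) from the
// renewal path (active → past_due) so each can get its own handling.
export function classifyFailurePath(invoice: FailedInvoiceView): FailurePath {
  switch (invoice.billing_reason) {
    case "subscription_create":
      return "first_charge";
    case "subscription_cycle":
      return "renewal";
    default:
      // Prorations, manual invoices, etc.: decide per product policy.
      return "other";
  }
}
```

A first-charge failure never had access to begin with, so there is no grace to manage; only the renewal path needs the grace handling described in this article.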
1-1. The status → app behavior correspondence table (this is the core of the design)
The design for reducing involuntary churn ultimately boils down to this table. For each status, decide unambiguously whether to allow access, whether to give grace, and what to notify.
| `subscription.status` | Access | Grace period | App behavior | Notification |
|---|---|---|---|---|
| `trialing` | Allow | - | Normal provision (trial) | Trial-end preview (Stripe sends 7 days before) |
| `active` | Allow | - | Normal provision | None |
| `past_due` | Allow (grace) | Yes (retry period) | Keep letting them use features, and prompt "update payment" with an in-app banner | Dunning email + in-app banner |
| `incomplete` | Reject (not active) | 23 hours | First charge incomplete. If waiting on SCA, send to the authentication path | Payment-confirmation link |
| `unpaid` | Revoke | None | Retries exhausted. Revoke access, present a recovery path (portal) | Final notice + portal guidance |
| `canceled` | Revoke | None | Ended. Re-contract path | Cancellation confirmation |
| `incomplete_expired` | Revoke | - | Activation failed. Treat as a new contract | - |
| `paused` | (Depends on policy) | - | No payment method after trial, etc. Billing stopped | Prompt to add a payment method |
The critical point of the design is the `past_due` row. If you cut access immediately here, you throw away, with your own hands, revenue that could have been recovered (the most frequent accident; detailed in §7). `past_due` is the phase of "give grace and wait for the retry." Once the subscription reaches `unpaid`, retries are exhausted, so revoking access is in line with the official guidance.
1-2. collection_method: auto-charge, or send an invoice
As a prerequisite for the state transitions, note that there are 2 collection methods (official: collection_method).
| `collection_method` | Behavior | Retry / dunning |
|---|---|---|
| `charge_automatically` (default) | Auto-charge from the saved payment method | Smart Retries and dunning emails take effect (the main target of this article) |
| `send_invoice` | Email an invoice link, the user pays manually | No auto-retry. Dun with `days_until_due` and unpaid reminders |
A SaaS built around card charges basically uses `charge_automatically`. Invoice-based billing (B2B monthly closing, etc.) uses `send_invoice`. Revenue-recovery automation is most effective with `charge_automatically`, so this article mainly assumes it.
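The split between the two collection methods can be made explicit in code so that nobody expects automatic retries on an invoice-based subscription. This is a minimal sketch under the assumptions in the table above; `RecoveryPlan` and `recoveryPlanFor` are hypothetical names, not part of Stripe's API.

```typescript
type CollectionMethod = "charge_automatically" | "send_invoice";

// Hypothetical app-side description of how recovery is expected to happen.
type RecoveryPlan = {
  autoRetry: boolean; // whether Smart Retries are expected to run
  channel: "dunning_email_and_banner" | "invoice_reminder";
};

// Maps the collection method to the recovery strategy described in the table.
export function recoveryPlanFor(method: CollectionMethod): RecoveryPlan {
  switch (method) {
    case "charge_automatically":
      return { autoRetry: true, channel: "dunning_email_and_banner" };
    case "send_invoice":
      // No automatic retry: recovery relies on due dates and reminders.
      return { autoRetry: false, channel: "invoice_reminder" };
  }
}
```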
2. An idempotent payment-failure Webhook handler (replay-safe TS code)
To treat `status` as the source of truth, update the app-side access right from Stripe's Webhooks. What matters here is the 2-layer idempotency detailed in the sister article (dedup with `event.id` + reject out-of-order events with `event.created`). I leave the basics to the Stripe Payments Production Guide and cover only the events needed for recovery.
The events to watch are as follows (official: Subscription webhooks).
| Event | Firing timing | App behavior |
|---|---|---|
| `invoice.payment_failed` | An invoice payment failed | Branch by looking at `next_payment_attempt`. If a retry is scheduled, grace; if not, prepare for revocation |
| `invoice.payment_action_required` | Additional authentication like SCA/3DS is needed | Guide the user to the authentication path (portal/confirmation link). Don't cut access |
| `invoice.paid` | Payment succeeded | (Re-)grant access. Record as a recovery event |
| `customer.subscription.updated` | `status` change (becoming `past_due`/`unpaid`, etc.) | Recompute the entitlement according to `status` (the sole source of truth) |
| `customer.subscription.deleted` | Subscription ended | Immediately revoke access |
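For readers who have not seen the sister article, the 2-layer guard mentioned above can be sketched in memory as follows. This is an illustration only: `EventGate` and `EventEnvelope` are hypothetical names, and a real deployment would back both layers with durable storage (for example a unique constraint on the event ID) instead of a `Set` and a `Map`.

```typescript
// Hypothetical minimal view of a webhook event: its ID, its creation time
// (Unix seconds), and the ID of the object it describes (e.g. a subscription).
type EventEnvelope = { id: string; created: number; objectId: string };

type GateResult = "process" | "duplicate" | "stale";

export class EventGate {
  private seen = new Set<string>(); // layer 1: event IDs already handled
  private latest = new Map<string, number>(); // layer 2: newest `created` per object

  accept(e: EventEnvelope): GateResult {
    // Layer 1: the same event delivered twice is dropped.
    if (this.seen.has(e.id)) return "duplicate";
    this.seen.add(e.id);

    // Layer 2: an event older than one already applied to the same object is dropped.
    const last = this.latest.get(e.objectId);
    if (last !== undefined && e.created < last) return "stale";

    this.latest.set(e.objectId, e.created);
    return "process";
  }
}
```

Events with an equal `created` value are processed rather than dropped, because the timestamp has only second resolution and two distinct events can share it.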
2-1. Narrow Stripe events with Zod at the boundary
Stripe's types are a wide union. The iron rule is to narrow with Zod only the fields you need at the boundary, and handle only safe types thereafter (avoid type escapes).
// lib/billing/recovery-events.ts
import { z } from "zod";
// Extract only the minimum fields needed for the past_due decision.
// Assumption: the pinned API version exposes `subscription` as a top-level invoice field
// and `current_period_end` as a top-level subscription field. Verify against your version.
const InvoiceShape = z.object({
  id: z.string(),
  customer: z.string(),
  subscription: z.string().nullable(),
  // Next scheduled retry (null = no more retries = a sign of exhaustion)
  next_payment_attempt: z.number().int().nullable(),
  attempt_count: z.number().int(),
  status: z.enum(["draft", "open", "paid", "uncollectible", "void"]),
});
const SubscriptionShape = z.object({
  id: z.string(),
  customer: z.string(),
  status: z.enum([
    "trialing", "active", "past_due", "incomplete",
    "incomplete_expired", "unpaid", "canceled", "paused",
  ]),
  cancel_at_period_end: z.boolean(),
  current_period_end: z.number().int(),
  collection_method: z.enum(["charge_automatically", "send_invoice"]),
});
export type RecoveryInvoice = z.infer<typeof InvoiceShape>;
export type RecoverySubscription = z.infer<typeof SubscriptionShape>;
// Both parsers throw on an unexpected shape, so bad input stops at the boundary.
export const parseInvoice = (raw: unknown): RecoveryInvoice =>
  InvoiceShape.parse(raw);
export const parseSubscription = (raw: unknown): RecoverySubscription =>
  SubscriptionShape.parse(raw);
Why not use the raw `Stripe.Invoice` directly? The SDK's type is huge, and it is easy to overlook an exhaustion signal such as `next_payment_attempt`. Declare only the intended fields at the boundary, and the handler's responsibility (SRP) narrows to a single point: decide access from this group of fields.
2-2. "Derive" the access right from status (don't save it)
The biggest design decision is not to hold the entitlement (access right) in two places. The source of truth for access is Stripe's `subscription.status`; the app DB holds it only as a cache. This structurally eliminates the class of bugs where the DB and Stripe disagree.
// lib/billing/entitlement.ts
import type { RecoverySubscription } from "./recovery-events";
export type AccessLevel = "full" | "grace" | "revoked";
// A pure function from status to access level: the table in §1 expressed as code.
export function resolveAccess(sub: RecoverySubscription): AccessLevel {
  switch (sub.status) {
    case "trialing":
    case "active":
      return "full";
    // past_due means retries are in progress. Keep access during grace (= keep the chance to recover).
    case "past_due":
      return "grace";
    // unpaid means retries are exhausted → revoke access, following the official guidance.
    case "unpaid":
    case "canceled":
    case "incomplete":
    case "incomplete_expired":
      return "revoked";
    // paused depends on business policy (here, the safe side = revoke).
    case "paused":
      return "revoked";
    default: {
      // Exhaustiveness check: a status added to the union without a case here is a compile error.
      const _exhaustive: never = sub.status;
      return _exhaustive;
    }
  }
}
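With the exhaustiveness check via `never`, adding a status to `SubscriptionShape` without handling it in `resolveAccess` becomes a compile error, which prevents a missed case. Until the schema is updated, a status that Stripe adds in the future fails at the Zod parse at runtime instead of slipping through silently. This is a practical way to turn type safety into insurance for your future self.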
2-3. The body of the idempotent failure handler
// app/api/stripe/webhook/route.ts (excerpt: recovery-related branches only)
import type Stripe from "stripe";
import { parseInvoice, parseSubscription } from "@/lib/billing/recovery-events";
import { resolveAccess } from "@/lib/billing/entitlement";
// Note: signature verification, raw-body retrieval, event.id dedup, and the event.created ordering
// check follow the 2-layer model in the basics article; only the branch bodies are shown here.
// markPaymentAtRisk, markRecoveryExhausted, notifyActionRequired, trackEvent, and upsertAccessCache
// are app-specific helpers defined elsewhere.
async function handleRecoveryEvent(event: Stripe.Event) {
  switch (event.type) {
    case "invoice.payment_failed": {
      const invoice = parseInvoice(event.data.object);
      if (!invoice.subscription) break;
      if (invoice.next_payment_attempt !== null) {
        // A retry is still scheduled = the past_due phase where grace applies.
        // Don't cut access; only prompt the user to update the card.
        await markPaymentAtRisk(invoice.subscription, {
          attemptCount: invoice.attempt_count,
          nextAttemptAt: invoice.next_payment_attempt,
        });
      } else {
        // next_payment_attempt is null = a sign that retries are exhausted.
        // Send the final notice and prepare for the move to unpaid on subscription.updated.
        await markRecoveryExhausted(invoice.subscription);
      }
      await trackEvent("payment_failed", { invoice: invoice.id });
      break;
    }
    case "invoice.payment_action_required": {
      // SCA/3DS. Don't cut access; only send a notification pointing to the authentication path.
      const invoice = parseInvoice(event.data.object);
      if (invoice.subscription) {
        await notifyActionRequired(invoice.subscription);
        await trackEvent("payment_action_required", { invoice: invoice.id });
      }
      break;
    }
    case "invoice.paid": {
      // Recovery success or a normal renewal. Permissions are not touched here;
      // they are recomputed from status in the subscription branch below.
      const invoice = parseInvoice(event.data.object);
      await trackEvent("payment_recovered", { invoice: invoice.id });
      break;
    }
    // Source of truth: when status changes, recompute the entitlement (the sole permission update point).
    case "customer.subscription.updated":
    case "customer.subscription.deleted": {
      const sub = parseSubscription(event.data.object);
      const access = resolveAccess(sub);
      await upsertAccessCache(sub.customer, sub.id, access, sub.status);
      break;
    }
  }
}
Three design points.
- The sole point of permission update is `customer.subscription.updated`/`deleted`. On `payment_failed`, don't touch permissions directly; only raise a danger flag and notify. This way, even when the failure event and the subscription-update event arrive out of order, the final permission is always derived from `status` and never rolls back (order-independent).
- Judge grace or exhaustion by whether `next_payment_attempt` is null. This is the official signal that tells the app whether it should still wait for recovery.
- Record all failures, authentication requirements, and recovery successes with `trackEvent` (the foundation of §6's measurement). Don't swallow them.
When using Automations, there is an official caveat: `invoice.payment_failed` no longer sets `next_payment_attempt`, and `invoice.updated` carries it instead (Smart Retries). When introducing Automations, also subscribe to `invoice.updated`.
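One way to keep that caveat from turning into a false "exhausted" verdict is to read the signal conservatively. The sketch below is an assumption-laden illustration: `automationsEnabled` is a hypothetical app-side configuration flag (Stripe does not send it on the event), and `readRetrySignal` is a hypothetical helper. When the signal cannot be trusted, it returns `"undetermined"` and leaves the final verdict to `subscription.status`.

```typescript
type RetrySignalInput = {
  eventType: "invoice.payment_failed" | "invoice.updated";
  nextPaymentAttempt: number | null;
  automationsEnabled: boolean; // hypothetical app config, not a Stripe field
};

type RetrySignal = "grace" | "exhausted" | "undetermined";

export function readRetrySignal(input: RetrySignalInput): RetrySignal {
  // A scheduled retry always means the grace phase, whichever event carries it.
  if (input.nextPaymentAttempt !== null) return "grace";

  // Without Automations, null on invoice.payment_failed is the exhaustion sign.
  if (!input.automationsEnabled && input.eventType === "invoice.payment_failed") {
    return "exhausted";
  }

  // With Automations, null on invoice.payment_failed carries no information, and
  // invoice.updated fires for many reasons. Defer to the subscription status.
  return "undetermined";
}
```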
3. Smart Retries, or a custom retry schedule
The retry strategy has 2 choices. Stripe's recommendation is clearly Smart Retries (official).
| Viewpoint | Smart Retries (recommended) | Custom retry schedule |
|---|---|---|
| Timing decision | Chooses a time with a high success rate via machine learning (the device's usage, the per-country optimal time, etc.) | A fixed rule (specify by hand "N days after the last") |
| Max count | Default 8 times / 2 weeks (configurable 1 week ~ 2 months) | Max 3 times |
| Code | Not needed (Dashboard setting) | Not needed (Dashboard setting) |
| Per-segment differentiation | Possible with Automations | Limited |
| Official evaluation | "Far more effective than scheduled retries" | Less effective than Smart Retries |
The setting is in the Dashboard's Billing → Revenue Recovery → Retries. Unless there's a special reason, it's Smart Retries, no question. A fixed schedule is justified only in very limited cases, like "wanting to strictly fix the retry date" for compliance or accounting reasons.
The final behavior after retries are exhausted is also set here. This decides the terminal state of §1's state machine.
| Setting | status after exhaustion | Meaning |
|---|---|---|
| Cancel subscription | canceled | Treated as cancellation (re-contract needed) |
| Mark as unpaid | unpaid | Keep the subscription but revoke access. Can recover if paid later |
| Leave past due | Stays past_due | Billing continues but no new auto-retry |
Which to choose is a trade-off against ease of recovery. `unpaid` locks the user out while keeping their seat, so if they fix the card later they can return on the same subscription; in B2B SaaS this is often advantageous for recovery. `canceled`, on the other hand, ends the subscription completely and requires going through a re-contract flow.
Note that for hard decline reasons (`lost_card`, `stolen_card`, `authentication_required`, etc.) or when there is no payment method, Stripe doesn't retry in the first place (non-retry targets). When "it should be retrying but isn't," first suspect a hard decline. In this case, recovery depends not on retries but on dunning (the user's own card update).
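The routing decision in this note can be written down as a small pure function. This is a sketch: the set below contains only the three codes named above and is deliberately incomplete (consult the official list for the rest), and `recoveryRouteFor` and `RecoveryRoute` are hypothetical names.

```typescript
// Only the codes named in the text; the official list of hard declines is longer.
const HARD_DECLINE_CODES: ReadonlySet<string> = new Set([
  "lost_card",
  "stolen_card",
  "authentication_required",
]);

type RecoveryRoute = "wait_for_retry" | "dunning_only";

// Decides whether recovery can count on automatic retries or must rely on the
// user updating the card. `declineCode` is the decline code of the failed
// attempt, or null when none is available.
export function recoveryRouteFor(
  declineCode: string | null,
  hasPaymentMethod: boolean,
): RecoveryRoute {
  if (!hasPaymentMethod) return "dunning_only";
  if (declineCode !== null && HARD_DECLINE_CODES.has(declineCode)) {
    return "dunning_only";
  }
  return "wait_for_retry";
}
```

For a `dunning_only` result, the in-app banner and the dunning email are the only recovery channels, so it makes sense to surface them immediately.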
4. Dunning emails and the customer portal: have the user fix the card themselves
If retries are "machine-side recovery," dunning is "human-side recovery." Card expiry or insufficient balance ultimately won't be resolved unless the user registers a new card. Receiving the card number with your own form here is the worst move (you bear all of the PCI burden, SCA handling, and i18n). Having Stripe host it is the only correct design.
4-1. The dunning emails Stripe auto-sends
Enable it in the Dashboard's Settings → Revenue Recovery → Emails, and Stripe auto-sends the following (official: Customer emails). Each email contains a link to the card-update page.
- Payment-failure notification (with the reason. e.g., card expiry) — with a link to the update page
- Advance notification of card expiry (about 1 month before the registered card's expiry)
- Payment-confirmation notification (when the user's confirmation is needed, like 3DS/SCA or Boleto) — a Stripe-hosted confirmation link
- Update reminder (before the next billing date)
In other words, the flow of telling the user that payment failed and having them update the card can be left to Stripe without writing a single line of code. All the app has to do is also show an in-app banner to the `past_due` users flagged in §2, giving them a second path.
Links have a lifespan. When the subscription becomes `canceled`, `incomplete_expired`, or `unpaid`, or when the current update period passes, the in-email link expires (official). That is exactly why you separately provide an always-valid in-app path to the Customer Portal.
4-2. The path to the customer portal (SCA/3DS re-authentication also completes here)
The in-app banner leads to the Customer Portal. Card update, immediate payment of unpaid invoices, and 3DS re-authentication all complete in the Stripe-hosted UI. The app just creates a session and redirects.
// app/api/billing/portal/route.ts
import Stripe from "stripe";
const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);
// Called when a past_due user presses "Update payment method".
export async function POST(req: Request) {
  // Resolve the customer from the authenticated user (getCustomerIdForSession is an app-specific helper).
  const customerId = await getCustomerIdForSession(req);
  if (!customerId) return new Response("Unauthorized", { status: 401 });
  const portal = await stripe.billingPortal.sessions.create({
    customer: customerId,
    return_url: `${process.env.APP_URL}/account/billing`,
  });
  // The card number never passes through the app (the PCI burden is delegated to Stripe).
  // Redirect with 303 so that a plain <form method="post"> lands on the portal page.
  return Response.redirect(portal.url, 303);
}
Below is an example of the in-app banner shown to `past_due` users. The trick to raising the recovery rate is to keep letting them use features (grace) while showing a prominent path to the fix.
// components/billing/payment-at-risk-banner.tsx (Server Component)
import { resolveAccess } from "@/lib/billing/entitlement";
import { getCachedSubscription } from "@/lib/billing/cache";
export async function PaymentAtRiskBanner({ customerId }: { customerId: string }) {
  const sub = await getCachedSubscription(customerId);
  if (!sub || resolveAccess(sub) !== "grace") return null; // shown only while past_due
  return (
    <div role="alert" className="rounded-md border border-amber-300 bg-amber-50 p-4">
      <p className="font-medium text-amber-900">We couldn't confirm your payment</p>
      <p className="text-sm text-amber-800">
        Your card may have expired. You can keep using the service,
        but please update your payment method soon.
      </p>
      {/* Posts to /api/billing/portal, which leads to the Stripe-hosted portal */}
      <form action="/api/billing/portal" method="post">
        <button className="mt-2 rounded bg-amber-600 px-3 py-1.5 text-white">
          Update payment method
        </button>
      </form>
    </div>
  );
}
The SCA/3DS pitfall: `invoice.payment_action_required` is a state where the card is valid but the bank requires additional authentication. You must not cut access here. What you should do is guide the user to complete authentication. Once the user completes the `requires_action` invoice via Stripe's dunning email (the payment-confirmation link) or the portal above, `invoice.paid` fires and recovery is complete. SCA is one of the biggest reasons, especially in Europe, that a legitimate user's payment temporarily stops; if you mistake this for involuntary churn and cut access, you lose your best customers with your own hands.
5. Grace period and permission control: respect past_due
The skeleton is already in place with §1's table and §2's `resolveAccess`. This section works out how grace operates in practice.
It is natural to make the grace period effectively match Smart Retries' retry period (default 2 weeks). Retries run during `past_due`, so maintaining access during that time gives both machine recovery and human recovery (dunning) time to work. The moment retries are exhausted and the subscription falls to `unpaid` (when the Dashboard is set to mark it as unpaid), `customer.subscription.updated` fires, `resolveAccess` returns `revoked`, and access is cut automatically. Leaving period management to Stripe's state transitions, instead of holding it in the app, is the most robust design.
The handling of `cancel_at_period_end` is also clear. It is a reservation of voluntary churn (the user chose to cancel at period end) and is unrelated to payment failure. Even with `cancel_at_period_end === true`, maintain access while `status` is `active` (they can use what they paid for). Access is revoked only when `status` changes to `canceled` at period end, so here too `status` alone is enough to decide.
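// One example of a grace helper: for when you want to show how long grace lasts even during past_due.
// The truth about the period lives on the Stripe side (retry settings), so keep this as a UI aid only.
export function graceContext(sub: RecoverySubscription) {
  const isGrace = sub.status === "past_due";
  return {
    isGrace,
    // For display: the end of the current billing period. Whether the subscription
    // becomes unpaid after this depends on Stripe's settings.
    periodEnds: new Date(sub.current_period_end * 1000),
    // cancel_at_period_end is a separate axis from payment failure (a reservation of voluntary churn).
    willCancelAtPeriodEnd: sub.cancel_at_period_end,
  };
}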
Anti-pattern: an implementation that holds its own "N days of grace" timer in the app and counts down independently from the moment `past_due` is detected. It drifts from Stripe's retry period, so either Stripe is still retrying but the app cuts access first (= a lost recovery opportunity), or the reverse happens (= they can use the service without paying). The correct answer is to follow the `status` transition for the end of grace.
6. Measuring recovery: hold recovered MRR as a "fact"
If you can't measure how much you recovered, there is no way to improve. But fabricating numbers is strictly forbidden. Stripe's revenue-recovery analytics (Dashboard) officially aggregates the failure rate, recovery rate, and recovered amount. This is the primary source.
On the app side, storing the `trackEvent` records set up in §2 in your own analytics platform lets you cross-check them against Stripe's aggregation. What matters is observing "failure" and "recovery" as a pair.
// lib/billing/recovery-metrics.ts
type RecoveryEvent =
  | { kind: "payment_failed"; invoiceId: string; at: number }
  | { kind: "payment_action_required"; invoiceId: string; at: number }
  | { kind: "payment_recovered"; invoiceId: string; at: number };
// Record events in a form that lets you check later, per invoice, whether "failure → recovery" happened.
// Amounts and rates are not computed here; raw events are kept (leaving no room for fabrication).
// `analytics` is the app's own analytics client, defined elsewhere.
export async function recordRecoveryEvent(e: RecoveryEvent): Promise<void> {
  await analytics.append("billing_recovery", {
    kind: e.kind,
    invoice_id: e.invoiceId,
    occurred_at: new Date(e.at * 1000).toISOString(),
  });
}
The recovery rate can be calculated from facts as the proportion of invoices for which, after a `payment_failed`, a `payment_recovered` was recorded for the same invoice within a fixed period. Don't substitute estimates. The same discipline applies to a tech article and to a real product's report: not quoting unverified numbers, such as an absolute MRR amount or the size of a churn-rate improvement, builds long-term trust (in my projects too, I put only confirmed facts in reports).
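The definition above translates directly into a small function over the raw events. This is a minimal sketch, not the production query: `recoveryRate` is a hypothetical helper, the event type mirrors the `RecoveryEvent` shape from the recording code, and the window is measured from the first failure of each invoice, which is an assumption of this sketch.

```typescript
type RecoveryEvent = {
  kind: "payment_failed" | "payment_action_required" | "payment_recovered";
  invoiceId: string;
  at: number; // Unix seconds
};

// Returns recovered invoices / failed invoices, or null when nothing failed
// (so that "no data" is never reported as a 0% or 100% rate).
export function recoveryRate(
  events: RecoveryEvent[],
  windowSeconds: number,
): number | null {
  // Earliest failure time per invoice.
  const firstFailure = new Map<string, number>();
  for (const e of events) {
    if (e.kind !== "payment_failed") continue;
    const prev = firstFailure.get(e.invoiceId);
    if (prev === undefined || e.at < prev) firstFailure.set(e.invoiceId, e.at);
  }
  if (firstFailure.size === 0) return null;

  // An invoice counts as recovered only if payment followed a failure within the window.
  // Payments with no prior failure (normal renewals) are ignored.
  const recovered = new Set<string>();
  for (const e of events) {
    if (e.kind !== "payment_recovered") continue;
    const failedAt = firstFailure.get(e.invoiceId);
    if (failedAt === undefined) continue;
    if (e.at >= failedAt && e.at - failedAt <= windowSeconds) {
      recovered.add(e.invoiceId);
    }
  }
  return recovered.size / firstFailure.size;
}
```

Ignoring payments without a prior failure matters here because the handler in §2 records `payment_recovered` for every `invoice.paid`, including ordinary renewals.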
Another useful angle is the distribution of `attempt_count` on `invoice.payment_failed`. Whether invoices recover on the first retry or take many attempts reflects the quality of the cards on file and the effectiveness of the dunning copy.
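A sketch of that distribution, assuming each recorded failure carries the invoice ID and the `attempt_count` seen on the event. `FailureSample` and `attemptHistogram` are hypothetical names; the function keeps the highest attempt count per invoice and counts how many invoices reached each value.

```typescript
type FailureSample = { invoiceId: string; attemptCount: number };

// Maps "highest attempt_count observed for an invoice" → "number of invoices".
export function attemptHistogram(samples: FailureSample[]): Map<number, number> {
  // Highest attempt count seen per invoice.
  const maxAttempts = new Map<string, number>();
  for (const s of samples) {
    const prev = maxAttempts.get(s.invoiceId) ?? 0;
    if (s.attemptCount > prev) maxAttempts.set(s.invoiceId, s.attemptCount);
  }

  // Count invoices per attempt value.
  const histogram = new Map<number, number>();
  maxAttempts.forEach((attempts) => {
    histogram.set(attempts, (histogram.get(attempts) ?? 0) + 1);
  });
  return histogram;
}
```

A histogram concentrated at low attempt counts suggests that most failures clear quickly; a long tail is a hint to look at the dunning copy or the mix of decline reasons.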
7. Common pitfalls (implementations that kill recovery)
These are the patterns that ruin recovery, ones I have seen in the field or nearly stepped on myself.
- Immediate access cutoff on the first failure. The most frequent and the worst. `past_due` is the phase of "retrying = a recovery chance." Cut access here and you throw away, with your own hands, revenue Stripe could have recovered. `past_due` is grace, `unpaid` is revocation.
- Ignoring `invoice.payment_action_required` (SCA/3DS), or treating it as a failure. The card is valid and is only waiting on authentication. Cut access and you lose your best customers. Guiding the user to the authentication path is the correct answer.
- A non-idempotent failure handler. Webhook double delivery and out-of-order arrival are normal. If you touch permissions directly on `payment_failed`, a later-arriving `subscription.updated` rolls them back and produces an inconsistency. Consolidate permission updates to the single point of `subscription.updated`/`deleted` and make it idempotent with `event.id`/`event.created` (the basics are in the sister article).
- Doing no dunning at all. Retries (machine) alone don't fix an expired card. Without the user's own card update (dunning + portal), you miss expiry-derived churn wholesale.
- Receiving the card number with your own form. You shoulder all of PCI, SCA, and i18n, and the recovery rate doesn't rise either. Delegate to the Customer Portal / a Stripe-hosted page.
- Building your own grace timer on the app side and drifting from Stripe's retry period (§5). Follow the `status` transition for the end of grace.
- Expecting automation that presumes `charge_automatically` on a `send_invoice` subscription. Smart Retries has no effect on the latter. Split the recovery strategy per collection method.
- Not measuring recovery, or quoting fabricated numbers. You either can't improve or you lose trust. Observe and report only facts (§6).
Summary: involuntary churn is "revenue leakage you can close with design"
Much of subscription revenue leakage comes not from cancellation out of dissatisfaction but from operational accidents: card expiry and temporary payment failure. It can be closed structurally by combining Stripe's official features with the correct app-side permission design. The key points in 5 lines.
- Make the state machine the source of truth. Understand `active → past_due → unpaid/canceled`, and strictly observe `past_due` = grace (in recovery) / `unpaid` = revocation. Derive access from `status` and don't hold it a second time in the DB.
- Consolidate permission updates to the single point of `customer.subscription.updated`/`deleted`. On `payment_failed`, only flags and notifications. Idempotent and order-independent with `event.id`/`event.created`.
- For retries, use Smart Retries (default 8 times / 2 weeks), no question. Align the terminal state after exhaustion (`unpaid` recommended) with the grace period. Detect exhaustion by a null `next_payment_attempt`.
- With dunning + Customer Portal, have the user fix the card themselves. SCA/3DS is "waiting on authentication," not failure: don't cut access, guide the user to the path. Don't hold the card number yourself.
- Measure recovery with facts. Observe `payment_failed` ↔ `payment_recovered` as a pair and cross-check with Stripe's analytics. Don't quote unverified numbers.
Building a payment platform "with one person × generative AI (Claude Code), fast, cheap, and safe": the real example is the subscription learning platform for financial-literacy education, the source of this article's code. For consultation on Stripe subscription design, revenue recovery from payment failures, and dunning implementation, please reach out via Contact.