METI Minister's Award winner | A B2B subscription SaaS that brought DX to the lumber-distribution industry
Successfully turned phone/fax/Excel analog trading into a single Web platform (SaaS)
Client
A lumber-distribution company (this product won the METI Minister's Award) | Industry: lumber supply chain (B2B distribution spanning forestry, markets, sawmills, precut plants, builders, and manufacturers) | Model: invite-only B2B subscription SaaS (marketplace payments via Stripe Connect)
My role
Technical architect and full-stack developer (single-handedly owning every stage of product development — from requirements and high-level design through frontend, backend, infrastructure, security-audit remediation, and operations).
Challenge (Situation & Task)
The lumber industry was extremely analog: orders ran on phone, email, fax, and Excel. Distribution links many roles in series (forestry, markets, sawmills, precut plants, builders, manufacturers), with complex commercial flows and tangled interests. Each role can perform different operations and view different information, and because trade crosses companies, tenant data isolation and payment validity had to be guaranteed at the same time. Historically, companies could trade only with existing, related partners, and information was closed off.
The lumber supply chain posed four characteristic challenges.
- The industry's root problem: ordering ran mainly on phone/fax/Excel, with digitalization badly behind. Inventory lived in Excel and was never current; fax orders left no record, and phone confirmation took hours every day.
- Multi-tier flow and role-based authorization: diverse roles connect in the chain "forestry → market → sawmill → precut → builder → manufacturer" (plus composite attributes like "sawmill + precut" and view/manage roles), and each role can use different features and see different scopes. Strict, per-API role-based authorization was therefore essential.
- Cross-company data isolation (multi-tenancy): because companies search for and trade with one another, tenant isolation had to prevent a counterparty's PII (email, phone, corporate number, etc.) leaking through cross-company search and viewing.
- Payment validity and idempotency: with recurring billing plus per-transaction settlement on a marketplace (Stripe Connect), amount tampering, double charges, and duplicate webhook processing were unacceptable, so idempotency and consistency had to be guaranteed.
Why these technologies (Rationale)
Python 3.11 / Flask / SQLAlchemy 2.0 / Marshmallow 3 / PostgreSQL 16: well suited to document generation, Excel ingestion, and complex multi-tier flow logic. Strict layering (Router → UseCase → Repository → Model) keeps each layer to a single responsibility (SRP) and the codebase easy to change (ETC).
React 18 / TypeScript 5.9 / Vite 7 / MUI 7 / TanStack Query 5: chosen for complex per-role UI control and large-scale data operations. A type-safety discipline bans any/let and validates boundaries with Zod, and all 107 routes are lazy-loaded to minimize the initial bundle.
AWS (ECS Fargate / RDS / Cognito / Lambda / CloudFront / S3 / SQS / EventBridge / DynamoDB): a scalable SaaS foundation. Heavy/async work (webhooks, documents, Excel ingestion) is split into Lambda and made event-driven.
Terraform (100% IaC): VPC through Cognito, ECS, RDS, CloudFront, and 12 Lambdas codified in 17 modules. Guarantees reproducibility, reviewability, cost control, and zero-downtime migration, with state managed via native S3 locking.
AWS Cognito (JWT RS256): the auth foundation for 7 roles plus view/manage roles. Every endpoint is protected by the API Gateway authorizer, with role-based authorization enforced strictly at the router layer.
Stripe Connect: marketplace payments supporting recurring billing, transaction settlement, and bank transfers. Amounts are resolved server-side, webhooks are idempotent, and billing adjustments are applied reliably via a transactional outbox.
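To make the Cognito choice more concrete, here is a minimal sketch of the claim checks that follow signature verification: the required claims exp/iat/iss/aud/token_use must be present, and only ID tokens are accepted. It assumes the RS256 signature has already been verified against the JWKS by a JWT library; validate_claims, InvalidToken, and the parameter names are hypothetical and are not taken from the project's code.

```python
import time
from typing import Optional

# Claims that must be present before a token is even considered.
REQUIRED_CLAIMS = ("exp", "iat", "iss", "aud", "token_use")


class InvalidToken(Exception):
    """Raised when decoded claims fail validation (hypothetical name)."""


def validate_claims(claims: dict, *, issuer: str, audience: str,
                    now: Optional[float] = None) -> dict:
    """Validate the claims of a token whose signature was already verified."""
    now = time.time() if now is None else now

    missing = [name for name in REQUIRED_CLAIMS if name not in claims]
    if missing:
        raise InvalidToken(f"missing claims: {missing}")

    # Accept ID tokens only; access tokens and anything else are rejected.
    if claims["token_use"] != "id":
        raise InvalidToken("only ID tokens are accepted")

    if claims["iss"] != issuer or claims["aud"] != audience:
        raise InvalidToken("unexpected issuer or audience")

    if claims["exp"] <= now:
        raise InvalidToken("token expired")

    return claims
```

Rejecting on a missing claim before looking at its value keeps the check fail-closed: a token that omits token_use is treated the same as one carrying the wrong value.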
What I did (Action)
[Role-based multi-tenant authorization] Role codes are defined as per-feature frozenset whitelists and authorization is centralized at the router layer (a mismatch returns 403, suppressing ID-enumeration attacks). Cross-company search/view is restricted to a PII-free public schema (UserPublicSchema), implementing a two-layer schema boundary so a counterparty's email/phone/corporate number never leaks.
[Auth foundation: Cognito RS256 + JWKS] All 221 endpoints are protected by the API Gateway authorizer. The backend verifies JWTs with RS256, requiring exp/iat/iss/aud/token_use and rejecting anything but token_use==id. JWKS is cached in a double-checked-lock singleton and refreshed every 6 hours (pre-warmed at startup).
[Idempotent payments: Stripe Connect] Subscription state is held on User, with Stripe ID formats validated by DB CHECK constraints (cus_/sub_/acct_). Amounts are resolved server-side to eliminate tampering. Webhooks are split across 3 Lambdas and deduplicated with DynamoDB conditional writes (attribute_not_exists, 30-day TTL). Billing adjustments are applied reliably via a transactional outbox + reconciliation Lambda (EventBridge scheduled).
[Parallelizing heavy work] Excel/PDF for orders, delivery notes, and invoices are generated concurrently with ThreadPoolExecutor (a dedicated app context per thread, selectinload to avoid N+1, with_for_update to prevent contention). Excel→DB ingestion is bulk-loaded by an S3-event-driven Lambda with a 50MB cap and formula-injection neutralization (CWE-1236). The frontend uses exponential-backoff + Page-Visibility-aware polling to curb API waste on background tabs.
[DB efficiency and reliability] Added 48 missing FK indexes online with CREATE INDEX CONCURRENTLY. The connection pool is sized as (5+5)×8 tasks < RDS limit to prevent exhaustion. Made the daily-report N+1 (a flush proportional to the number of sites) constant. Tests run in ~11 seconds with savepoint isolation.
[4 rounds of security audit] Static audit → live assessment → black-box + white-box penetration testing. A third-party pen test with 15 real roles (R4) proved 0 missing-authorization findings across all 221 endpoints. All Critical/High findings were closed: cross-tenant PII leakage, payment-amount tampering, plaintext credentials (→ moved to Secrets Manager), and a fail-open webhook idempotency bug (→ fail-closed). Security headers (HSTS/CSP/X-Frame-Options, etc.) were also put in place.
[Observability and resilience] Structured logs + Slack notifications on ERROR (classifying permanent vs. transient failures and retrying with exponential backoff). Slack-delivery failures are detected via a single-line JSON marker to prevent log-injection alarm spoofing, escalating through a CloudWatch metric filter → SNS email — a Slack-independent path. The marker contract is kept mechanically in sync across backend / Lambda / Terraform with contract tests.
[IaC, CI/CD, quality gates] GitHub Actions OIDC (no long-lived keys) drives ECR → ECS forced deploys plus S3 sync + CloudFront invalidation. Terraform automates plan (tfsec on PRs) / apply (a permission-bounded role) / drift detection (scheduled cron → file an issue). Two-stage pre-commit/pre-push gates run ruff, mypy, Bandit, Vulture, deptry, pip-audit, ESLint, Prettier, tsc, npm audit, Trivy, and gitleaks, with Dependabot keeping 6 ecosystems continuously updated.
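As a rough illustration of the role-based authorization item above, the sketch below shows how per-feature frozenset whitelists can gate a handler and answer a mismatch with 403. It is a framework-free stand-in, not the project's router code: the role codes, the whitelist names, require_roles, Forbidden, and create_order are all hypothetical.

```python
from functools import wraps

# Hypothetical role codes; the real codes are not given in this write-up.
CAN_CREATE_ORDER = frozenset({"sawmill", "precut", "builder"})
CAN_LIST_MARKET_LOTS = frozenset({"market", "sawmill"})


class Forbidden(Exception):
    """Stand-in for the framework's HTTP 403 response."""
    status_code = 403


def require_roles(allowed: frozenset):
    """Decorator that rejects any caller whose role is not whitelisted."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(current_role: str, *args, **kwargs):
            # The same 403 is raised whatever the reason, so the response
            # does not reveal whether the target resource exists.
            if current_role not in allowed:
                raise Forbidden("forbidden")
            return handler(current_role, *args, **kwargs)
        return wrapper
    return decorator


@require_roles(CAN_CREATE_ORDER)
def create_order(current_role: str, payload: dict) -> dict:
    """Hypothetical handler; only whitelisted roles ever reach this body."""
    return {"status": "created", "by": current_role}
```

Keeping each whitelist as an immutable frozenset next to the route makes the allowed roles reviewable at a glance and prevents accidental mutation at runtime.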
Because this product sits at the intersection of B2B, payments, and multiple companies' data, it was designed security- and consistency-first.
Tenant isolation and PII protection: a counterparty's details are returned in the detailed schema only when a trading relationship exists; cross-company search/view is restricted to a PII-free public schema. When a third-party pen test found cross-tenant PII leakage (B-1, HIGH), it was fixed and re-assessed the same day to confirm 0 findings.
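The two-layer schema boundary can be sketched without any serialization library. The project uses Marshmallow schemas (UserPublicSchema for the public layer); the version below is a plain-Python stand-in that assumes hypothetical field names and a hypothetical serialize_counterparty helper, and only illustrates the allowlist idea.

```python
# Fields safe to show to any company (hypothetical names).
PUBLIC_FIELDS = frozenset({"id", "company_name", "role"})

# Fields additionally shown once a trading relationship exists.
DETAIL_FIELDS = PUBLIC_FIELDS | {"email", "phone", "corporate_number"}


def serialize_counterparty(user: dict, *, has_trading_relationship: bool) -> dict:
    """Return only allowlisted fields; unknown fields are dropped by default."""
    fields = DETAIL_FIELDS if has_trading_relationship else PUBLIC_FIELDS
    return {key: value for key, value in user.items() if key in fields}
```

Because the filter is an allowlist, a column added to the user model later stays hidden from other tenants until someone deliberately adds it to one of the two sets.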
Payment idempotency: amounts are resolved server-side, requests to Stripe use content-addressed idempotency keys, and webhooks are deduplicated by a separate Lambda + DynamoDB conditional writes. Billing adjustments use an outbox + reconciliation so that double charges and missed updates are prevented even under network drops and retries.
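The two idempotency mechanisms in this paragraph can be illustrated roughly as follows. The key derivation hashes a canonical form of the request, so a retry of the same request yields the same key. The SeenEvents class is an in-memory stand-in for the DynamoDB table with an attribute_not_exists conditional write, not real AWS code. All function and class names here are hypothetical.

```python
import hashlib
import json
from typing import Callable


def idempotency_key(operation: str, payload: dict) -> str:
    """Derive a key from the request content, independent of dict ordering."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{operation}:{canonical}".encode("utf-8")).hexdigest()
    return f"{operation}-{digest[:32]}"


class SeenEvents:
    """In-memory stand-in for a table written with attribute_not_exists."""

    def __init__(self) -> None:
        self._ids: set = set()

    def claim(self, event_id: str) -> bool:
        """Return True only for the first caller to claim this event ID."""
        if event_id in self._ids:
            return False
        self._ids.add(event_id)
        return True


def handle_webhook(event: dict, store: SeenEvents,
                   apply_event: Callable[[dict], None]) -> str:
    """Apply an event at most once; duplicates are acknowledged and skipped."""
    # Fail-closed: if claim() raised (store unavailable), the exception would
    # propagate and the event would not be applied.
    if not store.claim(event["id"]):
        return "duplicate"
    apply_event(event)
    return "processed"
```

In this sketch the claim happens before the side effect, so a redelivered event can never be applied twice; recovering from a crash between the claim and the side effect is the kind of gap the outbox and reconciliation job described above are meant to cover.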
Heavy work and UX: document generation runs across threads, Excel ingestion is split into an event-driven Lambda, and the frontend waits for completion with exponential-backoff + visibility-aware polling — so heavy work never degrades the admin UI's responsiveness.
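The Excel ingestion path is described as neutralizing formula injection (CWE-1236). A minimal sketch of one common mitigation follows, assuming cells arrive as plain Python values; neutralize_cell and FORMULA_PREFIXES are hypothetical names, and the project's actual rule set may differ.

```python
# Leading characters that spreadsheet applications may treat as the start
# of a formula (following widely published CSV-injection guidance).
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value):
    """Prefix risky strings with a single quote so they are stored as text."""
    if isinstance(value, str) and value.startswith(FORMULA_PREFIXES):
        return "'" + value
    # Numbers, None, and ordinary strings pass through unchanged.
    return value


def neutralize_row(row: list) -> list:
    """Apply the neutralization to every cell of one ingested row."""
    return [neutralize_cell(cell) for cell in row]
```

Applying this at the ingestion boundary means a value that later flows into a generated Excel document is already inert text rather than an executable formula.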
A foundation that holds up in production: infrastructure is 100% Terraform, reproducing staging/production from code. During downtime a single environment_active flag collapses billable resources to zero, and Graviton / Fargate Spot / a single NAT optimize steady-state cost.
Key technical decisions
React 18 + TypeScript 5.9 + TanStack Query 5: type-safe UI and efficient data fetching/caching
Flask + SQLAlchemy 2.0 + Marshmallow 3: layered architecture with strict boundary validation
AWS ECS Fargate + Cognito + Lambda + Terraform: a reproducible, event-driven SaaS foundation via IaC
Stripe Connect + DynamoDB: idempotent marketplace payments and webhook deduplication
PostgreSQL 16: representing complex multi-tier flows relationally
Responsibilities
- Requirements & architecture design
- Frontend development (React / TypeScript / MUI)
- Backend development (Python / Flask / SQLAlchemy)
- Infrastructure build & operations (AWS / Terraform / IaC)
- Database design (PostgreSQL)
- Security audit & penetration-test remediation
- CI/CD & quality gates (GitHub Actions)
Results in numbers
- Missing-authorization findings: 0 findings. All 221 endpoints protected by Cognito authorization. A third-party pen test with 15 real roles (R4) proved 0 auth-bypass findings.
- Security audits: 4 rounds. Static audit → live assessment → black-box + white-box pen test. All Critical/High findings (cross-tenant PII leakage, payment tampering, etc.) closed.
- Automated tests (backend): 2,153 tests. All run in ~11 seconds with savepoint isolation. Covers the Router / UseCase / Schema layers.
- DB migrations: 204 generations. Versioned with Alembic. 48 FK indexes applied online via CREATE INDEX CONCURRENTLY.
- Terraform modules: 17 modules. VPC, Cognito, ECS, RDS, CloudFront, and 12 Lambdas; infrastructure 100% codified (IaC).
- Production API endpoints: 221 endpoints. Built with Flask-RESTful. Strict 4-layer separation (Router→UseCase→Repository→Model) for ease of change.
Results
- [The biggest outcome] The client won the "METI Minister's Award" with this product
- Public certification: the product earned Kyoto Prefecture certification and runs as a government-recognized, high-trust DX foundation
- Industry DX realized: successfully turned analog phone/fax/Excel trading into a single Web platform (ordering implemented as a 9-state state machine)
- New business opportunity created: breaking from closed trading to a marketplace where companies can search for and trade with one another (a sawmill can trade without visiting the market)
- Improved traceability and trust: origin certification (legal timber), order-linked chat, and 0–5 company ratings enable transparent trading
- Operational efficiency: one-click thread-parallel Excel/PDF generation for "quotes / delivery notes / invoices," with existing Excel auto-ingested into the DB by an S3-event-driven Lambda
- Proven security: a third-party penetration test (15 real roles) confirmed 0 missing-authorization findings across all 221 endpoints. All Critical/High findings — cross-tenant PII leakage, payment tampering, etc. — were closed
- Idempotent marketplace payments: Stripe Connect + DynamoDB webhook deduplication and a transactional outbox prevent double charges and missed updates
- A production-grade foundation: infrastructure is 100% Terraform (17 modules, 12 Lambdas), GitHub Actions OIDC auto-deploys, and structured logs with multi-layered Slack/CloudWatch alerting ensure observability
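Facing a similar challenge?
Your business challenges can be solved with modern technology too. Start with a free 30-minute technical consultation and tell me about your situation.
Ask whether your own problem can be turned into a SaaS. Both project-based (contract) engagements and technical advisory are available.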