7 lessons in B2B SaaS development learned from a METI-Minister's-Award-winning product
Introduction: breaking away from phone, fax, and Excel
Two years ago, I took on the digital transformation (DX) of the lumber-distribution industry, an extremely analog industry. Ordering by phone, email, fax, and Excel was the norm: inventory was tracked in Excel, fax orders left no record, and confirming details by phone took several hours every day. I built a B2B subscription SaaS that manages all of this centrally on the web.
As a result, this product won the METI Minister's Award.
In this article, I share the seven lessons learned from this development, covering technology selection, architecture design, security, and scalability. I hope they offer practical guidance, especially for those considering B2B SaaS development.
Lesson 1: Select technology by working backward from "the industry's complexity"
The challenge: 8 user attributes and a complex distribution flow
The lumber-distribution industry has eight kinds of user attributes: "forestry," "market," "sawmill," "precut," "construction company," "manufacturer," "wholesaler," and "other." Since the executable functions and viewable information differ per attribute, strict authentication and authorization were mandatory.
The decision: AWS Cognito + custom logic
Initially I also considered Firebase Authentication, but chose AWS Cognito for the following reasons:
// Manage the 8 user attributes with AWS Cognito User Pools custom attributes
{
  "custom:user_type": "lumber_mill", // sawmill
  "custom:permissions": "create_order,view_inventory",
  "custom:company_id": "company-123"
}
Selection reasons:
- Flexibility of custom attributes: can define different permissions per each of the 8 user attributes
- AWS ecosystem: easy integration with ECS, RDS, Lambda
- Scalability: pricing is based on monthly active users (MAU), which keeps the initial cost low
Lesson: Technology selection should give the highest priority to "the industry-specific complexity." Choose technology that can reflect the industry's domain knowledge, not a generic solution.
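To show how custom attributes like the ones above might be consumed on the server, here is a minimal sketch that turns a dictionary of already-verified token claims into a typed context object. It is not the production code. The names UserContext and parse_claims are hypothetical, and every user-type identifier except lumber_mill is an assumed English name for one of the eight attributes.

```python
from dataclasses import dataclass

# Assumed identifiers for the eight user attributes. Only "lumber_mill"
# appears in the article; the other names are hypothetical.
USER_TYPES = frozenset({
    "forestry", "market", "lumber_mill", "precut",
    "construction", "manufacturer", "wholesaler", "other",
})


@dataclass(frozen=True)
class UserContext:
    user_type: str
    permissions: frozenset
    company_id: str

    def can(self, permission: str) -> bool:
        """True if the comma-separated permission list contained this entry."""
        return permission in self.permissions


def parse_claims(claims: dict) -> UserContext:
    """Build a UserContext from token claims that were verified elsewhere."""
    user_type = claims.get("custom:user_type", "")
    if user_type not in USER_TYPES:
        raise ValueError(f"unknown user type: {user_type!r}")
    company_id = claims.get("custom:company_id", "")
    if not company_id:
        raise ValueError("missing company id")
    raw = claims.get("custom:permissions", "")
    permissions = frozenset(p.strip() for p in raw.split(",") if p.strip())
    return UserContext(user_type, permissions, company_id)
```

Rejecting unknown user types at this boundary means the rest of the application only ever sees one of the eight known values.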
Lesson 2: Design security "per page, per API"
The challenge: minimizing data-leak risk
In B2B SaaS, competitors' data exists within the same system. A situation where "sawmill A can view market B's inventory information" is a fatal business risk.
The implementation: a defense-in-depth architecture
# Authentication and authorization with Flask + AWS Cognito
from functools import wraps
from flask import request, jsonify

def require_user_type(*allowed_types):
    """Access-control decorator keyed on the user attribute."""
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            user_type = get_user_type_from_token(request.headers.get('Authorization'))
            if user_type not in allowed_types:
                return jsonify({'error': 'Forbidden'}), 403
            return f(*args, **kwargs)
        return decorated_function
    return decorator

@app.route('/api/inventory', methods=['GET'])
@require_user_type('lumber_mill', 'market', 'manufacturer')
def get_inventory():
    """Get inventory (accessible to sawmills, markets, and manufacturers only)."""
    user_company_id = get_company_id_from_token()
    # Fetch only the caller's own inventory (other companies' data is never returned)
    inventory = Inventory.query.filter_by(company_id=user_company_id).all()
    return jsonify([inv.to_dict() for inv in inventory])
The 3-layer structure of security measures:
- Authentication layer: AWS Cognito JWT-token verification
- Authorization layer: per-API access control per user attribute
- Data layer: Row-Level Security (RLS) by company ID
Lesson: Security isn't "authenticate and you're OK" but is designed in multiple layers per page, per API, per data. Especially in B2B SaaS, protecting competitors' data is the top priority.
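The data layer can be illustrated with a minimal sketch in which every query is forced through an object bound to one company ID. This is an assumption-laden illustration, not the product's implementation: it uses an in-memory SQLite database in place of the production database, and the InventoryRepository class, the make_demo_db helper, and the table layout are all hypothetical.

```python
import sqlite3


class InventoryRepository:
    """Hypothetical repository: every read is scoped to one company_id."""

    def __init__(self, conn: sqlite3.Connection, company_id: str):
        if not company_id:
            raise ValueError("company_id is required")
        self._conn = conn
        self._company_id = company_id

    def list_items(self) -> list:
        rows = self._conn.execute(
            "SELECT product, quantity FROM inventory "
            "WHERE company_id = ? ORDER BY product",
            (self._company_id,),
        )
        return [{"product": p, "quantity": q} for p, q in rows]

    def get_item(self, item_id: int):
        # Filtering on company_id as well as id makes another tenant's row
        # indistinguishable from a missing row.
        row = self._conn.execute(
            "SELECT product, quantity FROM inventory "
            "WHERE id = ? AND company_id = ?",
            (item_id, self._company_id),
        ).fetchone()
        return None if row is None else {"product": row[0], "quantity": row[1]}


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway database holding rows for two companies."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (id INTEGER PRIMARY KEY, "
        "company_id TEXT NOT NULL, product TEXT NOT NULL, "
        "quantity INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO inventory (id, company_id, product, quantity) "
        "VALUES (?, ?, ?, ?)",
        [(1, "company-a", "cedar board", 120),
         (2, "company-b", "cypress post", 40)],
    )
    conn.commit()
    return conn
```

Because the company ID is fixed when the repository is constructed, a handler cannot forget the filter on an individual query.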
Lesson 3: Thoroughly validate on "both front and back"
The challenge: preventing invalid data from mixing in
In B2B SaaS, data accuracy directly ties to the trustworthiness of inter-company transactions. If an invalid price, inventory count, or order quantity mixes in, the whole business collapses.
The implementation: double validation with zod + Marshmallow
Frontend (TypeScript + zod):
import { z } from "zod";

const OrderSchema = z.object({
  product_id: z.string().uuid(),
  quantity: z.number().int().positive().max(10000),
  unit_price: z.number().positive().max(1000000),
  delivery_date: z.string().datetime(),
});

type Order = z.infer<typeof OrderSchema>;

// Validate before submitting the form
const handleSubmit = (data: Order) => {
  const result = OrderSchema.safeParse(data);
  if (!result.success) {
    // issues[] holds one entry per failed check; show the first message
    alert(result.error.issues[0].message);
    return;
  }
  // Send to the API
};
Backend (Python + Marshmallow):
from marshmallow import Schema, fields, validate, ValidationError

class OrderSchema(Schema):
    product_id = fields.UUID(required=True)
    quantity = fields.Integer(required=True, validate=validate.Range(min=1, max=10000))
    # min_inclusive=False rejects 0, matching .positive() on the frontend
    unit_price = fields.Decimal(required=True, validate=validate.Range(min=0, min_inclusive=False, max=1000000))
    delivery_date = fields.DateTime(required=True)

@app.route('/api/orders', methods=['POST'])
def create_order():
    schema = OrderSchema()
    try:
        data = schema.load(request.json)
    except ValidationError as err:
        return jsonify({'errors': err.messages}), 400
    # Save to the DB
    order = Order(**data)
    db.session.add(order)
    db.session.commit()
    return jsonify(order.to_dict()), 201
Lesson: Frontend validation is for "UX improvement"; backend validation is for "security and data consistency." Doing both rigorously keeps invalid data out of the system, because the backend still rejects requests that bypass the browser.
Lesson 4: Introduce IaC (Infrastructure as Code) "from day one"
The challenge: securing the reproducibility of a complex AWS environment
In B2B SaaS, you combine a broad range of AWS services like VPC, ECS, RDS, Cognito, ALB, CloudFront, and SES. Manual construction has no reproducibility, and recovery on failure is difficult.
The implementation: full automation with Terraform
# VPC settings
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "${var.project_name}-vpc"
    Environment = var.environment
  }
}

# ECS on Fargate
resource "aws_ecs_cluster" "main" {
  name = "${var.project_name}-cluster"
}

resource "aws_ecs_service" "app" {
  name            = "${var.project_name}-app"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.environment == "production" ? 3 : 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = aws_subnet.private[*].id
    security_groups = [aws_security_group.ecs_tasks.id]
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.app.arn
    container_name   = "app"
    container_port   = 8080
  }
}

# AWS Cognito User Pool
resource "aws_cognito_user_pool" "main" {
  name = "${var.project_name}-user-pool"

  # Define the 8 user attributes through custom attributes
  schema {
    name                = "user_type"
    attribute_data_type = "String"
    mutable             = true
  }

  schema {
    name                = "company_id"
    attribute_data_type = "String"
    mutable             = false
  }
}
The merits of IaC:
- Reproducibility: build the infrastructure in one shot with terraform apply
- Version control: track the change history with Git management
- Environment separation: separate dev/staging/production with terraform workspace
- Disaster recovery: recover the entire infrastructure in minutes
Lesson: IaC should be introduced not "later" but "from day one." Building by hand first and migrating to IaC afterward accumulates enormous technical debt.
Lesson 5: Optimize performance with "asynchronous processing"
The challenge: UX degradation from heavy processing
In B2B SaaS, heavy processing like PDF/Excel generation of "quotes, delivery notes, invoices" and "turning existing Excel into a DB" frequently occurs. With synchronous processing, the user waits tens of seconds, and UX degrades markedly.
The implementation: thread parallelism with ThreadPoolExecutor + event-driven Lambda
Heavy processing comes in two kinds, and I chose a different approach for each. Document generation (order forms, delivery notes, invoices) is CPU/IO-bound and should finish within the request, so I run it in parallel threads with ThreadPoolExecutor. Excel-to-DB ingestion takes an unpredictable amount of time, so I offload it entirely to a Lambda triggered by an S3 upload.
from concurrent.futures import ThreadPoolExecutor, as_completed
from sqlalchemy.orm import selectinload

def parallel_create_documents(app, order_id: str) -> None:
    """Generate the order form, delivery note, and invoice concurrently."""
    tasks = [create_order_form, create_delivery_note, create_invoice]

    def run(task):
        with app.app_context():  # set up an app context per thread
            doc = (
                Document.query
                .options(selectinload(Document.lines))  # avoid N+1 with select-in loading
                .filter_by(order_id=order_id)
                .with_for_update()  # row lock to prevent competing generation
                .one()
            )
            return task(doc)  # Excel via openpyxl, then PDF via LibreOffice

    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(run, t) for t in tasks]
        for f in as_completed(futures):
            f.result()  # re-raise the first exception
Design points:
- The three document types are generated in parallel. Re-establish Flask's app context in each thread, prevent N+1 with selectinload, and prevent contention with with_for_update.
- Excel ingestion is separated to Lambda. Open openpyxl with read_only=True and bulk-INSERT with execute_values. Also implement a 50MB cap and formula-injection neutralization (CWE-1236).
- The front waits for completion with exponential-backoff + Page-Visibility-aware polling, so the admin panel doesn't freeze during heavy processing and doesn't waste the API in a background tab.
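The formula-injection neutralization mentioned in the design points can be sketched as follows. The article does not show how the product does it, so this uses one common mitigation for CWE-1236 as an assumption: prefixing any text cell that starts with a formula-trigger character with an apostrophe. The function names neutralize_cell and neutralize_row are hypothetical.

```python
# Leading characters that can make a spreadsheet application treat a text
# cell as a formula (assumed list, based on common CSV-injection guidance).
FORMULA_TRIGGERS = ("=", "+", "-", "@", "\t", "\r")


def neutralize_cell(value):
    """Return a value that is safe to store and later export to a spreadsheet."""
    if not isinstance(value, str):
        return value  # numbers, dates, and None are not parsed as formulas
    if value.startswith(FORMULA_TRIGGERS):
        return "'" + value  # a leading apostrophe forces plain text
    return value


def neutralize_row(row):
    """Apply neutralize_cell to every cell of one ingested row."""
    return [neutralize_cell(v) for v in row]
```

Only strings are rewritten, so a numeric cell holding a negative number passes through unchanged, while the text "-5" is prefixed.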
Lesson: "Making it asynchronous" can mean several things. Run CPU/IO-bound work that should finish within the request in parallel threads, and move work with unpredictable duration out of the server into event-driven processing. Choosing the approach by the nature of the work is what makes it production quality.
Lesson 6: Build payments idempotently with "Stripe Connect"
The challenge: recurring billing + transaction settlement + duplicate delivery
This product's revenue model is monthly subscription, but not only that. Since it's a marketplace where companies transact with each other, settlement per transaction occurs in addition to recurring billing. So instead of plain Stripe, I adopted Stripe Connect. In payments, the requirement is to absolutely never cause double charge, miss, or amount tampering.
The implementation: Stripe Connect (server-side amount resolution + idempotency key)
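import hashlib
import os

import stripe

stripe.api_key = os.getenv("STRIPE_SECRET_KEY")

def create_subscription(customer: User, plan_id: str) -> dict:
    # Build the idempotency key from a hash of the request content, so that
    # re-sending the same content cannot cause a double charge
    content = f"{customer.id}:{plan_id}".encode()
    idempotency_key = "sub_" + hashlib.sha256(content).hexdigest()
    # The amount is resolved server-side from plan_id
    # (never trust a client-supplied amount, to prevent tampering)
    subscription = stripe.Subscription.create(
        customer=customer.stripe_customer_id,
        items=[{"price": plan_id}],
        payment_behavior="default_incomplete",
        expand=["latest_invoice.payment_intent"],
        idempotency_key=idempotency_key,
    )
    return {"subscription_id": subscription.id}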
Protect idempotency in two layers, backed by an outbox:
- Layer 1 (Stripe API): weave "a hash of the content" into the idempotency key, so same-content re-sends are safe and content changes go to a different key.
- Layer 2 (Webhook): receive webhooks not in the main Flask application but in 3 Lambdas, and eliminate duplicate events with DynamoDB's conditional write (attribute_not_exists, 30-day TTL). If the table name is unset, stop startup (fail-closed).
- Outbox: write the billing adjustment to the outbox in the same DB transaction as the business transaction, and a separate Lambda reliably sends it to Stripe.
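The deduplication idea in Layer 2 can be sketched as "claim the event ID first, and process only if the claim succeeded." To keep the example self-contained, it swaps in an in-memory SQLite table with a primary-key constraint in place of DynamoDB's conditional write, so it illustrates the pattern rather than the actual Lambda code. The names open_store, claim_event, and handle_webhook are hypothetical.

```python
import sqlite3
import time

TTL_SECONDS = 30 * 24 * 60 * 60  # mirrors the 30-day retention described above


def open_store() -> sqlite3.Connection:
    """Create the table that records which event IDs were already claimed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_events ("
        "event_id TEXT PRIMARY KEY, expires_at INTEGER NOT NULL)"
    )
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str, now=None) -> bool:
    """Return True only for the first delivery of event_id."""
    now = int(time.time()) if now is None else now
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_events (event_id, expires_at) "
                "VALUES (?, ?)",
                (event_id, now + TTL_SECONDS),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the ID already exists: this is a duplicate delivery


def handle_webhook(conn: sqlite3.Connection, event: dict, handled: list) -> str:
    """Process an event at most once; 'handled' stands in for business logic."""
    if not claim_event(conn, event["id"]):
        return "duplicate"
    handled.append(event["type"])
    return "processed"
```

The expires_at column is only recorded here; this sketch does not delete expired rows, whereas a TTL-enabled table would remove them automatically.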
Lesson: Payments aren't "done by calling Stripe." Only by doing all the way to server-side amount resolution + an idempotency key + webhook deduplication + an outbox does it withstand retries and duplicate delivery. Bugs around billing directly tie to "revenue loss" and "loss of trust."
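The outbox part of this design can be illustrated with a minimal sketch, again using in-memory SQLite as a stand-in for the production database. The table layout and the names open_db, place_order, and relay_outbox are assumptions for illustration, and the send callback stands in for the separate process that calls Stripe.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create a throwaway database with an orders table and an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, sent INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def place_order(conn: sqlite3.Connection, order_id: str, amount: int) -> None:
    """Write the order and its billing message in one transaction."""
    with conn:  # both INSERTs commit together, or neither does
        conn.execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount)
        )
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"order_id": order_id, "amount": amount}),),
        )


def relay_outbox(conn: sqlite3.Connection, send) -> int:
    """Deliver unsent messages in insertion order; return how many were sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    delivered = 0
    for row_id, payload in rows:
        send(json.loads(payload))  # if this raises, the row stays unsent
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
        delivered += 1
    return delivered
```

A message is marked as sent only after send returns, so a crash in between leads to a re-send rather than a loss. That at-least-once behavior is why the receiving side still needs an idempotency key.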
Lesson 7: Design CI/CD as "the automation of quality assurance"
The challenge: human error at deploy time
In B2B SaaS, downtime means "the stoppage of inter-company transactions." If the system stops from a mistake at deploy time, the customer companies' businesses stop.
The implementation: GitHub Actions + ECS auto-deploy
# .github/workflows/deploy.yml
name: Deploy to ECS
on:
  push:
    branches: [main]
jobs:
  test-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      # Run linters
      - name: Run ESLint
        run: npm run lint
      - name: Run Flake8
        run: flake8 app/
      # Vulnerability scanning
      - name: Security Audit
        run: |
          npm audit --omit=dev
          pip-audit
      # Run tests
      - name: Run Tests
        run: pytest tests/
      # Build the Docker image, tagged with the ECR registry so the push targets ECR
      - name: Build Docker Image
        run: docker build -t ${{ secrets.ECR_REGISTRY }}/my-app:${{ github.sha }} .
      # Push to ECR
      - name: Push to ECR
        run: |
          aws ecr get-login-password --region ap-northeast-1 | docker login --username AWS --password-stdin ${{ secrets.ECR_REGISTRY }}
          docker push ${{ secrets.ECR_REGISTRY }}/my-app:${{ github.sha }}
      # Deploy to ECS
      - name: Deploy to ECS
        run: |
          aws ecs update-service --cluster my-cluster --service my-app --force-new-deployment
The effects of CI/CD:
- Automatic testing: run all tests per commit, find bugs early
- Automatic deployment: deploy completes with just git push
- Rollback: on deploy failure, automatically roll back to the previous version
Lesson: Design CI/CD not as "the automation of deployment" but as "the automation of quality assurance." By thoroughly doing linters, vulnerability scanning, and tests, you minimize production bugs.
Summary: the 7 golden rules of B2B SaaS development
- Select technology by working backward from "the industry's complexity"
- Design security "per page, per API"
- Thoroughly validate on "both front and back"
- Introduce IaC (Infrastructure as Code) "from day one"
- Optimize performance with "asynchronous processing"
- Build payments idempotently with "Stripe Connect"
- Design CI/CD as "the automation of quality assurance"
These lessons are the result of two years of trial and error. B2B SaaS development needs a completely different design philosophy from consumer apps. Giving the highest priority to the trustworthiness, security, and scalability of inter-company transactions, and having a consistent strategy from technology selection to architecture design, is the key to success.
It's achievable in your project too
If you're facing "DX of a legacy industry," "B2B SaaS development," or "agonizing over technology selection," I can support you. From requirements definition to design, implementation, and infrastructure construction, I can handle the whole process end to end.
I offer a free technical consultation (30 minutes), so please feel free to contact me.