The complete guide to commissioning system development: how to choose an outsourcing partner without failing, market rates, and in-house vs outsource from the decision-maker's view

Let me state the conclusion first. The biggest cause of failure when commissioning system development is not a lack of technical skill. It is "ambiguous requirements definition" and "naive estimation." The first thing a buyer should do is not compare development companies, but (1) determine whether the system truly needs to be built (would SaaS suffice?), (2) understand the cost structure, and (3) become able to articulate and demand quality and security. Once these three are done, about 80% of the commissioning's success or failure is already decided.

This article provides a decision map that helps buyers (executives, business owners, IT staff) avoid losing out. As its "single source of truth" it uses the design decisions of projects I actually worked on: a B2B SaaS (lumber-industry DX) that won a METI Minister's Award, a payments platform that has maintained 0 double charges in production operation, an enterprise AI platform for a major domestic broadcaster, and more. Each theme is covered in detail in a dedicated article, so start reading from wherever you need.

A premise on numbers: the market rates in the text (person-month unit prices, development-cost ranges) are guides based on publicly available general survey/market information. Meanwhile, the quantitative values tied to my real projects (221 endpoints, 0 double charges in production, 4 rounds of security audits, etc.) are measured values verifiable from the repository, and I don't assert business ROI (revenue, % effort reduction) since that requires the client's real data. The policy is to not fabricate.

1. Why does commissioning system development fail "half" the time?

In the industry, the failure rate of system-development projects is said to be roughly half. Surveys like Nikkei BP Consulting's have repeatedly reported that only about half of projects met their initial QCD (quality, cost, delivery). What matters is the fact that most of that failure is decided "before writing code."

Let me narrow the root causes to three from my field experience.

Failure pattern	Typical symptom	Root cause
Ambiguous requirements definition	"It's different from what I imagined," "the spec swells later"	Starting without articulating the business flow and 'who, what, why'
Naive estimation	Budget overrun, extra costs, delivery delays	Holding effort with uncertainty at a fixed price / missing non-functional requirements (performance, security, operations)
Dumping and neglect	It got built but isn't used, breaks down in operation	The buyer steps down from decision-making, and it diverges from the field's operations

All three can be greatly mitigated by the buyer's preparation. Conversely, no matter how carefully you compare development companies' technical skill, skipping this preparation leads to failure. Per Gartner, B2B buyers spend about 80% of the purchasing process on self-learning before contacting a salesperson. In commissioning too, whether the buyer has clear decision criteria determines the outcome.

2. The most important pre-commissioning judgment: before building, knock out the "option of not building"

The most expensive mistake is building from scratch an operation that SaaS can already cover. I believe the first job of a trustworthy development partner is to advise honestly whether a requirement should be solved with off-the-shelf SaaS or built.

The judgment can be made with a simple flow.

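Is the feature a core of the business's differentiation?
├─ No  → First look for SaaS / off-the-shelf products (don't build accounting, attendance, CRM, MA, etc.)
└─ Yes → Can an off-the-shelf product meet the requirements?
          ├─ Yes → Adopt SaaS, and integrate/extend only the missing parts
          └─ No (industry-specific complex commercial flows, unique operational constraints) → Scratch / hybrid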

The lumber-industry DX I worked on was the archetype of the last branch, where off-the-shelf products cannot meet the requirements. The multi-stage commercial flow of "forestry → market → sawmill → precut → builder → manufacturer," permissions that differ by industry role, cross-company transactions, and tenant separation cannot be expressed by off-the-shelf groupware or e-commerce, so building from scratch was justified. Even so, the project took a hybrid strategy: it used off-the-shelf foundations to the maximum (AWS Cognito for auth, Stripe Connect for payments) and limited what was built in-house to industry-specific logic only.

This "in-house vs outsource" and "SaaS vs scratch" decision affects the commissioning's success or failure more than any other. The decision framework is detailed in a separate article.
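The decision flow above can be sketched as a small function. This is only an illustration of the two questions in the flow; the type and function names (`FeatureAssessment`, `recommendApproach`) are hypothetical and not taken from any of the projects mentioned.

```typescript
// Hypothetical names throughout: an illustrative encoding of the decision flow.
type Recommendation =
  | "use-saas"            // not a differentiator: buy, don't build
  | "saas-plus-extension" // differentiator, but off-the-shelf covers the requirements
  | "scratch-or-hybrid";  // differentiator that off-the-shelf cannot express

interface FeatureAssessment {
  isDifferentiationCore: boolean;
  offTheShelfMeetsRequirements: boolean;
}

function recommendApproach(a: FeatureAssessment): Recommendation {
  // First question: is this feature a core of the business's differentiation?
  if (!a.isDifferentiationCore) return "use-saas";
  // Second question: can an off-the-shelf product meet the requirements?
  return a.offTheShelfMeetsRequirements ? "saas-plus-extension" : "scratch-or-hybrid";
}

// Accounting is not a differentiator; an industry-specific commercial flow is.
const accounting = recommendApproach({
  isDifferentiationCore: false,
  offTheShelfMeetsRequirements: true,
});
const industryFlow = recommendApproach({
  isDifferentiationCore: true,
  offTheShelfMeetsRequirements: false,
});
console.log(accounting, industryFlow); // use-saas scratch-or-hybrid
```

In practice the two booleans are the hard part: they come out of requirements definition, not out of code. The sketch only shows that the order of the questions matters.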

3. Cost structure and market rate: an estimate is decided by "person-months"

What buyers feel most anxious about is cost. Japanese system-development cost is, almost without exception, decided by person-months.

Development cost ≈ person-month unit price × headcount × duration (months)

And about 80% of that cost is labor. So "expensive/cheap" is essentially the question of "how many people, for how many months, at what unit price." A guide to person-month unit prices is roughly as follows.

Role/level	Person-month unit-price guide
Junior to mid-level engineer (general contract)	600k–1M yen
Senior, tech-lead class	1M–1.6M yen
Big SIer/consultant	1.5M yen+

A guide to total-amount ranges by development scale is as follows (general market rates that vary greatly with requirements).

Development type	Cost-range guide	Example
SaaS introduction/configuration	~100k yen or so	Introducing/initial-setting off-the-shelf tools
Customization/small-scale	0.5–3M yen	Extending off-the-shelf, a small business app
Scratch, mid-to-large	3M yen–tens of millions	New business-system development, B2B SaaS

The key to evaluating an estimate is not the size of the amount but its "breakdown" and "the presence of non-functional requirements." A cheap estimate usually omits the following.

Non-functional requirements: performance, availability, security, monitoring/operations, disaster recovery (DR)
Quality assurance: tests, code review, security audit
Operations/maintenance: post-release maintenance cost (a guide of around 15% of the development cost per year)

The failure pattern "the initial estimate was cheap, but it broke down in operation" is mostly caused by omitting these items. How to read estimates and optimize cost is detailed in a dedicated article.
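As a worked example of the formula and the maintenance guide in this section, the sketch below computes a development cost and the yearly maintenance that a cheap estimate tends to leave out. The input figures (2 people, 4 months, 1M yen per person-month) and the three-year total are assumptions chosen for illustration, not numbers from a real project; the interface and function names are hypothetical.

```typescript
// Hypothetical names; all amounts are in yen.
interface EstimateInput {
  unitPricePerPersonMonth: number;      // e.g. 1_000_000 for a senior engineer
  headcount: number;
  months: number;
  annualMaintenanceRatePercent: number; // the guide in the text is around 15
}

interface EstimateSummary {
  developmentCost: number;
  annualMaintenanceCost: number;
  threeYearTotal: number; // assumption: a three-year horizon for comparison
}

function summarizeEstimate(e: EstimateInput): EstimateSummary {
  // Development cost = person-month unit price x headcount x duration (months)
  const developmentCost = e.unitPricePerPersonMonth * e.headcount * e.months;
  // Maintenance is expressed as a percentage of development cost per year.
  const annualMaintenanceCost = Math.round(
    (developmentCost * e.annualMaintenanceRatePercent) / 100,
  );
  return {
    developmentCost,
    annualMaintenanceCost,
    threeYearTotal: developmentCost + annualMaintenanceCost * 3,
  };
}

const estimate = summarizeEstimate({
  unitPricePerPersonMonth: 1_000_000,
  headcount: 2,
  months: 4,
  annualMaintenanceRatePercent: 15,
});
console.log(estimate);
// developmentCost: 8000000, annualMaintenanceCost: 1200000, threeYearTotal: 11600000
```

Comparing estimates on the multi-year total rather than on the development cost alone is one way to make an omitted maintenance line visible.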

4. How to choose an outsourcing partner: discern by "technical skill × track record"

When Japanese B2B buyers choose a development company, they prioritize technical skill and track record. Before lining up companies on an aggregator (Hatchu-navi, Aimitsu, etc.), evaluate the counterpart with the following checklist.

Outsourcing-partner evaluation checklist

Do they have published concrete track records (can they explain the project's scale, tech stack, and results)
Can they propose not building (can they honestly say where SaaS suffices? Answering "we can build it" to everything is a danger sign)
Do they ask about non-functional requirements themselves (do they nail down performance, security, operations, DR in requirements definition)
Can they articulate quality gates (the specifics of tests, type safety, CI/CD, security audit)
Do they take responsibility through operations/maintenance (is it build-and-done)
Is the estimate's breakdown transparent (the basis for person-months, the cost of non-functional requirements)

Answering the natural anxiety of "one-person × generative AI, is it OK?"

In recent years, more work has been commissioned to developers like me who work as one person (or a small team) leveraging generative AI. The appeal is "fast and cheap," but buyers naturally worry: "Is quality OK with one person?" and "Won't it become person-dependent?" These are the right questions.

My answer is that what makes it "fast and cheap" is generative AI, but what ensures 'safety' is human verification gates. Generative AI raises the speed of writing code, but it becomes production quality only by passing its output through a multi-layered mechanism that doesn't trust it as-is — validation at type-safe boundaries, automated tests, static analysis, security scanning, and a third-party penetration test.

This is not abstract talk. In the lumber-industry DX, four rounds of security audits, including a third-party penetration test across 15 real roles, demonstrated 0 missing-authorization findings across all 221 endpoints. In the payments platform, a code structure built on idempotency and atomic transactions keeps double charges at 0 in production operation. Not "fast because AI wrote it," but "safe even when built fast, because it is hardened by verification": this is the basis of quality to present to buyers.
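To make "missing authorization" concrete, here is a minimal sketch of one kind of mechanical check: every registered endpoint must have an explicit policy entry, and anything without one is denied by default. This is not the audit tooling used in the projects above. The role names, routes, and the `policy` table are invented for illustration, and a real audit would also exercise the running system with each role.

```typescript
// Hypothetical roles and routes, for illustration only.
type Role = "forestry" | "sawmill" | "builder";

// Explicit allow-list per endpoint. An endpoint absent from the map is denied.
const policy = new Map<string, ReadonlySet<Role>>([
  ["GET /orders", new Set<Role>(["sawmill", "builder"])],
  ["POST /orders", new Set<Role>(["builder"])],
  ["GET /inventory", new Set<Role>(["forestry", "sawmill"])],
]);

// Deny by default: an unknown endpoint or an unlisted role gets `false`.
function isAllowed(endpoint: string, role: Role): boolean {
  return policy.get(endpoint)?.has(role) ?? false;
}

// Audit helper: list registered endpoints that have no policy entry at all.
function findEndpointsWithoutPolicy(registered: readonly string[]): string[] {
  return registered.filter((endpoint) => !policy.has(endpoint));
}

const registeredEndpoints = [
  "GET /orders",
  "POST /orders",
  "GET /inventory",
  "POST /invoices", // deliberately left without a policy entry
];
const missingPolicies = findEndpointsWithoutPolicy(registeredEndpoints);
console.log(missingPolicies); // [ 'POST /invoices' ]
```

Running a check like this in CI and failing the build when the list is non-empty is one way to turn "we reviewed permissions" into a repeatable gate.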

5. Requirements definition that doesn't fail, and "staged introduction"

Since ambiguous requirements definition is the biggest failure factor, you should pour the most energy here. That said, no buyer can write perfect requirements from the start. What matters is to "not do a full migration all at once."

When systematizing legacy operations (phone, fax, Excel), I recommend the following staged introduction.

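Phase 1 (months 1–3): information sharing and visualization only (using the existing Excel alongside is fine)
   ↓  The field comes to feel "this is convenient"
Phase 2 (months 4–6): migrate some operations (ordering, etc.) (Excel still used alongside)
   ↓  Keep both running to lower migration risk
Phase 3 (months 7–12): full migration (phase out the old flow in stages)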

If you do a full migration all at once, the field gets confused and the result is an "unused system." Setting a coexistence period to build up the field's acceptance is what decides success or failure, especially in legacy-industry DX. In the lumber DX too, I built a path to upload existing Excel files as-is and turn them into a database, designing it so the field could migrate without discarding the tools they were used to. How to start legacy-operation DX is detailed in a dedicated article.

6. Discerning quality and security: the "four structures" buyers should demand

The difference between "something that works" and "a system that withstands production operations and a third party's attacks and audits" isn't visible from the outside. Buyers should ask whether the following four are built in as a mechanism (structure). These are also the differentiators I consistently build into every project.

The structure to demand	Why it matters	Concrete example
Idempotency	Even if processing runs twice due to network drops or retries, the result converges to once	Payment double-charge prevention, webhook deduplication
Type safety / boundary validation	Make invalid data "unrepresentable" to structurally reduce bugs	Validate external input with TypeScript/Zod, ban `any`
Tests and CI/CD	Maintain quality mechanically without relying on human review	Enforce automated tests, type checking, security scanning in CI
Security audit	A trail of closing holes from the attacker's/auditor's perspective	Penetration test, a ledger of accepted risks
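The "type safety / boundary validation" row can be illustrated with a dependency-free sketch. The table names Zod; to keep this example self-contained it uses a hand-written validator instead, which is a stand-in and not the author's implementation. The `OrderInput` shape and its field names are assumptions.

```typescript
// Hypothetical order payload; field names are assumptions for illustration.
interface OrderInput {
  productId: string;
  quantity: number; // positive integer
}

type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Accepts `unknown` rather than `any`, so callers cannot use the data
// until it has passed validation at the boundary.
function parseOrderInput(raw: unknown): ParseResult<OrderInput> {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const record = raw as Record<string, unknown>;
  const productId = record["productId"];
  const quantity = record["quantity"];

  const errors: string[] = [];
  if (typeof productId !== "string" || productId.length === 0) {
    errors.push("productId must be a non-empty string");
  }
  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity <= 0) {
    errors.push("quantity must be a positive integer");
  }
  // The repeated typeof checks let the compiler narrow the types without casts.
  if (errors.length > 0 || typeof productId !== "string" || typeof quantity !== "number") {
    return { ok: false, errors };
  }
  return { ok: true, value: { productId, quantity } };
}

const validOrder = parseOrderInput(JSON.parse('{"productId":"LUM-001","quantity":3}'));
const invalidOrder = parseOrderInput(JSON.parse('{"productId":"","quantity":"3"}'));
console.log(validOrder.ok, invalidOrder.ok); // true false
```

Code past this boundary only ever sees `OrderInput`, so an empty product ID or a string quantity cannot reach business logic.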

For example, in payments, you guarantee correctness with the code structure of "recompute the amount server-side," "prevent double execution with an idempotency key," and "deduplicate webhooks with conditional writes." Rely on operational carefulness and an incident will eventually happen. A concrete checklist for commissioning payment/billing systems is summarized in a dedicated article.
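The three payment structures named above can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the production code of the payments platform: the `Map` and `Set` stand in for a database with atomic conditional writes, and `PRICE_TABLE`, `charge`, and `handleWebhook` are hypothetical names.

```typescript
// Hypothetical in-memory sketch. In a real system the Map/Set below would be
// database tables written with atomic conditional writes or unique constraints.
const PRICE_TABLE: ReadonlyMap<string, number> = new Map([
  ["plan-basic", 5_000], // yen; the server-side source of truth for prices
]);

interface ChargeRequest {
  idempotencyKey: string;
  productId: string;
  quantity: number;
  clientAmount?: number; // ignored on purpose: the server recomputes the amount
}

interface ChargeResult {
  chargeId: string;
  amount: number;
}

const chargesByKey = new Map<string, ChargeResult>();
const processedWebhookEvents = new Set<string>();
let chargeCounter = 0;

function charge(req: ChargeRequest): ChargeResult {
  // 1. Prevent double execution: a retried request returns the stored result.
  const existing = chargesByKey.get(req.idempotencyKey);
  if (existing !== undefined) return existing;

  // 2. Recompute the amount server-side; never trust the client's figure.
  const unitPrice = PRICE_TABLE.get(req.productId);
  if (unitPrice === undefined) throw new Error(`unknown product: ${req.productId}`);

  chargeCounter += 1;
  const result: ChargeResult = {
    chargeId: `ch_${chargeCounter}`,
    amount: unitPrice * req.quantity,
  };
  chargesByKey.set(req.idempotencyKey, result);
  return result;
}

// 3. Deduplicate webhooks: process an event only if its ID was not seen before.
function handleWebhook(eventId: string, apply: () => void): boolean {
  if (processedWebhookEvents.has(eventId)) return false; // duplicate delivery
  processedWebhookEvents.add(eventId);
  apply();
  return true;
}

const firstAttempt = charge({ idempotencyKey: "order-42", productId: "plan-basic", quantity: 2, clientAmount: 1 });
const retryAttempt = charge({ idempotencyKey: "order-42", productId: "plan-basic", quantity: 2 });
console.log(firstAttempt.amount, firstAttempt.chargeId === retryAttempt.chargeId, chargesByKey.size);
// 10000 true 1
```

In single-threaded memory the "check, then write" steps cannot interleave. With a real database and concurrent requests, the check and the write must be one atomic operation (for example a conditional write or a unique constraint on the key), otherwise two retries arriving together can both pass the check.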

7. "One-person × generative AI" as a commissioning option

Finally, on the commissioning structure. Traditionally it was "commission a big SIer at scale" or "commission a contracting firm holding a multi-person team." To that, a third option has become realistic: one person (a small team) × generative AI handling everything from requirements definition through infrastructure, security, and operations one-stop.

This option suits cases like the following.

New ventures, MVPs, and business systems where speed and cost matter
Wanting to turn requirements quickly in direct dialogue with the decision-maker
Wanting to delegate it full-stack end-to-end and cut inter-company coordination cost

Conversely, projects premised on ultra-large scale and coordination among many stakeholders suit a big organization's structure. "Fast and cheap" and "a large-scale structure" are a trade-off, and a counterpart who can say so honestly is exactly the one you can trust.

A proposal to buyers: I handle requirements definition, design, frontend, backend, infrastructure (AWS/GCP/Terraform), security, and operations alone end-to-end, and ensure quality with the verification gates of "type safety, tests, security audit, idempotency." A METI Minister's Award-winning B2B SaaS, a payments platform with 0 double charges in production, and an AI video platform ranked #1 in the CrowdWorks contract ranking are all products of the same policy: "build fast, but harden with verification."

FAQ

Q. How much is the market rate for system development?

Cost is decided by "person-month unit price × people × period," and about 80% of the cost is labor. As a guide, SaaS introduction is up to ~100k yen, off-the-shelf customization or a small app is 0.5–3M yen, and scratch development of a business system or B2B SaaS is 3M yen–tens of millions. But it varies greatly with non-functional requirements like performance, security, and operations. Suspect that a too-cheap estimate may omit these.

Q. Is in-house or outsource better?

If it's a feature that becomes a core of the business's differentiation and needs to be grown over the long term, lean in-house. If it's highly specialized and a one-time/spot build, outsource has the edge. But for many SMBs, first confirming whether off-the-shelf SaaS suffices and outsourcing only the missing parts (or running it end-to-end with one-person × AI) is the most cost-efficient.

Q. If I commission a single developer, is quality and person-dependence OK?

What makes it "fast and cheap" is generative AI, but what ensures "safety" is human verification gates. By passing through automated tests, type-safe boundary validation, CI/CD, and security audits (including penetration tests) in multiple layers, you can achieve production quality even with a small team. When commissioning, confirm whether they can concretely explain "what quality gates they pass through."

Q. To not fail at commissioning, what should buyers prepare?

Three things: (1) discern whether you truly need to build it (won't SaaS suffice), (2) understand the cost structure (person-months, non-functional requirements), and (3) articulate the quality and security you demand. With these done, 80% of the commissioning's success is decided. Not doing a full migration all at once but introducing in stages is also important.

Q. Can legacy operations (phone, fax, Excel) be systematized?

They can. The key is "don't do a full migration all at once." Start from information sharing and visualization, and after the field feels the convenience, migrate operations like ordering in stages. Designing so the field can migrate without discarding the tools they're used to — like preparing a path to import existing Excel as-is — is the condition for success.

Summary: 80% of the commissioning's success is decided "before writing code"

To not lose out on commissioning system development, here are the five points buyers should grasp.

The true cause of failure isn't technical skill but ambiguous requirements definition and naive estimation — crush these first.
Before building, always knock out the "option of not building (SaaS)" — limit scratch to the core of differentiation.
Cost is decided by person-months, and 80% of the cost is labor — look at the "breakdown and non-functional requirements" rather than the amount.
Choose the outsourcing partner by "technical skill × track record," and choose one you can demand quality gates of.
One-person × generative AI is valuable not just for "fast and cheap" but by ensuring "safety" with verification gates.

I take on legacy-industry DX, new development and rebuilding of B2B SaaS, payment/billing platforms, and business adoption of generative AI, covering everything from requirements definition through infrastructure, security, and operations as a one-stop service at this level of quality. If you're considering commissioning, let's first organize the current issues and work out together whether you should build at all.

Related articles

In-house vs outsource, SaaS vs scratch: a decision framework for SMBs and startups

Breaking out of 'stuck at PoC' when adopting generative AI for your business: the walls to production, and a guide to commissioning in-housing support

How to modernize legacy systems and the costs: a practical guide to crossing the '2025 cliff' and breaking free from phone, fax, and Excel

How to build a payment system that prevents double charges, and a procurement checklist: guaranteeing 'correctness' structurally with idempotency and atomicity

Also worth reading

Vercel vs Netlify vs Cloudflare vs AWS: a tech-selection guide for Next.js/frontend platforms [2026 · an honest comparison]

Vercel migration guide: practical steps to switch over from self-hosting (AWS/EC2/Netlify) with zero downtime

The cost and market rate of web-app vulnerability assessment [2026 edition] — price bands by method, how to read estimates, how to choose without failing