The speed of writing code with generative AI has risen to a level there's no going back from. Working alone with generative AI (Claude Code), I have built a METI Minister's Award-winning B2B SaaS and a payments platform that has had zero double charges in production. That is exactly why I can say it plainly: AI is fast, but left unchecked it mass-produces vulnerabilities at the same speed.
This isn't a gut feeling; it's a fact backed by research.
- Developers using AI assistants are more likely to write vulnerable code than those who don't. Moreover, they tend to overconfidently believe "I wrote secure code" (Stanford University, Perry et al. 2022).
- About 40% of code generated by GitHub Copilot contained exploitable vulnerabilities across 89 security-relevant scenarios (NYU, Pearce et al. 2021, "Asleep at the Keyboard?").
But this is not a "don't use AI" story. My conclusion is the opposite. AI's speed can be reconciled with production quality if you offset the debt with a verification gate before release. The key is the role split: delegate implementation to AI, and have humans hold spec decisions and verification. This article walks through that verification, meaning vulnerability assessment of AI-generated code, end to end: from the data and the new risks to real code and verification-gate design.
1. Why does AI generate vulnerabilities — three structural reasons
AI does not produce bugs because it "isn't smart." There are structural reasons, and understanding them narrows down what you need to check.
- It reproduces the flawed patterns of its training data. AI learns from the world's code. Since that includes a large amount of insecure code, it probabilistically outputs "common ways of writing" = "common vulnerable patterns." SQL string concatenation and direct insertion into dangerouslySetInnerHTML are typical.
- It doesn't know the context (your business rules). AI doesn't know "who may access this API." So it blithely omits authorization checks. This is #1 of the top 5 below.
- It hallucinates. It confidently generates non-existent functions and non-existent packages. This is the breeding ground for the next section's new risk, "slopsquatting."
In other words, AI's weaknesses are three: "reproduction of structural flaws," "lack of context," and "hallucination." Diagnosis targets these three.
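To make the first reason concrete, here is a minimal sketch of the string-concatenation pattern next to a parameterized alternative. The function names and the `{ text, values }` query shape are illustrative assumptions (many SQL drivers accept something similar), not code from a specific library.

```typescript
// Hypothetical query shape: SQL text with placeholders, values passed separately.
type ParameterizedQuery = { text: string; values: string[] };

// Vulnerable pattern: user input is spliced into the SQL text itself.
function buildUnsafeQuery(email: string): string {
  return `SELECT * FROM users WHERE email = '${email}'`;
}

// Safer pattern: the SQL text is constant; the input travels as a bound value.
function buildSafeQuery(email: string): ParameterizedQuery {
  return { text: "SELECT * FROM users WHERE email = $1", values: [email] };
}

const attackInput = "' OR '1'='1";
console.log(buildUnsafeQuery(attackInput)); // the input has rewritten the WHERE clause
console.log(buildSafeQuery(attackInput).text); // the SQL text is unaffected by the input
```

The point of the contrast is that a scanner can recognize the first shape in any codebase, which is why this class of flaw is a good fit for automated checks.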
2. The AI-specific new risk "slopsquatting" — hallucination becomes a supply-chain attack
Slopsquatting, a term coined in 2025, is a vulnerability specific to the AI era. It works as follows.
- The AI suggests a package name that does not exist: "install this."
- An attacker pre-registers that hallucinated package name on npm/PyPI and plants malicious code.
- When a developer runs npm install as the AI says, the malicious code is pulled into production.
This isn't a pipe dream. According to research, 19.7% of roughly 2.23 million package references generated by 16 generative models pointed to packages that do not exist. Moreover, the same hallucination is easy to reproduce (re-running the same prompt produced the same hallucinated name 43% of the time), so attackers can predict which name to register to land a hit (DevOps.com / Cloud Security Alliance).
This is the latest AI-era risk that hits directly at A03 Software Supply Chain Failures, newly established/expanded in OWASP Top 10:2025. The defenses are these three.
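# ① Before installing, verify that the suggested package really exists and looks legitimate
npm view <package-name> # be suspicious if it doesn't exist, is extremely new, or has abnormally few downloads
# ② Make the lockfile the single source of truth and prevent unexpected resolution (always use ci in CI)
npm ci # match package-lock.json exactly (ci, not install)
# ③ Scan dependencies on every run (slopsquatting shows up as "an increase in unfamiliar dependencies")
npm audit --audit-level=moderate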
osv-scanner scan --lockfile=package-lock.json # assumes the OSV-Scanner binary is already installed, rather than fetched by name through npx
An operational tip: when AI suggests adding a dependency, simply having a human check once whether that name really exists prevents most slopsquatting. In the package.json diff review, always look for newly added, unfamiliar packages.
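The diff review described above can be partly mechanized. The following is a minimal sketch, assuming you have the old and new package.json contents already parsed as objects; the function names and the example package name are made up for illustration.

```typescript
// Minimal shape of the parts of package.json this sketch looks at.
type PackageJson = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

// Collect every dependency name declared in a manifest.
function dependencyNames(pkg: PackageJson): Set<string> {
  return new Set([
    ...Object.keys(pkg.dependencies ?? {}),
    ...Object.keys(pkg.devDependencies ?? {}),
  ]);
}

// Return names present in `next` but not in `prev`, sorted for stable output.
function findAddedDependencies(prev: PackageJson, next: PackageJson): string[] {
  const known = dependencyNames(prev);
  return Array.from(dependencyNames(next))
    .filter((name) => !known.has(name))
    .sort();
}

const manifestBefore: PackageJson = { dependencies: { react: "^18.3.0" } };
const manifestAfter: PackageJson = {
  // "example-hallucinated-pkg" is a made-up name used only for this illustration.
  dependencies: { react: "^18.3.0", "example-hallucinated-pkg": "^1.0.0" },
};
console.log(findAddedDependencies(manifestBefore, manifestAfter)); // names a human should verify
```

A list like this does not decide anything by itself; it only makes sure that every newly added name is put in front of a reviewer.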
3. The top 5 vulnerabilities AI tends to mass-produce, and how to crush them
When you actually assess AI-generated code, the same holes appear again and again. Here are the top 5 in order of frequency, with before/after code.
① Missing authorization (IDOR / OWASP A01) — most frequent and most serious
AI writes authentication (whether you're logged in), but omits authorization (whether that person owns that data).
// ❌ What AI often writes: it checks login but never checks ownership → other users' data is visible
export async function GET(_req: Request, { params }: { params: { id: string } }) {
  const order = await db.order.findUnique({ where: { id: params.id } });
  return Response.json(order); // pass someone else's ID and their order comes back (IDOR)
}
// ✅ Filter by ownership ("who owns it" is a business rule = an invariant a human must specify)
export async function GET(req: Request, { params }: { params: { id: string } }) {
  const userId = await requireUserId(req); // authentication
  const order = await db.order.findFirst({
    where: { id: params.id, ownerId: userId }, // authorization: narrow by ownership
  });
  if (!order) return Response.json({ error: "not found" }, { status: 404 });
  return Response.json(order);
}
② Missing input validation (OWASP A05)
AI tends to write the "happy path" and may skip validation at the boundary. Always validate external input at the trust boundary.
import { z } from "zod";

// ✅ Validate with Zod at the system boundary (values coming from outside), narrowing the type before use
const Body = z.object({ email: z.string().email(), age: z.number().int().min(0).max(150) });
// Inside the route handler:
const parsed = Body.safeParse(await request.json());
if (!parsed.success) return Response.json({ error: "invalid" }, { status: 400 });
③ Hardcoded secrets (OWASP A02/A03)
AI may write API keys directly into code as a sample. Read server-only secrets exclusively through process.env, and use types to keep them from being mixed with NEXT_PUBLIC_ variables. Block leaks mechanically with Gitleaks plus GitHub's Push Protection.
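As a rough illustration of keeping secrets behind a single accessor, here is a minimal sketch. It uses a runtime check rather than a type-level guard, and it takes the environment as a parameter so it runs anywhere; the function name `requireServerSecret` and the variable name `PAYMENT_API_KEY` are hypothetical.

```typescript
// A plain record standing in for process.env so the sketch is self-contained.
type Env = Record<string, string | undefined>;

// Read a server-only secret. Names with the NEXT_PUBLIC_ prefix are refused,
// because Next.js inlines those variables into the browser bundle.
function requireServerSecret(env: Env, name: string): string {
  if (name.startsWith("NEXT_PUBLIC_")) {
    throw new Error(`${name} is exposed to the browser and must not hold a secret`);
  }
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required secret: ${name}`);
  }
  return value;
}

// Usage: in a real app you would pass process.env instead of this literal.
const exampleEnv: Env = { PAYMENT_API_KEY: "dummy-value-for-illustration" };
const paymentKey = requireServerSecret(exampleEnv, "PAYMENT_API_KEY");
console.log(paymentKey.length > 0);
```

Failing loudly at startup when a secret is missing also removes the temptation to paste a fallback key into the code.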
④ Old / non-existent dependencies (OWASP A03)
AI suggests versions that were current when its training data was collected, or the hallucinated packages from the previous section. Crush them with npm audit, Dependabot, and existence verification.
⑤ Insecure defaults (OWASP A02)
Opening CORS to *, forgetting HttpOnly/Secure on cookies, returning a stack trace on error: AI prioritizes a "minimal working configuration," so it tends to drop the safe defaults. Harden security headers/CSP and CORS in one place via middleware.
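The three defaults named above can each be expressed as a small helper. This is a minimal sketch with hypothetical function names, not a complete middleware; a real application would wire these into its framework's request pipeline.

```typescript
// Build a Set-Cookie value with the safe attributes spelled out explicitly.
function buildSessionCookie(name: string, value: string): string {
  return `${name}=${encodeURIComponent(value)}; Path=/; HttpOnly; Secure; SameSite=Lax`;
}

// Return CORS headers only for origins on an explicit allowlist (never "*").
function buildCorsHeaders(
  origin: string | null,
  allowedOrigins: string[]
): Record<string, string> {
  if (origin === null || !allowedOrigins.includes(origin)) {
    return {}; // no CORS headers: the browser blocks the cross-origin read
  }
  return { "Access-Control-Allow-Origin": origin, Vary: "Origin" };
}

// Return a generic error body; the details belong in server logs only.
function toPublicError(err: unknown): { error: string } {
  console.error(err); // log server-side; never send the stack trace to the client
  return { error: "internal error" };
}

console.log(buildSessionCookie("sid", "abc123"));
console.log(buildCorsHeaders("https://evil.example", ["https://app.example"]));
```

Centralizing these helpers means a reviewer has one place to check, instead of auditing every handler the AI generated.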
4. Pre-release checklist — diagnose AI-generated code in four layers
For AI-generated code, the baseline is to always run it through four layers of automated checks before merging. The top 5 above are almost entirely caught by these four layers (except authorization = ①; ① is in Section 6).
| Layer | The AI weakness it targets | Command | Corresponding OWASP |
|---|---|---|---|
| SCA | Hallucinated/old dependencies (slopsquatting) | npm ci + npm audit + OSV-Scanner | A03 |
| Secrets | Hardcoded keys | gitleaks + Push Protection | A02 |
| SAST | Structural flaws (injection, config) | semgrep scan --config=auto | A05 and others |
| DAST | Runtime behavior (reflected XSS, headers) | ZAP zap-baseline.py | A01/A05/A02 |
If you gate these four layers as PR checks in GitHub Actions, no matter how fast AI generates code, vulnerabilities are automatically blocked before merge. The concrete CI workflow (down to SARIF aggregation) is in the CI-integration procedure article, and the implementation details of the four layers are in the hands-on article on the OWASP official methodology. "Writing fast with AI" and "safely stopping it at a gate" hold only as a set.
5. Designing the human verification gate — "AI implements, humans hold spec and verification"
Tool gates are necessary but not sufficient. The key to turning generation speed into production quality is not letting go of the two things humans should hold.
- Humans hold spec decisions. "Who owns what, and which operations they're allowed" and "what are the invariants for amounts, quantities, and state transitions" — these are business rules AI must not be allowed to decide. Delegate to AI "an implementation that satisfies this spec."
- Drop acceptance criteria into tests. If you write "another person's order can't be seen with another person's ID" as a test, then even if AI omits authorization, the test fails and detects it. Preparing the verification path first is the biggest quality lever of the AI era.
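// ✅ Pin the authorization invariant as a test (it fails even if AI omits the check)
it("cannot fetch another user's resource (IDOR prevention)", async () => {
  const res = await GET(reqAs(userA), { params: { id: orderOwnedByUserB.id } });
  expect(res.status).toBe(404); // a 200 with the body means an authorization bug
});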
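The same idea extends to the state-transition invariants mentioned above. The following is a minimal sketch in which a human writes the allowed transitions down as data, so any implementation (AI-written or not) can be checked against it. The order statuses and the rule table are hypothetical examples, not the rules of any real system.

```typescript
type OrderStatus = "pending" | "paid" | "shipped" | "cancelled";

// Hypothetical business rule, decided by a human: which moves are legal.
const allowedTransitions: Record<OrderStatus, OrderStatus[]> = {
  pending: ["paid", "cancelled"],
  paid: ["shipped", "cancelled"],
  shipped: [],
  cancelled: [],
};

// True only if the rule table explicitly permits the move.
function canTransition(from: OrderStatus, to: OrderStatus): boolean {
  return allowedTransitions[from].includes(to);
}

// Apply a transition, refusing anything the rule table does not list.
function transition(from: OrderStatus, to: OrderStatus): OrderStatus {
  if (!canTransition(from, to)) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}

console.log(canTransition("pending", "paid")); // allowed by the table
console.log(canTransition("shipped", "paid")); // not listed, so refused
```

Because the table is the spec, a test suite can iterate over every pair of statuses and assert that the implementation agrees with it.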
This is exactly the workflow I consistently use. In the four phases of explore → plan → implement → verify, let AI produce speed while humans confirm spec and safety at each gate. The full picture of quality gates for AI-driven development is covered in the AI-driven-development quality-gates article.
6. The safety AI cannot create in principle — authorization and business logic are the human domain
Finally, the most important point. No matter how excellent the AI or the diagnostic tool, there are kinds of safety they cannot guarantee in principle: authorization (A01), tenant separation, and business logic (OWASP WSTG 4.5 / 4.10).
The reason comes down to the "lack of context" touched on in Section 1. "Who may see this invoice" and "how many times this discount can be used" are the 'meaning' of your business rules, and neither AI nor a scanner knows that meaning. So it can't judge a missing authorization as "missing." This is in contrast to SQL injection being "the same structural flaw in any app."
| What AI / tools can protect (horizontal) | What only humans can protect (vertical) |
|---|---|
| Injection, misconfiguration, known CVEs, secret leakage | Authorization/IDOR, tenant separation |
| Structural / known-pattern flaws | Business logic (abuse of quantity, price, state transitions) |
Beyond this point, the work belongs not to tool gates but to manual authorization review and audit. In particular, "right before release, just after generating a lot of code with AI" is when gaps in authorization and business logic most easily slip in, which makes it the best moment to bring in an audit. For detecting vertical risks and multi-layered defense, see the IDOR / broken-authorization detection article, and for the scope and rough cost of audits, see what does a security audit look at.
Summary — turn AI's speed into production quality with verification
- AI mass-produces vulnerabilities (Stanford: overconfidence; NYU: about 40% vulnerable). But the speed can be offset with verification.
- The AI-era new risk slopsquatting (hallucinated packages = A03). Prevent it with lockfile pinning, existence verification, and dependency scanning.
- Fix the top 5 vulnerabilities AI mass-produces (missing authorization, missing input validation, hardcoded secrets, old/non-existent dependencies, insecure defaults) with before/after patterns.
- Diagnose in four layers before release (SCA → secrets → SAST → DAST). Gate as PR checks with free tools.
- Humans hold spec decisions and verification. Don't hand authorization and business logic to AI. The safety AI can't create = the domain of audit.
"Writing fast with AI" and "shipping safely" are not in conflict. Offset the debt that accumulates behind the speed with a pre-release verification gate, every time. With that design alone, generative AI becomes the strongest weapon. It is exactly the method by which I keep building production-quality products working alone with generative AI. When you need pre-release diagnosis of an AI-generated app, or an audit of authorization and business logic, start by visualizing the current state with the free OSS Aegis.