Let me state the conclusion first. Security measures fall into "horizontal controls" that libraries and tools can crush automatically, and "vertical risks" that only a human audit can close. Whether you can distinguish these two is what separates shipping safely on a limited budget from not.
- Horizontal controls = security headers/CSP, rate limiting, CSRF, input validation, known vulnerable dependencies, leaked secrets. Because they apply uniformly across apps, free tools can crush them mechanically.
- Vertical risks = authorization (IDOR/BOLA), the validity of Supabase RLS design, tenant isolation, business logic (abuse of quantity, price, state transitions). Because they depend on your business-specific rules of "who owns what and which operations they are allowed," a machine cannot judge their correctness.
The most cost-effective order is this. First sweep away horizontal controls with free tools, and send only the vertical risks that a machine cannot judge to an audit. Do everything by hand and the cost balloons; leave everything to tools and the most serious risk (authorization) slips right through. This article explains that boundary line — what a "security audit" actually looks at, how far automation gets you and where it falls short, when an audit is needed, and how it proceeds and what it costs — as honestly as I can.
Let me make the most important point up front. An audit does not promise "complete safety." If anyone or any product promises that, distrust them instead. The complacency of "we installed it / we got it audited, so we're fine" is exactly what produces the worst security outcomes. What an audit can do is systematically crush known critical risks, leave behind mechanisms that prevent regression, and honestly state the remaining risks — that far, and no further.
1. Security measures split into "horizontal" and "vertical"
Why does this dichotomy work? Because the optimal tool changes depending on "whether a library can correctly own the countermeasure."
| | Horizontal controls (can be automated) | Vertical risks (require an audit) |
|---|---|---|
| Examples | Headers/CSP, rate limiting, CSRF, input validation, vulnerable dependencies, leaked secrets | Authorization (IDOR/BOLA), RLS design, tenant isolation, business logic, privilege escalation |
| Nature | Apply uniformly across apps | Depend on the app-specific "who owns what" |
| Correctness criterion | Can be judged by the "shape" of config/implementation | Cannot be judged without understanding the "meaning" of business rules |
| Optimal tool | Free tools / libraries | Human design review and verification |
| What tools can do | Through installation and fixes | Through detection and warning (the fix is a design decision) |
What I want to emphasize here is that no product exists that "protects authorization just by installing it." Horizontal controls can be offloaded to libraries, but a tool that does not know your data model cannot close vertical risks. That is precisely why you should automate what can be automated as thoroughly as possible to lower costs, and have humans focus on the judgments machines cannot make — this is the cheapest and safest allocation.
Of these vertical risks, the most frequent and most serious is authorization. Object-level authorization flaws (IDOR / BOLA) have held the #1 spot in the OWASP API Security Top 10 since its first edition in 2019, and remain there in the 2023 edition (OWASP API Security Top 10 (2023) / API1:2023 BOLA). It is not a "rare, sophisticated attack" but the "most ordinary leak." The full picture is summarized in Comprehensive Security Guide for Next.js × Supabase Apps, but this article narrows in on "what an audit is."
How do "audit," "vulnerability assessment," and "penetration test" differ?
The phrase "security audit" tends to be used loosely. Before you place an order, grasping the difference between adjacent terms lets you judge the validity of quotes and proposals.
| Term | Main question | Method | Best for |
|---|---|---|---|
| Vulnerability Assessment (VA) | Are known vulnerabilities left? | Tool-centric scanning (broad, automated) | Periodically and comprehensively sweeping out known holes |
| Penetration test | Can an attacker actually break in? | Manual attacks from the attacker's perspective (deep, narrow, proof-oriented) | Confirming the success/failure of specific attack scenarios |
| Security audit (this article) | Are the design and implementation done "correctly"? | Automated detection + design review + proof + remediation design | Assuring the validity of design such as authorization, RLS, tenant isolation |
| Code review | Are there flaws in this change? | Human reading (scope = diff/PR) | Routinely assuring quality per change |
The "audit" in this article incorporates the strengths of both vulnerability assessment (automated coverage) and penetration testing (proof), while placing its center of gravity on design review. That is because what most often causes leaks in Next.js × Supabase is not an "unknown sophisticated attack" but "vertical risks decided at the design stage, such as authorization and RLS design." Scan results alone cannot judge the validity of this design.
2. The range automated tools are "good at" (SAST / DAST / SCA / secret scanning)
First, the range you should crush before calling in a human audit. This can be sufficiently closed for free and automatically. You should not pay money for it.
| Tool category | What it looks at | Representative things it can close |
|---|---|---|
| SAST (static analysis) | Data flow of source code | SQLi, SSRF, path traversal, open redirect, DOM XSS |
| DAST (dynamic analysis) | Real requests to a running app | Reflected XSS, open redirect, SQLi (boolean/error inference), SSRF |
| SCA (dependency scanning) | package.json / lockfile | npm dependencies with known vulnerabilities (CVE), license issues |
| Secret scanning | Repository, diffs | Hardcoded keys, tokens, connection strings |
| Runtime hardening | Middleware/config | Security headers/CSP, rate limiting, CSRF/Origin verification, typed env boundary |
These can be judged without knowing the "app-specific meaning." SQL injection is the same structural flaw — "unsanitized input is concatenated into SQL" — in any app. That is why a library or scanner can correctly own it.
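To make the "same structural flaw in any app" point concrete, here is a minimal sketch. The function names and the users table are hypothetical, and the `$1` placeholder style is assumed from node-postgres-like drivers; it illustrates the shape of the flaw, not what any particular scanner checks.

```typescript
// Vulnerable shape: user input is concatenated straight into the SQL text.
function buildUnsafeQuery(nameInput: string): string {
  return "select * from users where name = '" + nameInput + "'";
}

// Safer shape: the SQL text is constant and the input travels separately as a parameter.
type ParamQuery = { text: string; values: string[] };

function buildParamQuery(nameInput: string): ParamQuery {
  return { text: "select * from users where name = $1", values: [nameInput] };
}

const injectionPayload = "x' or '1'='1";

// The payload rewrites the structure of the query in the unsafe version.
console.log(buildUnsafeQuery(injectionPayload));
// → select * from users where name = 'x' or '1'='1'

// In the parameterized version the text never changes, whatever the input is.
console.log(buildParamQuery(injectionPayload).text);
// → select * from users where name = $1
```

Nothing in this check depends on what the users table means to the business, which is why it can be delegated to a tool.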
I make these horizontal controls free with my own OSS Aegis (MIT license). You can run SAST, DAST, secret scanning, and runtime hardening specialized for Next.js × Supabase with no install required.
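# 1) Static analysis (SAST): trace the data flow of the current project
npx @aegiskit/cli scan
# 2) Runtime hardening: introduce headers/CSP, rate limiting, CSRF, and a typed env boundary
npx @aegiskit/cli init
# 3) Dynamic confirmation (DAST): send safe, non-destructive probes to your own app
npx @aegiskit/cli probe http://localhost:3000 --correlate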
Note (honest scope): SCA (dependency vulnerabilities) is sufficiently covered by ecosystem-standard tools like npm audit, GitHub's Dependabot, or Renovate. There is no need to bundle it into a paid audit. Treat horizontal controls as "the free tools' coverage area," and spend the audit time on the vertical risks of the following sections; that is the right call.
Caution: even if a tool's result is "clean," that is not "safe." What automated tools look at is the presence or absence of structural flaws, not whether your authorization or tenant isolation is correct. Even a perfect scan score can leave vertical risks completely untouched (we return to this point in Section 7). That is exactly why the correct way to position automation is as a "prerequisite before entering the audit," not a "goal."
With the automation up to this point, a considerable proportion of the "number" of vulnerabilities in the world disappears mechanically. However — what doesn't disappear remains. And what remains is precisely the class that causes the most leaks.
3. The range automated tools are "bad at / cannot do" = the range that requires an audit
This is the core of the article. Why can't a machine judge vertical risks? Because they depend on the "meaning of business rules."
Example 1: Authorization (IDOR) is a perfectly valid request
GET /api/invoices/1024 ← Your own invoice (legitimate access)
GET /api/invoices/1025 ← Just increment the ID by one. If this returns someone else's invoice, that's IDOR
The request /api/invoices/1025 is perfectly normal as HTTP. The auth header is correct, and it contains no malicious strings. From a WAF's or scanner's viewpoint, it is a "legitimate request from a legitimate user." It becomes an attack only because of the app-specific business fact that "invoice #1025 is not owned by this user" — and the tool does not know your data model. How IDOR arises and how to close it is detailed in IDOR / Broken Authorization Detection & Fix Guide.
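As a minimal sketch of why the second request can only be told apart with knowledge of ownership, the following models the invoices in memory. `InvoiceRow`, `findInvoiceUnsafe`, and `findInvoiceForOwner` are hypothetical names for illustration; a real app would query Supabase instead of an array.

```typescript
// Hypothetical in-memory model of the invoices table.
type InvoiceRow = { id: number; ownerId: string; amount: number };

const invoiceTable: InvoiceRow[] = [
  { id: 1024, ownerId: "user-a", amount: 12000 },
  { id: 1025, ownerId: "user-b", amount: 98000 },
];

// Vulnerable shape: looks up by ID only, so any authenticated user can read any row.
function findInvoiceUnsafe(id: number): InvoiceRow | undefined {
  return invoiceTable.find((row) => row.id === id);
}

// Safer shape: the lookup is narrowed by the caller's identity as well as the ID.
function findInvoiceForOwner(id: number, callerId: string): InvoiceRow | undefined {
  return invoiceTable.find((row) => row.id === id && row.ownerId === callerId);
}

// user-a increments the ID by one.
console.log(findInvoiceUnsafe(1025)?.ownerId);            // "user-b" (another user's invoice leaks)
console.log(findInvoiceForOwner(1025, "user-a"));          // undefined (rejected)
console.log(findInvoiceForOwner(1024, "user-a")?.amount);  // 12000 (legitimate access still works)
```

Both functions receive an identical, well-formed ID; the only difference is whether the query knows who owns the row.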
Example 2: The "validity" of RLS design is not decided by the presence of a policy
Supabase's service_role key runs with PostgreSQL's BYPASSRLS attribute and completely ignores RLS (Supabase: Row Level Security). In other words, "I put RLS on every table" alone does not mean it's safe. If a single service_role path forgets the ownership check, even perfect RLS can be leapt over entirely.
Tools can detect shape flaws like "RLS is not set," "there is no WITH CHECK," or "it is using (true)." But they cannot judge the design validity of "is this policy correct as this business's tenant boundary?" That can only be judged after understanding your organizational model (company → department → user, invitations, roles). For detecting RLS misconfigurations see RLS Misconfiguration Detection & Audit Guide, and for key handling see anon/service_role Key Exposure Guide.
Example 3: Business-logic flaws are "correct" as code
This is the area that is hardest for a machine. Look at the following code.
// Completely correct in both syntax and data flow. The types check too.
// But it is vulnerable under the business rules.
const total = unitPrice * quantity - couponDiscount;
From SAST's viewpoint, this code has no flaws. The types are correct, and no tainted input flows into a dangerous sink. However —
- What if you can send a negative number to quantity? total goes negative = "you get a refund the more you buy."
- What if you can apply couponDiscount multiple times? You can drive total to 0 or below.
- What if you can state-transition a "draft" order directly to "confirmed," skipping payment?
Only a human who knows the business rules — quantity >= 1, a coupon is one-time only, payment always before confirmation — can tell these are "abuse." To a machine, they are all valid numbers and valid state transitions. When this becomes a multi-tenant SaaS, it combines with tenant crossing ("mixing in another tenant's ID to operate"), and the damage spreads at once (Verifying Cross-Tenant Leaks in Multi-Tenant).
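The quantity and coupon rules above could be written down roughly as follows. This is a sketch under stated assumptions: `OrderInput`, `CouponCatalog`, and `computeTotal` are hypothetical names, "one-time only" is read here as "the same coupon cannot be stacked within one order," and flooring the total at zero is an added assumption.

```typescript
// Hypothetical order shape; field names are assumptions for illustration.
type OrderInput = { unitPrice: number; quantity: number; couponCodes: string[] };

// Hypothetical coupon catalog: code -> fixed discount amount.
type CouponCatalog = Record<string, number>;

function computeTotal(order: OrderInput, catalog: CouponCatalog): number {
  // Business rule: quantity is a whole number of at least 1.
  if (!Number.isInteger(order.quantity) || order.quantity < 1) {
    throw new Error("quantity must be an integer >= 1");
  }
  // Business rule (assumed reading of "one-time only"): no coupon appears twice.
  if (new Set(order.couponCodes).size !== order.couponCodes.length) {
    throw new Error("a coupon can be applied only once");
  }
  let discount = 0;
  for (const code of order.couponCodes) {
    const amount = catalog[code];
    if (amount === undefined) {
      throw new Error("unknown coupon: " + code);
    }
    discount += amount;
  }
  // Assumed rule: a discount never pushes the total below zero.
  return Math.max(0, order.unitPrice * order.quantity - discount);
}

const demoCatalog: CouponCatalog = { WELCOME: 300 };

console.log(
  computeTotal({ unitPrice: 500, quantity: 2, couponCodes: ["WELCOME"] }, demoCatalog)
); // 700
```

None of these checks can be derived from types or data flow; each line encodes a decision someone who knows the business had to make.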
State transitions in particular tend to be overlooked. The following API has perfect authentication and types, and even the ownership check is written correctly.
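// API that updates an order's status. Authenticated, type-safe, with an ownership check.
// But it does not validate the state-machine rules.
await supabase
  .from("orders")
  .update({ status: nextStatus }) // ← "paid" can be passed in directly
  .eq("id", orderId)
  .eq("user_id", user.id); // ownership is bound correctly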
Even though ownership is bound by user_id, the state-machine rule of "always go through payment between draft → paid" is missing. An attacker can send "paid" directly to status and confirm the order while skipping payment. Neither SAST nor RLS can judge this flaw as "abuse," because it is type-correct, an operation by the owner themselves, and an update to a permitted column. Only a human who understands the domain knows whether the business allows this transition.
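One way to make the missing rule explicit is a transition table that the handler consults before updating. The sketch below is illustrative only: the status names, and the assumption that "paid" is set solely by a trusted payment callback rather than by a client request, are hypothetical.

```typescript
// Hypothetical status set for an order.
type OrderStatus = "draft" | "pending_payment" | "paid" | "cancelled";

// Transitions a client request may trigger. "paid" is deliberately unreachable here;
// in this sketch only a trusted payment callback would set it.
const clientAllowedTransitions: Record<OrderStatus, OrderStatus[]> = {
  draft: ["pending_payment", "cancelled"],
  pending_payment: ["cancelled"],
  paid: [],
  cancelled: [],
};

function canClientTransition(current: OrderStatus, next: OrderStatus): boolean {
  return clientAllowedTransitions[current].includes(next);
}

console.log(canClientTransition("draft", "paid"));            // false (payment cannot be skipped)
console.log(canClientTransition("draft", "pending_payment")); // true
console.log(canClientTransition("paid", "draft"));            // false
```

A handler would read the order's current status, call `canClientTransition`, and only then run the update. Which transitions belong in the table is the domain decision a tool cannot make.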
And there is another reality. In code mass-produced by AI, such vertical risks are especially likely to be baked in. In Veracode's 2025 study, which ran 100+ LLMs through 80 tasks across 4 languages, 45% of AI-generated code contained known security flaws, and security scores stayed flat even as the models got smarter (Veracode 2025 GenAI Code Security Report). AI writes "code that works in a demo" the fastest, but "don't show other people's data" and "reject negative numbers" lie outside the happy path and never surface in a demo.
There are real critical incidents too. CVE-2025-48757, registered in 2025, is a case where insufficient RLS on AI-builder-made sites let an unauthenticated attacker read and write arbitrary tables; its classification is CWE-863 (Incorrect Authorization), CVSS base score 9.3 CRITICAL. This CVE eloquently shows that authorization is your area of responsibility, which the platform does not take over for you.
Where does free "detection" diverge from an audit's "fix"?
To put it all on one page, it comes to this. Free OSS goes through detection and warning; a paid audit actually closes it with design and implementation. Even for the same risk, what you do differs.
| Risk dimension | What free OSS (Aegis) does | What a paid audit does |
|---|---|---|
| Authorization / IDOR・BOLA | Detect and warn, via static analysis, on tainted input with no ownership scope | Redesign the authorization model and implement ownership checks to close it |
| Supabase RLS | Detect unset / missing WITH CHECK / USING(true) / service_role paths | Design and implement correct RLS policies and verify they don't cross boundaries |
| Tenant isolation | Correlate non-admin access to tables with weak RLS as a "confirmed exposure" | Redesign tenant boundaries and prove isolation is not broken |
| Business logic | Out of scope (a library cannot judge quantity/price/state transitions) | Surface abuse paths through human review that understands the domain |
| Horizontal controls (headers/rate limiting etc.) | Auto-introduce with a single middleware and detect deviations in CI | Verify the installation status and close gaps/misconfigurations (the proof behind automation) |
"Detection" and "fix" are different jobs. Get detection done for free, and send only the parts that need a fix's design decision to an audit — this separation minimizes cost.
4. The items an audit actually inspects (checklist)
So, what does a human audit look at concretely? The point is not to "look at everything vaguely" but to structure the viewpoints so that nothing is missed. A good audit maps findings to a recognized verification standard like OWASP ASVS, presenting them as "reproducible grounds" rather than "personal opinion" (OWASP ASVS).
4-1. Authorization model / IDOR・BOLA
| Item inspected | What to confirm | Danger sign |
|---|---|---|
| Ownership enforcement | Do IDs the client can operate (route params, body, query) reach the DB narrowed by ownership? | Returning with only .eq("id", id) (no user_id, no RLS) |
| Centralized authorization | Is authorization consolidated in one place — the DB (RLS) or the server — so adding a route won't forget it? | Hand-writing ownership checks per route |
| Server Actions | Are "use server" arguments and formData also treated as "client-operable"? | The assumption "it's safe because the ID isn't shown on screen" |
4-2. Supabase RLS policies
| Item inspected | What to confirm | Danger sign |
|---|---|---|
| Enablement | Is RLS enabled on all tables (fail-secure)? | A table without enable row level security |
| WITH CHECK on writes | Does INSERT/UPDATE verify the post-insert row satisfies the condition? | Only using, no with check |
| Unconditional permission | Is there a policy that is effectively no-guard, like using (true)? | using (true) / over-granting to anon |
| Function privilege escalation | Does a SECURITY DEFINER function fix its search_path? | A definer function without a fixed path (a breeding ground for privilege escalation) |
| Performance | Is auth.uid() wrapped as (select auth.uid())? | Re-evaluated per row on a large table = slow |
4-3. service_role paths
| Item inspected | What to confirm | Danger sign |
|---|---|---|
| Key location | Is the service_role key not leaked to the browser/client? | The key visible in front-end code or the network tab |
| Ownership enforcement | Does an API using service_role always add an ownership condition like user_id? | Just .eq("id", ...) while bypassing RLS |
| Path minimization | Is the service_role path limited only to where it's truly needed? | The admin client used everywhere, on the reasoning "just use it rather than agonize over it" |
4-4. Tenant boundaries (multi-tenant)
| Item inspected | What to confirm | Danger sign |
|---|---|---|
| Isolation unit | Does the tenant boundary (company/org ID) take effect consistently across all queries and all policies? | The isolation key differs from table to table |
| Crossing test | With tenant A's session, is hitting tenant B's ID rejected? | Another tenant's data returns 200 |
| Aggregation/joins | Does the tenant boundary leak in JOINs, aggregations, RPC, or Storage? | "The main tables are fine," not looking at the join targets |
4-5. Input boundary, business logic, recurrence prevention
| Item inspected | What to confirm | Danger sign |
|---|---|---|
| Input boundary | Are SQLi, SSRF, XSS, open redirect, etc. validated at the boundary? | No boundary validation (Zod etc.) |
| Business logic | Did you surface abuse paths for quantity/price/discount/state transitions, understanding the domain? | "There are only happy-path tests" |
| Regression prevention | Do RLS regression tests (pgTAP) and CI gates remain? | No grounds beyond "it worked on my machine" |
In an audit, these are built on the foundation of correlating Aegis's static analysis (SAST), RLS verification, and dynamic confirmation (DAST), and finally the real harm is confirmed by human review. The tool raises "suspicions," and the human judges "can this really be abused in light of the business rules" — this division of roles is the trick to looking deep without adding noise.
The "design validity" an audit steps into — the RLS example
Tools look at "is there RLS or not." An audit looks at "is that RLS correct as this business's tenant boundary." For example, the following policy passes from a tool's perspective — RLS is enabled, and there is both using and with check.
-- Looks correct at first glance, but it only binds at the level of the individual user.
create policy "members manage documents"
on documents for all
to authenticated
using ( (select auth.uid()) = user_id )
with check ( (select auth.uid()) = user_id );
However, if this is a B2B multi-tenant SaaS, it is flawed. The boundary should be not "the individual user" but "the organization (tenant) they belong to." As is, documents that should be shared by the team can only be touched by their creator, and authorization breaks down in the invite/leave flows for the organization. An audit redesigns this to the "org boundary."
-- After the audit: make the tenant (organization) the boundary and authorize through membership.
create policy "org members manage documents"
on documents for all
to authenticated
using (
  org_id in (select org_id from memberships where user_id = (select auth.uid()))
)
with check (
  org_id in (select org_id from memberships where user_id = (select auth.uid()))
);
Seeing through this "works correctly, but is wrong as a business rule" is what a tool cannot do and an audit can. The gap between detection (shape) and validity (meaning) shows up right here.
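The difference between the two policies can be modeled as plain predicates. This is a sketch that only mirrors the logic of the SQL above in memory; `DocRow`, `Membership`, and both function names are hypothetical, and it does not execute real RLS.

```typescript
type Membership = { userId: string; orgId: string };
type DocRow = { id: number; userId: string; orgId: string };

// Mirrors: auth.uid() = user_id
function userBoundPolicy(doc: DocRow, uid: string): boolean {
  return doc.userId === uid;
}

// Mirrors: org_id in (select org_id from memberships where user_id = auth.uid())
function orgBoundPolicy(doc: DocRow, uid: string, memberships: Membership[]): boolean {
  return memberships.some((m) => m.userId === uid && m.orgId === doc.orgId);
}

const sharedDoc: DocRow = { id: 1, userId: "alice", orgId: "acme" };
const teamMemberships: Membership[] = [
  { userId: "alice", orgId: "acme" },
  { userId: "bob", orgId: "acme" },
  { userId: "carol", orgId: "globex" },
];

console.log(userBoundPolicy(sharedDoc, "bob"));                   // false: a teammate is locked out
console.log(orgBoundPolicy(sharedDoc, "bob", teamMemberships));   // true
console.log(orgBoundPolicy(sharedDoc, "carol", teamMemberships)); // false: other tenant stays out

// Leave flow: alice's membership in acme is removed.
const afterAliceLeaves = teamMemberships.filter((m) => m.userId !== "alice");
console.log(userBoundPolicy(sharedDoc, "alice"));                  // true: still has access after leaving
console.log(orgBoundPolicy(sharedDoc, "alice", afterAliceLeaves)); // false
```

Both predicates are well-formed; only the second matches the rule "members of the organization, and only while they are members."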
How findings are written (sample)
An audit's deliverable is not an impression like "I think this is dangerous." It is a set of findings written as reproducible grounds, with at least the following elements, so the development team can start work on the spot.
[CRITICAL] authz/idor — Another tenant's invoice is retrievable
Location: app/api/invoices/[id]/route.ts:18
Path: params.id (source) → supabaseAdmin.from("invoices").eq("id", id) (sink)
Cause: The service_role path is missing the ownership condition (user_id / tenant_id)
Repro: With user A's session, GET /api/invoices/<B's invoice ID> → returns 200
Impact: Read access to every tenant's billing data (CWE-863 / OWASP API1:2023 BOLA)
Fix: Option A = move authorization into RLS / Option B = enforce .eq("tenant_id") on the service_role path
Verify: Turn "0 rows from other tenants" into a pgTAP regression test and add it to the CI gate
Only when severity (CRITICAL/HIGH/…), the source→sink trace, the steps actually reproduced, the impact scope, the remediation direction, and even the shape of the regression test are all present does a finding become a "fixable deliverable." Not "point it out and done," but "hand it over in a fixable form" — this is what separates audit quality.
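The "all elements present" requirement can be expressed as a small completeness check. The sketch below is an assumption-laden illustration: the `AuditFinding` shape simply follows the labels of the sample above, and `missingFields` is a hypothetical helper, not a format any tool defines.

```typescript
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

// Field names follow the labels in the sample finding.
type AuditFinding = {
  severity: Severity;
  title: string;
  location: string;
  path: string; // source → sink trace
  cause: string;
  repro: string;
  impact: string;
  fix: string;
  verify: string;
};

const requiredFields: (keyof AuditFinding)[] = [
  "severity", "title", "location", "path", "cause", "repro", "impact", "fix", "verify",
];

// Returns the names of elements that are absent or blank.
function missingFields(finding: Partial<AuditFinding>): string[] {
  return requiredFields.filter((field) => {
    const value = finding[field];
    return value === undefined || value.trim() === "";
  });
}

const draftFinding: Partial<AuditFinding> = {
  severity: "CRITICAL",
  title: "Another tenant's invoice is retrievable",
  location: "app/api/invoices/[id]/route.ts:18",
  cause: "The service_role path is missing the ownership condition",
  impact: "Read access to every tenant's billing data",
  fix: "Move authorization into RLS",
};

// Reports path, repro, and verify as missing.
console.log(missingFields(draftFinding));
```

A finding that fails this kind of check is still an observation, not yet something a developer can act on.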
5. When an audit becomes necessary
I won't say "everyone should get an audit right now." Thinking in terms of cost-effectiveness, an audit is needed in situations like these.
- Enterprise deals / RFP responses: In deals with large companies, security questionnaires and RFPs demand "third-party assessment results." Whether you can show your handling of service_role and tenant isolation in documents becomes a condition for winning the deal.
- Compliance requirements (SOC2 / ISO 27001, etc.): In the process of obtaining certification, application-level vulnerability assessment is included as a requirement.
- Funding due diligence (DD): In investors' technical DD, authorization, data isolation, and dependency health are questioned. A "leakable design" directly affects valuation.
- After an incident: Once you've leaked, there's a high chance the same class of flaw lurks elsewhere too. You need to re-install the whole recurrence-prevention mechanism.
- Right before releasing code mass-generated by AI: This is the most common case now. As in Section 3, AI-generated code easily bakes in vertical risks. It works as the last gate for shipping what you built fast without leaking it.
Conversely, if you're at a stage where you haven't even automated horizontal controls yet, you should run the free tools before an audit. Get the order wrong and you'll pay audit fees for what you could have crushed for free.
For the same reason, there are also cases where you don't need to rush into an audit. A verification prototype not exposed externally, an MVP that doesn't yet hold user data, a stage with no external requirements yet — what you first need here is free automation and the minimal vertical checks of Section 8. An audit delivers its greatest cost-effectiveness when "data worth protecting" and "a party that asks about it" appear.
Reference: what RFPs / security questionnaires often ask
These are examples of items actually asked in enterprise deals; an audit's findings directly become the grounds for the answers.
- Is data logically isolated per tenant (the mechanism and verification method)?
- How do you enforce authorization (object-level access control)?
- Whether you've conducted third-party vulnerability assessment/audit, and the latest results
- Vulnerability management of dependencies (SCA) and patch operations
- Secret management (environment variables, key rotation, leak prevention)
- Incident response procedures, and preservation of logs/audit trails
Whether you can answer these "in documents" is the dividing line for winning the deal. An audit is also the work of creating the evidence behind these answers.
6. The approach (3 stages) and pricing
An audit is not a 0/1 of "take it or not." It is a 3-rung ladder where you can choose the depth according to what you want to gain. Pricing is stated on the security audit page, but the gist is this.
| Stage | Price (excl. tax) | Duration | What you gain | Fix implementation |
|---|---|---|---|---|
| Spot assessment | From ¥98,000 | About 1 week | A prioritized findings report (with source→sink traces / relevant SQL). Accurately visualizes "how many holes you have now" | Not included |
| Standard audit (recommended) | From ¥280,000 | About 2–3 weeks | Design review of authorization model, RLS, service_role paths, tenant boundaries + threat model + fix design plan + report session | Through design |
| Hands-on fix implementation | From ¥680,000 | Weeks~ | Implementation of fixes (RLS redesign, ownership checks, tenant isolation) + permanent regression tests/CI gates + proving "it's fixed" with a re-audit | Included |
The use of each stage is clear.
- Spot assessment is a "checkup." For when you first want to know your current position accurately. It does not include fixes, so it's for teams with in-house hands.
- Standard audit is a "full-scale audit that steps into design." It's the standard form for safely shipping/operating a B2B, multi-tenant SaaS, and provides up to a design plan for the fixes.
- Hands-on fix implementation means "don't end at detection; keep going until it is actually closed and won't recur." With one person × generative AI (Claude Code), we move fast from design through implementation and testing.
All prices are an honest "from" guideline (excl. tax). They are neither fixed quotes nor caps. They vary with the scale of the target app (number of tables, number of endpoints, requirements), so we align scope in a free consultation first, then quote. If production data or confidentiality is involved, we accommodate an NDA. Dynamic confirmation (DAST) is performed against an environment you own, using only non-destructive probes that default to localhost and run within a fixed scope and request budget (no destructive operations are performed).
Note that the target is mainly Next.js (App Router) × Supabase, where we can step in the deepest, fastest, and most safely. The mindset (authorization, input validation, defense in depth) is general-purpose, but for other configurations, please let us confirm the scope in a free consultation.
The actual flow of an audit (5 steps)
For either spot or standard, the rough flow is shared.
- Free consultation & scope fixing: We confirm the target repository, endpoints, tenant model, and the Supabase features in use (RLS/RPC/Storage/Edge Functions), and document the quote and scope.
- Automated detection: We run Aegis's static analysis (SAST), RLS verification of supabase/migrations, and dynamic confirmation (DAST) to sweep out the holes a machine can pick up. This is within the free tools' range and becomes the "foundation" of the findings.
- Manual review (the lead of the standard audit): We read through the authorization model, RLS policies, service_role paths, tenant boundaries, and business logic with an understanding of the domain. We confirm the automated "suspicions" into "real harm" in light of the business rules.
- Reporting (with a report session): We submit a findings report with severity, repro steps, and remediation direction, and align Q&A and priorities online.
- Remediation & re-audit (hands-on plan): We implement fixes, permanently install regression tests and CI gates, and prove "it's fixed" with a re-audit.
Questions to identify a good audit
Finally, I'll write this even though it works against my own promotion. These are questions the buyer can ask to gauge the other party's quality.
- "How is the remaining risk reported?" Avoid anyone who immediately answers "you'll be completely safe." An honest audit always explicitly states what's out of scope and the residual risk.
- "What do you leave behind for recurrence prevention?" If the talk of regression tests or CI gates doesn't come up, it may be an audit that ends as a one-off.
- "How do you verify service_role paths and tenant crossing?" Whether they can describe concrete steps (ID swapping with 2 accounts / 2 tenants, etc.) reveals their understanding of vertical risk.
- "In what form are findings handed over?" The desirable form has source→sink traces, repro steps, and even remediation direction all present.
If the other party can answer these concretely, they truly understand vertical risk. Conversely, if they only say "leave it to the tools and you're fine," consider it highly likely that the class causing the most leaks slips right through.
7. The honest limits of an audit
This is the most important section on this page. I won't write platitudes.
An audit does not promise "completely safe" or "absolutely no leaks." Do not trust anyone who promises that. Security is not a one-time job but a continuous practice. What an audit can honestly provide is only these three:
- Systematically crush known critical risks. Following the checklist in Section 4, we sweep the most frequent and most serious classes (authorization, RLS, tenant isolation, business logic) without omission and close them by priority.
- Leave behind mechanisms that prevent regression. An audit's real value is not "fix it at that point in time" but leaving a structure that never opens the same hole again. We permanently install RLS regression tests (pgTAP) and a CI gate (only high-confidence detections stop the build, and SARIF is left in GitHub), putting a brake on new code written after the audit too.
- Explicitly state the remaining risks. We document the audit scope (target repository, endpoints, duration), and honestly write "we did not look at" what's out of scope. A "clean result" means "we didn't step on the common traps," not "safety was proven."
The audit scope is fixed in documents first. Not leaving what's included and what's not ambiguous is the first step of honesty.
| Included in the audit (examples) | Not included in the audit (examples) |
|---|---|
| The app's authorization, RLS, tenant isolation, input boundary, business logic | Physical/office security, human insider fraud |
| The design and implementation of Supabase (RLS/RPC/Storage) | Vulnerabilities in the Supabase platform itself (vendor's responsibility range) |
| Static/dynamic verification of the target repository and endpoints | Out-of-scope subsystems, the internals of external SaaS integration targets |
| The codebase at the time of the audit | Code added/changed after the audit (→ guaranteed by CI gates) |
What is especially easy to misunderstand is that an audit is a snapshot. Code written the day after the audit ends is out of the audit's scope. That is exactly why the second of the three items above, the mechanism that prevents regression (the CI gate), is the essence, and "get checked once and done" comes up short.
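The gate described above, where only high-confidence detections stop the build, might look roughly like this. It is a sketch with assumed names: the `Detection` shape, the confidence levels, and the rule IDs in the sample are hypothetical and do not reflect Aegis's actual output format.

```typescript
type Confidence = "high" | "medium" | "low";

// Hypothetical shape of one scanner detection.
type Detection = { ruleId: string; confidence: Confidence; file: string };

// Returns the exit code a CI step would use: 1 blocks the build, 0 lets it pass.
function gateExitCode(detections: Detection[]): number {
  const blocking = detections.filter((d) => d.confidence === "high");
  for (const d of blocking) {
    console.error("blocking: " + d.ruleId + " in " + d.file);
  }
  return blocking.length > 0 ? 1 : 0;
}

const sampleRun: Detection[] = [
  { ruleId: "authz/idor", confidence: "high", file: "app/api/invoices/[id]/route.ts" },
  { ruleId: "headers/csp", confidence: "low", file: "middleware.ts" },
];

console.log(gateExitCode(sampleRun));                                          // 1 (build stops)
console.log(gateExitCode(sampleRun.filter((d) => d.confidence !== "high")));   // 0 (build passes)
```

Blocking only on high-confidence detections is a trade-off: lower-confidence results stay visible in reports without training the team to ignore a noisy gate.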
And the limits of tools are the same. No static or dynamic tool can prove your authorization is correct. What a tool looks at is the "shape" of the implementation, not the "meaning" of the business rules. These three layers (SAST, RLS verification, DAST) and human review complement, not replace, threat modeling. Even so, the value of mechanically crushing the most frequent holes so humans can focus on the truly hard judgments is immeasurable.
8. First, what you can do yourself
Before calling in an audit — or before deciding whether to call one at all — do what you can for free first. It's the most cost-effective, and even if you do get an audit, the findings come out sharper.
- Sweep horizontal controls with free tools. Start with grasping the current state.
  npx @aegiskit/cli scan # SAST: visualize the current holes
  npx @aegiskit/cli init # introduce runtime hardening
  npx @aegiskit/cli probe http://localhost:3000 --correlate # corroborate with DAST
- Crush dependencies (SCA) with npm audit / Dependabot. Known vulnerable dependencies disappear here.
- Do the minimal vertical checks yourself. RLS on all tables, ownership checks on service_role paths, and one ID-swapping test with 2 accounts. These three points alone prevent the majority of CVE-2025-48757-class accidents.
What remains even after this far is the vertical risk a machine cannot judge — the validity of the authorization model, RLS design, tenant boundaries, business logic. If you have anxiety there, that's where an audit comes in. For details on the free tools see the Aegis (OSS) page, and for the audit's scope and pricing see the security audit page.
If you're torn between "raising people who can assess this vertical risk in-house" and "leaving it to an external expert," The Work, Salary, and Career of an Ethical Hacker organizes the decision axes of "raise vs. delegate." If you aim to become the assessor yourself, head to How to Become a White-Hat (Ethical) Hacker — The Complete Roadmap.
Frequently Asked Questions (FAQ)
Q. There's free OSS, so why is a paid audit necessary? A. Because it's a different job. OSS (Aegis) automates the horizontal controls a library can correctly own, and handles vertical risks a library can't fix only through detection and warning. Actually closing those vertical risks is design judgment and implementation, and that's the audit's role. Not confusing detection (free) and fix (audit) is the trick to avoiding wasted spending.
Q. If the automated tool's score is perfect, am I safe now? A. No. As in Section 7, a clean result means "you didn't step on the common traps," not "safety was proven." Authorization, tenant isolation, and business logic are areas a tool structurally cannot judge. A perfect score is a starting point, not a goal.
Q. Should I choose a spot assessment or a standard audit? A. If you have in-house engineers who can apply the fixes and you want an accurate picture of where you stand, choose the spot assessment (about 1 week, from ¥98,000). If you want to ship a B2B/multi-tenant product safely and want us to get involved in the design, choose the standard audit (about 2–3 weeks, from ¥280,000). If you're unsure, we'll make a proposal after looking at the target in a free consultation.
Q. Will I become "completely safe" if I get an audit? A. No. Don't trust anything that promises that. An audit systematically crushes known critical risks and leaves behind mechanisms (tests, CI gates) that prevent regression. But there's no "complete" or "absolute," and the remaining risks are explicitly stated. This is not timidity but the premise of handling security honestly.
Q. I'm anxious about handing over production data. A. We can work under an NDA. Dynamic confirmation (DAST) is performed against an environment you own, using only non-destructive probes that default to localhost and run within a fixed scope and a request budget; we perform no destructive operations. The design assumes we never receive secrets, and we keep the required permissions to a minimum.
Q. How often should I get an audit? A. There's no one-size-fits-all answer, but a guideline is "every major design change" and "a periodic one about once a year." However, more important than frequency itself is permanently installing a CI gate that prevents regression, keeping daily changes automatically watched. An audit is a deep inspection at milestones; CI is the daily watch — their roles differ.
Q. If I ask AI to "write it securely," will an audit become unnecessary? A. Don't expect too much. In Veracode's study, security scores stayed flat even as models got smarter. AI is strong at generating "code that works," but it does not guarantee a "structure that doesn't break." Use AI's speed, but the output becomes production quality only after it passes through verification gates (scan, test, review).
Q. Do solo or small-scale developers need an audit? A. First, be sure to do the three free points in Section 8. Often that's enough. An audit becomes necessary when you need third-party evidence or assurance that the design is sound: enterprise deals, compliance, funding due diligence, after an incident, before releasing AI-mass-generated code, and so on.
Summary: crush with automation, close with an audit, stop recurrence with CI
Let me organize the key points.
- Security measures split into "horizontal controls you can automate" and "vertical risks that require an audit." The most cost-effective order is to first crush the horizontal with free tools, and send only the vertical risks a machine can't judge to an audit.
- Automated tools (SAST/DAST/SCA/secret scanning) are good at injection, headers/CSP, rate limiting, vulnerable dependencies, leaked secrets. You can crush them mechanically with free OSS (npx @aegiskit/cli scan).
- What only an audit can close is authorization/IDOR, the validity of RLS design, tenant isolation, business logic. A machine can't judge them because they depend on the "meaning" of business rules.
- What an audit looks at is the authorization model, RLS policies, service_role paths, tenant boundaries, input boundaries, business logic, and recurrence prevention, worked through as a structured checklist mapped to a standard like ASVS.
- Honestly, an audit does not make you "completely safe." What it can do is systematically crush known critical risks, leave behind a CI gate that prevents regression, and explicitly state the remaining risks. Do not trust anyone who promises "absolute safety."
Building fast with AI is itself correct. The question is how to harden what you built fast so that nothing leaks, and I hope this article helps you judge where to draw that boundary between free automation and a human audit. First, visualize your current position with Aegis (free OSS), and if anxiety about vertical risk remains, consult us at Security Audit. A concrete example of safely operating a multi-tenant B2B SaaS is summarized in the Lumber-distribution DX case study.
References
- OWASP API Security Top 10 (2023)
- OWASP API1:2023 — Broken Object Level Authorization (BOLA)
- OWASP Application Security Verification Standard (ASVS)
- Veracode — 2025 GenAI Code Security Report (45% of AI-generated code has security flaws)
- Supabase Docs — Row Level Security (service_role bypasses RLS)
- NVD — CVE-2025-48757 (unauthenticated access via insufficient RLS, CWE-863, CVSS 9.3 CRITICAL)