メインコンテンツへスキップ
友田 陽大

We scanned 1,000 public Supabase apps: 9.2% have RLS that authenticates but doesn’t authorize

友田 陽大

Let me lead with the conclusion. We statically scanned 1,000 public Supabase apps on GitHub — 116,662 Row Level Security (RLS) policies. Of the 994 that ship RLS, 9.2% had a policy that authenticates the caller but does not authorize per row — a policy where any logged-in user can read every row. You can have RLS enabled and still have it not be authorization. (This replicates an earlier 450-app run that found 8.1% — the finding held at 2× scale.)

But the 9.2% isn't really the point of this article. The point is how we made that number trustworthy. Two ways. First, we manually ground-truthed all 235 findings against the real SQL: precision 1.0, zero residual false positives — a naive detector over-reports this class by ~2×, and most of that gap is false positives. Second, the run policed our own scanner: it caught a sibling rule firing on policies an anonymous user can never satisfy, and we drove that to zero. This article covers (1) what we measured and how, (2) the result and the core design mistake, (3) how we keep false positives at zero — including the ones this very run surfaced, and (4) how to measure your own app right now.

This is not "Supabase is dangerous" or "don't use AI." Shipping fast is fine. The question is whether the RLS you shipped fast is actually authorization. Aegis is the free, open-source toolkit behind this study — see the Aegis landing page, and for the application-security picture as a whole, the coverage matrix.


1. What we measured, and how

Before the number, the method. An "N% of apps are vulnerable" claim with no method behind it should not be trusted.

  • Population: via GitHub code search, we found 2,234 public repositories with a `CREATE POLICY` under `supabase/migrations` or `supabase/schemas` (the two places Supabase treats SQL as the source of truth), then scanned 1,000 of them on an independent seeded sample.
  • What we analyzed: the migration SQL (the authoritative schema) only. Static analysis exclusively — no deployed app was ever contacted (see ethics below).
  • The metric: a policy that has RLS but doesn't scope rows to the caller — specifically, a PERMISSIVE policy whose predicate only proves a session exists (`auth.role() = 'authenticated'`, `auth.uid() is not null`) with no owner binding like `auth.uid() = user_id`, on a table that has an ownership column.
MetricValue
Unique repositories found (code search)2,234
Repositories scanned1,000
Successfully analyzed998
Ship RLS (`CREATE POLICY` ≥ 1)994
RLS policies analyzed116,662

Every number in this article is pinned to a committed, reproducible artifact: the study methodology, the seven precision-hardening iterations, and the scripts are all public. You can re-run them yourself.

Ethics: public source, static only

The thing you must not do in a study like this is hit other people's production endpoints anonymously to confirm "data is reachable." That crosses into unauthorized-access territory. This study only read the SQL of public repositories — no running app was touched, and no repository is named (aggregate only).

How the 1,000 were chosen, and the biases that creates

Two honest caveats about the sample. First, the 1,000 are a seeded random draw from the 2,234-repo pool (independent of the earlier 450-app run, which used a by-name ordering) — reproducible, and independent of how anyone designs RLS. Second, discovery is keyword-based: the pool over-represents repos whose SQL contains common RLS idioms. Some of those keywords (`auth.uid()`, `to authenticated`) also appear in perfectly correct, owner-scoped policies — which pushes the rate down, not up. The denominator is never the search query; it is a ground-truth grep for `CREATE POLICY` on the cloned SQL.

This is a lower bound, not an upper bound

Developers who commit Supabase migrations to public GitHub skew careful. The most at-risk apps often never commit migrations or aren't public — so the true population rate is plausibly worse, not better.


2. The headline: 9.2% authenticate but don't authorize

Of the 994 repos that ship RLS, 91 (9.2%) had at least one policy that doesn't scope rows to the owner (235 such policies, ~0.2% of all 116,662 RLS policies). And — the most important fact about this number — all 235 were ground-truthed against the real SQL (zero residual false positives). The earlier 450-app run found 8.1%; the larger, independent sample lands in the same place.

The core mistake: `auth.role() = 'authenticated'` is not authorization

The most common pattern was this:

```sql -- Dangerous: RLS is ON, but it only checks "is the caller logged in?" create policy "notes_read" on public.notes for select to authenticated using (auth.role() = 'authenticated'); ```

`auth.role() = 'authenticated'` is true for any logged-in caller. Every authenticated user can read every row of `notes`. `auth.uid() is not null` (proving a session exists) is the same hole. RLS is "enabled," but authorization isn't happening.

This isn't hypothetical. CVE-2025-48757 (CVSS 9.3 CRITICAL, CWE-863 Incorrect Authorization) is exactly this class taken to its conclusion: an AI-builder platform's insufficient RLS let an unauthenticated attacker read and write arbitrary tables. The 9.2% is the step before that — authentication required, but rows not scoped to their owner.

The correct shape:

```sql -- Safe: bind the row to the caller. create policy "notes_read" on public.notes for select to authenticated using (auth.uid() = user_id); ```

(If the write side — `WITH CHECK` — only checks "is logged in," you get a different hole: any user can create rows they don't own or rewrite `user_id` to someone else's. That's an IDOR write.)

The honest caveat: "not owner-scoped" ≠ "vulnerable"

Don't trust an article that blurs this. There are tables where "every logged-in user can read it" is correct — public categories, shared lookups, collaboratively-edited docs. For those, an authenticated-only policy is a legitimate design.

So this detection is a medium-severity, non-blocking "verify this is intended" — not a "you're vulnerable." 9.2% does not mean "9.2% are exploitable"; it means "9.2% have a policy whose intent needs confirming." Refusing to fear-monger is what turns a number into trust.


3. Why you can trust the number: naive scanners over-report by ~2×

Here's the most valuable part. When we first built this detector, the original 450-app study let us measure exactly how much a naive version over-reports. Run those 450 repos through it raw and you get 19.3% (86 repos, 573 findings) — more than double the real rate. The difference isn't capability; it's false positives. Of the 573 raw findings, only 99 were genuine — about 83% were false positives.

Stage (original 450-app hardening)Repos flaggedFindingsNotes
Naive detection (raw)86 / 445 = 19.3%573~83% of these are false positives
After eliminating FP classes36 / 445 = 8.1%99Every one ground-truthed (0 residual FP)

That hardened classifier is what we then ran across 1,000 repos / 116,662 policies — and the 235 findings it produced were, on manual re-clone, all genuine (precision 1.0 at 2× scale). The precision wasn't a benchmark artifact; it held on data the author had never seen.

A scanner that claims "precision 1.0 on our benchmark" doesn't guarantee real-world precision. Real Supabase repos are full of shapes that look like the hole but aren't. The main false-positive classes:

False-positive shape (real example)Why it's a false positiveShare of 573 raw findings
`auth.role() = 'service_role'`The privileged backend role (it bypasses RLS); a regular user can never satisfy it. Restrictive, not a hole.46.8%
`"auth"."role"() = 'authenticated'`The quoted-identifier form `pg_dump` / declarative `supabase/schemas` emit. A detector built for bare `auth.role()` misreads the whole file.many
`auth.uid()::text = user_id::text`An owner binding with a cast on both operands. Correctly scoped, but misjudged as "the two sides don't match."many
`(select auth.uid() as uid) = user_id`The Supabase-recommended performance wrapper (with the `as uid` alias the CLI generates). Flagging the recommended pattern.many
`auth.uid() in (sender_id, receiver_id)`A participant binding (chat sender/receiver). An anon (null uid) is in no such list — owner-bound.some
`current_setting('role') = 'service_role'` / `auth.jwt() ? 'service_role'`Role checks via non-`auth.*` / the jsonb `?` operator. Again, not anon-satisfiable.some

The biggest chunk — `service_role`, 46.8% of raw findings — is a detector reacting to the string `auth.role()` without distinguishing `'service_role'` (backend-only) from `'authenticated'` (the real hole). And `(select auth.uid() as uid) = user_id` is literally the shape Supabase's own RLS performance docs recommend. A tool that reports the recommended pattern as a hole loses trust in one shot.

We identified each class against real data and rebuilt the classifier until false positives hit zero. The core move: instead of "mentions an `auth.*` token → flag it," we defined the hole positively — the hole is only `auth.uid()/auth.jwt() IS NOT NULL` or `auth.role() = 'authenticated'`. Everything else that mentions auth/role/claim (`service_role`, JWT claims, quoted identifiers, `current_setting`) is suppressed, fail-secure, because an anonymous caller can never satisfy it.

This is the one thing to look for when you pick a scanner — or an auditor. Not a tool (or person) that says "we find everything," but one that is honest about its false positives and can eliminate them against real data. A scanner that emits false-positive noise gets ignored by the team in three weeks.

This is why Aegis enforces `precision = 1.0` as a CI hard floor over a labeled real-world corpus: the claim "we measured our own precision" ships as a reproducible artifact, not a slogan.


4. The 1,000-repo run caught our own scanner's false positives

Here is the part most write-ups never show. The 1,000-repo run wasn't just a measurement — it was a stress test of the scanner itself, and it caught a sibling rule misbehaving.

`rls/anon-writable` flags a write policy that an anonymous visitor could satisfy (the closest thing to real-world harm). On the 1,000 repos it produced 80 findings — and on manual triage, about 68% (54 of 80) were false positives. The classifier was treating two shapes as "an anon could satisfy this" when an anon never could:

False-positive shapeWhy an anon can't satisfy it
Custom `auth.*()` helpers — `auth.is_admin()`, `auth.email()`, `auth.user_role()`, `auth.org_id()`A project-defined auth-schema function gated on a session/role/identity. The classifier only knew the three built-ins (`auth.uid/jwt/role`) and read the rest as a plain row condition.
`auth.role() IN ('service_role', 'supabase_admin')`The list form of a role gate. The classifier recognized `auth.role() = '…'` but not `IN (…)`, so it fell through.

The fix is the same fail-secure principle the owner-scope rule is built on: a predicate we can't verify is suppressed, never assumed exploitable. Unrecognized `auth.*()` helpers are now treated as unverifiable delegations; `auth.role() IN (…)` is recognized as a role gate. That cut `anon-writable` from 80 → 25 (the 25 are genuine anon-satisfiable predicates) with zero recall loss on the flagship rule (its 235 findings were unchanged on a same-sample re-scan). Both shapes are now locked as regression fixtures, so the CI `precision = 1.0` gate can never silently lose them.

This is what "we measure our own precision" actually means. Not a one-time benchmark — a corpus large enough that real code keeps surfacing shapes our fixtures didn't have, each one audited back to zero false positives before the number ships. A scanner that can't do this on real data quietly trains your team to ignore it.


5. Other signals (not audited to the same precision)

The same 1,000 repos produced other signals. These weren't audited for false positives as deeply as the owner-scope number, so treat them as indicative:

  • A table with RLS disabled entirely: 27.4% (273 / 998). A missing `ENABLE ROW LEVEL SECURITY`. A relatively hard signal, but still wants sampling.
  • Anon-writable exposure (an anonymous visitor can modify existing rows): a few percent after the hardening above. The closest to real-world harm — worth auditing first.

6. Measure your own app in seconds

If this made you wonder "are we one of the 9.2%?", you can check with no install and no config — the same detection this study ran, against your repo:

```bash

At your project root. Statically analyzes supabase/migrations and your TypeScript.

npx @aegiskit/cli scan ```

Wire it into CI to fail the build on high-confidence findings and upload SARIF to GitHub code scanning:

```yaml

.github/workflows/security.yml

name: Security on: [push, pull_request] jobs: aegis: runs-on: ubuntu-latest permissions: contents: read security-events: write # upload SARIF to the Security tab steps: - uses: actions/checkout@v4 - uses: tomodahinata/aegis@main with: severity: HIGH - uses: github/codeql-action/upload-sarif@v3 if: always() with: sarif_file: aegis.sarif ```


7. What a tool can and cannot measure — honest scope

This matters as much as the data. No static analysis can prove your authorization is correct (it's undecidable in general). A tool that claims to "find everything" produces the worst outcome of all — false confidence.

What a tool can measure stops roughly here:

  • Measurable (high confidence): the "tainted input → dangerous sink" dataflow of injection classes (SQLi/SSRF/XSS); whether an RLS predicate scopes to the owner; horizontal controls (headers/CSP, rate limiting, CSRF, a typed env boundary).
  • Not measurable (design & audit territory): whether an authorization decision is intended ("is this share correct?"); exhaustive tenant-crossing; flows that span multiple modules; business-logic abuse.

So the 9.2% is a map of where to confirm intent — not the fix. The fix — deciding "should every user really see this table?" — can only be done by a human who knows your data model. That's the line between "detect/warn, automated" and "fix by design, human."


8. FAQ

Is Supabase safe if I enable RLS?

No — "enabling" RLS and "authorizing" are different. In this study, 9.2% of the 994 RLS-shipping repos had a policy that only checked "is the caller logged in?" (e.g. `auth.role() = 'authenticated'`) without scoping rows to the owner. RLS is the frame; the owner-binding predicate is the authorization.

What's wrong with `auth.role() = 'authenticated'`?

It only proves the caller is logged in, which means every logged-in user can read every row. To scope by owner, bind the auth identity to the owner column: `using (auth.uid() = user_id)`. (For deliberately shared tables, authenticated-only is fine.)

Does the 9.2% mean "9.2% of apps are vulnerable"?

No. It's "9.2% have a policy that doesn't scope to the owner." How many are real vulnerabilities depends on whether the table is meant to be shared — a human design decision. That's why the finding is a non-blocking "verify this is intended."

Can I check my own app for free?

Yes. Run `npx @aegiskit/cli scan` at your project root — the same detection this study used. No install, no config, and it never touches a deployed app (static only).

When do I need a paid audit?

When you want the things a tool can't reach confirmed — the business correctness of authorization, exhaustive tenant isolation, multi-hop authorization flows — especially before an enterprise deal, a fundraising due diligence, or a production launch. See the security audit for scope and pricing.


Takeaway: trust the method, not just the number

9.2% of public Supabase apps with RLS (across 1,000 apps and 116,662 policies) have a policy that authenticates but doesn't scope to the owner. But the real claim of this article is that making that number trustworthy means eliminating, against real data, the false positives a naive detector emits — and continuing to, as a larger corpus surfaces shapes your fixtures never had. `service_role`, quoted identifiers, casts, the performance wrapper, custom `auth.*()` helpers — until you can tell "looks like a hole" from "is a hole," the number is meaningless.

Measure your own app with `npx @aegiskit/cli scan`. And remember: a tool draws the map of "confirm this" — it can't say "this is correct." When you want a human to verify your RLS is genuinely authorization and your tenant isolation actually holds, start from the Aegis overview and the security audit. The ability to build fast and the ability to build securely are two sides of the same coin.