Vercel

Fluid Compute

Python

TypeScript

Serverless

Architecture design

Observability

Run a backend on Vercel: operate Express, Hono, FastAPI, and NestJS in production with zero config

Vercel is not a frontend-only host; it is a full-compute platform. Following the official docs, this article explains how to run Express, Hono, NestJS (Node.js 24), and FastAPI (Python) with zero config. It covers, with real code: Node.js server detection via server.listen(), the fetch Web Handler, the api directory, the caveats of concurrency and global state on Fluid Compute, and running alongside the frontend (Services).

Published: June 28, 2026
Reading time: 6 min
Author: 友田陽大

Key takeaways

Vercel is not a frontend-only host but a full-compute platform. It runs backend frameworks such as Express, Hono, NestJS (Node.js), and FastAPI (Python) with zero config. Regular Node.js and Python code runs on top of Fluid Compute.
Vercel detects a Node.js server and turns it into a function when you place server.{js,ts} at the project root or in src/ and call server.listen(). The port passed to listen() is used locally; in production, requests are routed through an internal port.
An individual function can be written as a fetch Web Handler (export default { fetch }) or as per-HTTP-method exports (GET/POST). Express has a dedicated guide; Hono fits the standard fetch handler well.
The biggest caveat is shared global state on Fluid Compute. Because one instance processes multiple requests concurrently, request-specific data placed in module scope can leak between requests. Share only request-independent resources such as a DB connection pool.
To run a frontend (Next.js) and a Node.js server side by side in one project, use Services. Move heavy or long-running processing to Workflows or queues, and design around the 4.5MB body limit and the 300-second default timeout.

"Vercel is where the frontend goes, and the backend needs a separate server, right?" In 2026, that is a clear misconception. Vercel is a full-compute platform that runs backend frameworks such as Express, Hono, NestJS (Node.js), and FastAPI (Python) as-is, with nearly zero config.

This article summarizes how to operate a backend in production on Vercel, following the official specs of the Node.js runtime. For the full picture, see the Vercel production-operations guide; for function details, see the Functions & Fluid Compute guide.

The three deployment forms

A backend can run on Vercel in three forms.

Form	How to write	Best suited for
Node.js server detection	`server.listen()` in `server.ts`	An existing Express/Hono/Fastify app, deployed whole
fetch Web Handler	`export default { fetch }` in `api/*.ts`	Lightweight, Web-standard, per-function code
Per-HTTP-method export	`export function GET/POST` in `api/*.ts`	One file per REST endpoint
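
The third form in the table has no dedicated section below, so here is a minimal sketch of it. The file name `api/hello.ts`, the `name` query parameter, and the `json` helper are illustrative assumptions, not taken from the official docs; only the idea of exporting one function per HTTP method that takes a Web-standard `Request` and returns a `Response` comes from the table above.

```typescript
// api/hello.ts (hypothetical file name): one exported function per HTTP method.

// Small local helper (an assumption for this sketch) that builds a JSON Response.
function json(data: unknown, status = 200): Response {
  return new Response(JSON.stringify(data), {
    status,
    headers: { "Content-Type": "application/json" },
  });
}

// Handles GET /api/hello?name=...
export function GET(request: Request): Response {
  const name = new URL(request.url).searchParams.get("name") ?? "world";
  return json({ message: `Hello, ${name}` });
}

// Handles POST /api/hello with a JSON body.
export async function POST(request: Request): Promise<Response> {
  let payload: unknown;
  try {
    payload = await request.json();
  } catch {
    // Malformed JSON becomes a 400 instead of an unhandled error.
    return json({ error: "invalid JSON" }, 400);
  }
  return json({ received: payload }, 201);
}
```

Because both handlers use only Web-standard types, they can be unit-tested by constructing a `Request` directly, without starting a server.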

① Load a whole Node.js server

Vercel detects server.{js,cjs,mjs,ts,cts,mts} at the project root or in src/ and, using the server.listen() call as its cue, turns the HTTP server into a function. The port passed to listen() is for local execution; in production, requests are routed through an internal port, which is not exposed publicly.

// server.ts: a standard Node.js HTTP server (detected by Vercel)
import { createServer } from "node:http";

const server = createServer((request, response) => {
  const url = new URL(request.url ?? "/", `http://${request.headers.host}`);
  if (request.method === "GET" && url.pathname === "/health") {
    response.writeHead(200, { "Content-Type": "application/json" });
    response.end(JSON.stringify({ status: "ok" }));
    return;
  }
  response.writeHead(200, { "Content-Type": "text/plain" });
  response.end("Hello from Node.js on Vercel");
});

// Port for local runs. Vercel detects this listen() call and captures the server.
server.listen(Number(process.env.PORT ?? 3000));

② Load Hono (great fit with the fetch handler)

Because a Hono app exposes a Web-standard fetch handler, it plugs into Vercel's Web Handler as-is.

// api/index.ts: Hono as a fetch Web Handler
import { Hono } from "hono";

const app = new Hono();
app.get("/api/health", (c) => c.json({ status: "ok" }));
app.post("/api/echo", async (c) => c.json(await c.req.json()));

// Vercel runs the exported fetch as the function, unchanged
export default { fetch: app.fetch };

③ Load Express

Express is one of the most widely used Node.js frameworks, and Vercel has a dedicated guide for it. The basic options are to rely on server detection by calling listen(), or to turn the app into a handler with a serverless adapter.

// server.ts: Express deployed through Node.js server detection
import express from "express";

const app = express();
app.use(express.json());
app.get("/api/health", (_req, res) => res.json({ status: "ok" }));

app.listen(Number(process.env.PORT ?? 3000)); // detected by Vercel

④ NestJS

Because NestJS runs on an Express or Fastify adapter, the standard setup of calling listen() inside bootstrap() works as-is. Note that heavy initialization lengthens cold starts.

// server.ts
import { NestFactory } from "@nestjs/core";
import { AppModule } from "./app.module";

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  await app.listen(Number(process.env.PORT ?? 3000));
}
bootstrap();

⑤ FastAPI (Python)

Vercel runs Python (3.13/3.14) on Fluid Compute too. You can deploy FastAPI as an ASGI app.

# api/index.py: FastAPI on Vercel (Python runtime)
from fastapi import FastAPI

app = FastAPI()

@app.get("/api/health")
def health():
    return {"status": "ok"}

For FastAPI's own production design (async, Pydantic, DI), see the FastAPI production guide.

The biggest caveat: Fluid Compute's global state

Whichever framework you use, on Vercel (Fluid Compute) one instance processes multiple requests concurrently. Request-specific state placed in module scope can therefore leak from one request into another.

// ❌ Dangerous: holding the "current user" at the Express app level or in module scope
let currentUser; // shared across all requests

// ✅ Safe: keep request-specific data in locals inside the handler
app.get("/me", (req, res) => {
  const user = authenticate(req); // local to this request
  res.json(user);
});

// ✅ Only request-independent things belong in global scope
const pool = createPool(process.env.DATABASE_URL!); // connection pool

Developers used to a one-request-per-process server model are the most likely to overlook this. For details, see "shared global state" in the Functions & Fluid Compute guide.
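
To make the leak concrete, the following sketch simulates two overlapping requests on one instance. It is not from the official docs: the handler names and the `sleep` delays are invented for illustration, and the use of Node.js `AsyncLocalStorage` is one possible way (an addition here, not something the article prescribes) to carry request-scoped data through nested calls when passing locals around is impractical.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";
import { setTimeout as sleep } from "node:timers/promises";

// Unsafe: a module-scope variable is shared by every in-flight request on the instance.
let currentUser: string | undefined;

async function unsafeHandler(user: string, delayMs: number): Promise<string | undefined> {
  currentUser = user;   // request-specific data written into shared state
  await sleep(delayMs); // simulated I/O; other requests run in the meantime
  return currentUser;   // may now hold a different request's user
}

// Safer: each request gets its own store, visible only inside its async call chain.
const requestContext = new AsyncLocalStorage<{ user: string }>();

async function safeHandler(user: string, delayMs: number): Promise<string> {
  return requestContext.run({ user }, async () => {
    await sleep(delayMs);
    return requestContext.getStore()!.user; // always this request's own user
  });
}

// Runs two overlapping "requests" through each handler.
export async function demo(): Promise<{ unsafe: (string | undefined)[]; safe: string[] }> {
  const unsafe = await Promise.all([unsafeHandler("alice", 20), unsafeHandler("bob", 5)]);
  const safe = await Promise.all([safeHandler("alice", 20), safeHandler("bob", 5)]);
  return { unsafe, safe };
}
```

In this simulation the unsafe handler returns "bob" for alice's request, because bob's request overwrites the shared variable before alice's handler resumes, while the safe handler returns each request's own user.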

The premises of backend design (limits)

Design a backend on Vercel around these Functions limits:

Timeout: 300s by default, up to 800s on Pro/Enterprise. Send long-running processing to Workflows or queues.
Body: 4.5MB for request and response bodies. Route large uploads through Blob client uploads.
Memory/CPU: up to 4GB / 2 vCPU on Pro/Enterprise.
DB connections: serverless concurrency can exhaust them, so use pooled connections (connection pooling).
State: stay stateless and keep state in external stores (DB/Blob/Redis).
Cost: Active CPU billing (time spent waiting on I/O is not billed), so I/O-centric APIs are cost-efficient.
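
As a small illustration of designing around the body limit listed above, the sketch below decides, before an upload starts, whether a payload can pass through a function or should go straight to storage. Everything here is hypothetical: the function name, the route labels, and the headroom value are invented for the example, and interpreting 4.5MB as 4.5 × 1024 × 1024 bytes is an assumption to verify against the official limits page.

```typescript
// Assumption: the 4.5MB body limit is counted as 4.5 * 1024 * 1024 bytes.
const MAX_FUNCTION_BODY_BYTES = 4.5 * 1024 * 1024;

type UploadRoute = "through-function" | "direct-to-storage";

// Hypothetical helper: picks an upload path from the payload size.
// headroomBytes leaves room for multipart framing or JSON wrapping around the file.
export function chooseUploadRoute(sizeBytes: number, headroomBytes = 64 * 1024): UploadRoute {
  if (!Number.isFinite(sizeBytes) || sizeBytes < 0) {
    throw new RangeError("sizeBytes must be a non-negative finite number");
  }
  return sizeBytes + headroomBytes <= MAX_FUNCTION_BODY_BYTES
    ? "through-function"
    : "direct-to-storage";
}
```

A client could call this with `file.size` and, for the "direct-to-storage" case, switch to a client-side upload flow such as the Blob client upload mentioned in the list.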

Coexisting frontend and backend: Services

To run a Next.js frontend and a Node.js server side by side in the same project, use Services. The frontend runs as Next.js and the API runs as a separate Node.js server, both within one project. This suits building a BFF (backend for frontend) in a monorepo.

For a microservice-style split, it also works to keep the frontend in a Next.js project, put the API in a separate project, and connect the two with rewrites (a rewrite in vercel.ts).

When "a backend on Vercel" is a fit

Good fit	Poor fit
REST/GraphQL/BFF, webhook receivers	Always-on WebSocket servers (use SSE or an external platform)
I/O-centric workloads (DB/external API/AI)	CPU-heavy batch jobs running longer than a few minutes (move to Workflows)
An API running alongside Next.js	A huge monolith (bundle over 250MB)
Spiky traffic (auto-scaling)	A private database reachable only inside a VPC (needs Secure Compute or similar)

Poor fits are resolved by separation: long-running work goes to Workflows, always-on processes go to an external platform, and huge applications are split by feature.
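
One common shape of this separation is a webhook receiver that only validates and enqueues, leaving the slow work to a separate worker. The sketch below shows that shape under stated assumptions: the `JobQueue` interface, the `InMemoryQueue` class, and the job type string are all hypothetical stand-ins, not a Vercel Workflows or queue API. In production the queue would be an external service, since in-memory state does not survive across instances.

```typescript
// Hypothetical queue abstraction; a real implementation would call an external service.
interface Job {
  type: string;
  payload: unknown;
}

interface JobQueue {
  enqueue(job: Job): Promise<void>;
}

// In-memory stand-in, suitable only for local tests.
export class InMemoryQueue implements JobQueue {
  readonly jobs: Job[] = [];
  async enqueue(job: Job): Promise<void> {
    this.jobs.push(job);
  }
}

// Builds a POST handler that acknowledges quickly and defers the heavy work.
export function createWebhookHandler(queue: JobQueue) {
  return async function POST(request: Request): Promise<Response> {
    let payload: unknown;
    try {
      payload = await request.json();
    } catch {
      return new Response("invalid JSON", { status: 400 });
    }
    await queue.enqueue({ type: "webhook.received", payload });
    // 202 Accepted: the request was recorded, processing happens elsewhere.
    return new Response(null, { status: 202 });
  };
}
```

Injecting the queue keeps the handler free of module-scope request data and lets the same code run against a real queue client in production and the in-memory stand-in in tests.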

Production checklist (backend on Vercel)

Choose the form: server detection (server.listen()) or a Web Handler
Keep request-specific data out of global state
Use pooled DB connections and keep state in an external store (stateless)
Design around the 300s timeout and 4.5MB body limit, and split off any processing that exceeds them
Avoid heavy initialization to keep cold starts short
Run alongside the frontend with Services, or separate projects with rewrites
Plan costs around Active CPU billing and monitor with Observability

Summary

Vercel is not just "a place to put the frontend"; it is also a backend platform that runs Express, Hono, NestJS, and FastAPI with zero config.

Choose from the three forms (server detection / fetch Web Handler / per-method export)
Watch global state on Fluid Compute (keep request-specific data in locals)
Design for statelessness, pooled connections, the 4.5MB body limit, and the 300s timeout
Handle long-running, always-on, and oversized workloads by separating them
Run alongside the frontend with Services

I take on end-to-end builds of a Next.js frontend plus a Vercel backend (BFF, API, webhooks), from design through production operation.

This article is based on the official Node.js runtime and Functions limits documentation (as of June 2026). Specs change, so confirm the latest values in the official documentation before adopting this in production.

友田陽大

Developer of a METI Minister's Award–winning product. With TypeScript + Python + AWS, I deliver SaaS, industry DX, and production-grade generative AI (RAG) end to end — from requirements to infrastructure and operations — single-handedly.
