Infernet Protocol has four major components: the control plane, node daemons, inference backends, and the on-chain payment layer. Here's how they connect.
```
┌───────────────────────────────────────────────────────────────────┐
│                           CONTROL PLANE                           │
│                       (Next.js + Supabase)                        │
│                                                                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────────┐     │
│  │     Node     │  │     Job      │  │       Payment        │     │
│  │   Registry   │  │    Router    │  │      Accounting      │     │
│  └──────────────┘  └──────────────┘  └──────────────────────┘     │
└─────────────────────────────┬─────────────────────────────────────┘
                              │ HTTPS / SSE
         ┌────────────────────┼────────────────────┐
         │                    │                    │
         ▼                    ▼                    ▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│   NODE DAEMON    │ │   NODE DAEMON    │ │   NODE DAEMON    │
│    (infernet     │ │    (infernet     │ │    (infernet     │
│     start)       │ │     start)       │ │     start)       │
│                  │ │                  │ │                  │
│ ┌──────────────┐ │ │ ┌──────────────┐ │ │ ┌──────────────┐ │
│ │    Ollama    │ │ │ │     vLLM     │ │ │ │    SGLang    │ │
│ │  (default)   │ │ │ │              │ │ │ │              │ │
│ └──────────────┘ │ │ └──────────────┘ │ │ └──────────────┘ │
└────────┬─────────┘ └──────────────────┘ └──────────────────┘
         │
         ▼  on-chain
┌──────────────────────────────┐
│        PAYMENT LAYER         │
│    (EVM / Solana / etc.)     │
│  Compute Payment Receipts    │
└──────────────────────────────┘
```
The control plane is a Next.js application backed by Supabase. It serves two audiences:
- **The dashboard**: a web UI for operators to see their nodes' status, earnings, model inventory, and recent jobs. Clients can also use the dashboard to browse available models and manage API access.
- **The API**: the REST endpoints that clients call to submit jobs and that node daemons call to register, heartbeat, and poll for commands. The primary job submission endpoint is `POST /api/v1/jobs`.
Supabase handles persistence: node registrations, job records, payment accounting, and the command queue (model installs/removes). Supabase Realtime is used for live dashboard updates.
The control plane is open source. You can self-host it; see Chapter 6: Self-Hosting.
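A client call to the job submission endpoint can be sketched as follows. The `POST /api/v1/jobs` path comes from this section; the host, the body fields (`model`, `prompt`), and the Bearer-token auth scheme are assumptions for illustration.

```python
import json
import urllib.request

# Hypothetical control-plane host; replace with your deployment's URL.
API_BASE = "https://api.example-control-plane.dev"

def build_job_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but don't send) a job submission request.

    Only the /api/v1/jobs path is documented; the body schema and
    Authorization scheme here are illustrative assumptions.
    """
    body = json.dumps({"model": model, "prompt": prompt}).encode()
    return urllib.request.Request(
        f"{API_BASE}/api/v1/jobs",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

req = build_job_request("sk-test", "llama3", "Hello")
# Sending would be: urllib.request.urlopen(req)
```

Submitting returns a job id, which the client then uses to poll for status or open the SSE stream described later in this chapter.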
The node daemon is the process started by `infernet start`. It does several things:
- **Heartbeat loop**: every 30 seconds, the daemon sends a signed heartbeat to the control plane. The heartbeat includes the node public key, IP address, port, GPU stats, which models are currently loaded, and current load. If the control plane doesn't hear from a node for 90 seconds, it marks the node offline.
- **Command polling**: on each heartbeat cycle, the daemon checks a command queue in the control plane. Commands include `model_install` (pull a new model) and `model_remove` (evict a model). Operators issue these commands from the dashboard or CLI; the daemon picks them up and executes them.
- **Job execution**: when a job is routed to the node, the daemon receives it, calls the local inference backend, and streams the result back.
- **Auth**: every request the daemon makes is signed with the node's secp256k1 private key. The signature covers the HTTP method, path, body hash, a nonce, and a timestamp. This is carried in the `X-Infernet-Auth` header. The control plane verifies the signature against the node's registered public key. The private key never leaves the node.
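The auth scheme can be sketched roughly like this. The covered fields (method, path, body hash, nonce, timestamp) come from the list above; the canonical encoding, the header's JSON layout, and the example path are assumptions, and the stand-in signer would be a real secp256k1 signature (e.g. via a library such as coincurve) in the actual daemon.

```python
import hashlib
import json
import secrets
import time

def canonical_message(method: str, path: str, body: bytes,
                      nonce: str, timestamp: int) -> bytes:
    """Join the signed fields into one byte string.

    The field list matches the docs; this newline-joined encoding is an
    illustrative assumption, not the protocol's actual wire format.
    """
    body_hash = hashlib.sha256(body).hexdigest()
    return f"{method}\n{path}\n{body_hash}\n{nonce}\n{timestamp}".encode()

def auth_header(method: str, path: str, body: bytes, sign) -> str:
    """Produce an X-Infernet-Auth header value (layout assumed)."""
    nonce = secrets.token_hex(16)
    ts = int(time.time())
    digest = hashlib.sha256(canonical_message(method, path, body, nonce, ts)).hexdigest()
    signature = sign(digest)  # a secp256k1 signature in the real daemon
    return json.dumps({"nonce": nonce, "ts": ts, "sig": signature})

# Stand-in signer for illustration only; "/api/v1/heartbeat" is a
# hypothetical path, not a documented endpoint.
header = auth_header("POST", "/api/v1/heartbeat", b"{}", sign=lambda d: "0x" + d)
```

Because the nonce and timestamp are inside the signed message, the control plane can reject replayed or stale requests while still never seeing the private key.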
The daemon doesn't run inference itself. It delegates to one of five supported backends:
| Backend | Best for | Protocol |
|---|---|---|
| Ollama | General use, easy setup | REST at localhost:11434 |
| vLLM | High-throughput NVIDIA | OpenAI-compatible REST |
| SGLang | KV-cache reuse, structured output | OpenAI-compatible REST |
| Modular MAX | Throughput, modern NVIDIA | REST at configurable port |
| llama.cpp | CPU, Apple Silicon, GGUF models | REST via llama-swap |
The daemon probes each backend in priority order at startup and uses the first one that responds. You can override the selection with env vars.
Here's what happens from the moment a client submits a job to the moment they receive the last token:
```
Client                Control Plane               Node Daemon     Backend
  │                         │                       │               │
  │  POST /api/v1/jobs      │                       │               │
  │────────────────────────>│                       │               │
  │                         │  route to best node   │               │
  │                         │──────────────────────>│               │
  │                         │                       │ POST /generate│
  │                         │                       │──────────────>│
  │  GET /api/v1/jobs/:id/stream                    │               │
  │────────────────────────────────────────────────>│               │
  │                         │                       │<── token chunk│
  │<── SSE: {text:"..."} ───────────────────────────│               │
  │<── SSE: {text:"..."} ───────────────────────────│               │
  │<── SSE: [DONE] ─────────────────────────────────│               │
  │                         │                       │               │
  │                         │  job complete + CPR   │               │
  │                         │<──────────────────────│               │
```
The client can poll `GET /api/v1/jobs/:id` for status or open an SSE stream for real-time token delivery. Most applications use the stream.
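Consuming the stream amounts to standard SSE parsing. A minimal sketch, with the `{text: ...}` chunk shape and `[DONE]` sentinel taken from the diagram above and a canned sample standing in for the live HTTP response:

```python
import json

def iter_tokens(lines):
    """Yield text chunks from an iterable of raw SSE lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data: "):]
        if payload == "[DONE]":  # end-of-stream sentinel
            return
        yield json.loads(payload)["text"]

# Sample stream standing in for a live GET /api/v1/jobs/:id/stream response.
sample = [
    'data: {"text": "Hello"}',
    '',
    'data: {"text": ", world"}',
    'data: [DONE]',
]
print("".join(iter_tokens(sample)))  # Hello, world
```

In a real client the lines would come from the open HTTP response (or an off-the-shelf SSE library) rather than a list.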
- **Supabase as the coordination layer**, not a custom blockchain. Job routing, node registry, and command queuing run on Postgres with real-time capabilities. This keeps latency low and the stack familiar. On-chain components handle only what needs to be on-chain: payment settlement.
- **Nodes never trust the control plane with keys.** The secp256k1 auth model means the control plane can verify node identity without ever holding private keys. A compromised control plane cannot impersonate a node or steal earnings.
- **Backends are swappable.** The daemon speaks to all backends through a thin adapter layer. Adding a new backend is a matter of implementing the adapter, not changing the protocol.
- **SSE for streaming.** Server-Sent Events are simpler than WebSockets for unidirectional streaming and work through most proxies and CDNs without special configuration.