Infernet Protocol

Getting started

Pick your path

Four ways to use Infernet Protocol. Each track is a copy-paste sequence; most flows finish in five minutes. Need depth? See the full docs or the book.

Track 1

Use the network — chat / completions

The public endpoint is OpenAI-compatible. If you've ever called OpenAI's API, you already know how to use it. Try it in 30 seconds:

curl

curl https://infernetprotocol.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen2.5:7b",
        "messages": [{"role": "user", "content": "What is Bitcoin?"}],
        "stream": false
    }'

JavaScript / Node

import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://infernetprotocol.com/v1",
    apiKey: process.env.INFERNET_API_KEY ?? "no-key-needed-for-playground"
});

const res = await client.chat.completions.create({
    model: "qwen2.5:7b",
    messages: [{ role: "user", content: "Explain Schnorr signatures in 3 lines." }]
});
console.log(res.choices[0].message.content);

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://infernetprotocol.com/v1",
    api_key="no-key-needed-for-playground"
)

stream = client.chat.completions.create(
    model="qwen2.5:7b",
    messages=[{"role": "user", "content": "Who built Linux?"}],
    stream=True,
)
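for chunk in stream:
    # Some servers emit chunks with no choices (e.g. a trailing usage chunk); skip them.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="", flush=True)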

The browser-friendly version is at /chat. For the full route + parameter reference see docs → API.
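
If you would rather not depend on the openai package, the same call can be made with the Python standard library. The following is a minimal sketch that mirrors the curl example above. The helper names are illustrative, and the Bearer header is an assumption carried over from how OpenAI-compatible endpoints usually accept keys.

```python
import json
import urllib.request

BASE_URL = "https://infernetprotocol.com/v1"  # endpoint used in the examples above


def build_chat_request(prompt, model="qwen2.5:7b", api_key=None):
    """Build (but do not send) a POST request for /chat/completions."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: keys, when used, travel in a standard Bearer header.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_reply(response_json):
    """Pull the assistant text out of an OpenAI-style completion response."""
    return response_json["choices"][0]["message"]["content"]


def ask(prompt, timeout=60):
    """Send the request and return the reply text (performs a network call)."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_reply(json.load(resp))
```

Calling ask("What is Bitcoin?") performs the request; build_chat_request and extract_reply can be exercised offline.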

Track 2

Run a node and earn

Infernet runs as a workload plugin on top of c0mpute, the p2p substrate that provides peer discovery, gossipsub auctions, and a shared toolchain. Two commands install both. The c0mpute installer brings in mise, bun, and ffmpeg and lays down the c0mpute binary; the plugin installer drops the infernet CLI, sets up Ollama, opens the firewall port, and registers your node. This works on any Linux GPU box: RunPod, Vast, bare metal, or your home tower.

curl -fsSL https://c0mpute.com/install.sh | sh
c0mpute plugin install infernet

Then:

infernet init             # generates a Nostr keypair, picks defaults
infernet setup            # installs Ollama + a starter model + opens the firewall port
infernet register         # signs and announces your node to the network
infernet start            # starts the daemon (or use `infernet service enable`)

Your node is now live, and the dashboard at /dashboard shows its real-time status. Next, configure where earnings are sent:

infernet payout set --coin BTC --address bc1q...
infernet payout set --coin USDC --address 0x... --network arbitrum
infernet payout list

Quality-of-life commands:

infernet status                          # daemon health + last-seen
infernet model recommend --install-all   # auto-install the best models for your VRAM
infernet uncensored                      # one-shot install of Hermes 3 / Dolphin
infernet logs -f                         # tail the daemon log
infernet upgrade                         # pull the latest CLI

Full operator guide: Chapter 2 of the book.

Track 3

Train a custom model

Same shape as rockypod/svelte-coder: pick a topic, crawl the web, fine-tune, ship. Run it locally on a single GPU, or fan out across multiple nodes (your own or, experimentally, opted-in operators on the open network).

1. Crawl a query into a dataset

infernet train data \
    --query "svelte 5 framework documentation" \
    --domains svelte.dev,kit.svelte.dev,github.com \
    --num 30 \
    --out ./data/svelte5.jsonl

No search API key is needed: the network proxies the crawl and enforces a per-node daily quota. Self-hosting? Pass --direct with a VALUESERP_API_KEY to bypass the proxy.
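
Before training, it is worth checking that the crawl produced well-formed JSONL. The sketch below makes no assumption about the record schema infernet writes. It only counts lines that parse as JSON objects, records the line numbers that do not, and tallies the top-level keys it sees. The function name is illustrative.

```python
import json
from collections import Counter


def inspect_jsonl(lines):
    """Return (valid_count, bad_line_numbers, key_counts) for JSONL lines."""
    valid, bad, keys = 0, [], Counter()
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)
            continue
        if not isinstance(record, dict):
            bad.append(lineno)  # JSONL records are expected to be objects
            continue
        valid += 1
        keys.update(record.keys())
    return valid, bad, keys


# Small in-memory sample; in practice pass an open file handle instead.
sample = ['{"a": 1}', "", "not json", "[1, 2]", '{"a": 2, "b": 3}']
print(inspect_jsonl(sample))
```

To check a real file, pass the handle directly: with open("./data/svelte5.jsonl", encoding="utf-8") as fh: print(inspect_jsonl(fh)).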

2. Scaffold a config

infernet train init --output ./run

Edit run/infernet.train.yml — at minimum set:

name: svelte5-coder
base_model: unsloth/Qwen2.5-Coder-7B-Instruct
method: qlora
runtime: unsloth
workload_class: C1            # C1 local · C2 sweep · C3 federated

input:
  dataset: ./data/svelte5.jsonl
  format: chatml

training:
  epochs: 3
  learning_rate: 2.0e-4
  batch_size: 4
  max_seq_len: 4096

lora:
  rank: 16
  alpha: 32
  target_modules: [q_proj, k_proj, v_proj, o_proj]

resources:
  min_vram_gb: 24
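
As a rough worked example of what the lora section implies, the sketch below counts the trainable adapter parameters for rank 16 on the four attention projections. The projection shapes are assumptions taken from the published Qwen2.5-7B architecture (hidden size 3584, 28 decoder layers, 4 key/value heads of dimension 128). Check them against the base model's config.json before relying on the number.

```python
# Assumed (in_features, out_features) per projection for Qwen2.5-7B.
PROJ_SHAPES = {
    "q_proj": (3584, 3584),
    "k_proj": (3584, 512),   # 4 KV heads x 128 dims (grouped-query attention)
    "v_proj": (3584, 512),
    "o_proj": (3584, 3584),
}
NUM_LAYERS = 28  # assumed decoder layer count


def lora_param_count(rank, target_modules, shapes=PROJ_SHAPES, layers=NUM_LAYERS):
    """Each adapted matrix adds A (rank x in) and B (out x rank) parameters."""
    per_layer = sum(rank * (shapes[m][0] + shapes[m][1]) for m in target_modules)
    return per_layer * layers


def lora_scale(rank, alpha):
    """Standard LoRA scales the adapter update by alpha / rank."""
    return alpha / rank


trainable = lora_param_count(16, ["q_proj", "k_proj", "v_proj", "o_proj"])
print(trainable)           # 10092544
print(lora_scale(16, 32))  # 2.0
```

Under these assumptions, about 10.1 million parameters train, a small fraction of the 7B base model, which is why QLoRA fits in the 24 GB budget set under resources.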

3a. Train locally (single GPU)

infernet train run --local --config ./run/infernet.train.yml

Needs Python 3.10+ with unsloth, datasets, trl. Install once:

pip install unsloth datasets trl

3b. Train on the open network (federated LoRA)

Pay any opted-in operator on the network — not just your own nodes — to train shards. Your local infernet daemon hosts the shards directly over its existing reachable port; no S3, no HuggingFace dataset, no IPFS, no third-party storage anywhere.

infernet train run --open-market \
    --config ./run/infernet.train.yml \
    --budget 5.00 \
    --max-nodes 8

What happens: the CLI splits the dataset into 8 shards under ~/.infernet/training-runs/<run_id>/shards/ and posts a job with your daemon's URL. Operators across the network that have set INFERNET_ACCEPT_TRAINING=1 poll the market every 60s, race to claim shards, fetch them directly from your daemon, run Unsloth, and PUT the resulting adapter back. Once all shards have reported, the 8 adapters are merged on your side with FedAvg (federated averaging).
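
The shard-and-merge flow described above can be sketched in a few lines. This is an illustration of the general technique, not the CLI's actual implementation. The function names are hypothetical, adapters are modeled as plain dicts of float lists rather than tensors, and weighting each adapter by its shard size is an assumption (it is the usual FedAvg choice).

```python
def split_shards(records, num_shards):
    """Split records into num_shards contiguous, near-equal shards."""
    base, extra = divmod(len(records), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)  # first `extra` shards get one more
        shards.append(records[start:start + size])
        start += size
    return shards


def fedavg(adapters, weights):
    """Weighted average of adapter parameters, keyed by parameter name."""
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    merged = {}
    for name in adapters[0]:
        length = len(adapters[0][name])
        merged[name] = [
            sum(w * a[name][i] for a, w in zip(adapters, weights)) / total
            for i in range(length)
        ]
    return merged


shards = split_shards(list(range(30)), 8)
print([len(s) for s in shards])  # [4, 4, 4, 4, 4, 4, 3, 3]

adapters = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
print(fedavg(adapters, weights=[1, 3]))  # {'w': [2.5, 5.0]}
```

One caveat with averaging LoRA adapters this way: averaging the A and B matrices separately is not the same as averaging their products, so the merged adapter only approximates the average of the individual updates.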

If your machine is behind NAT, run cloudflared tunnel --url http://localhost:8080 and set INFERNET_DAEMON_ENDPOINT to the cloudflared URL.

Status: experimental. Single-GPU local mode is the well-trodden path right now.

Output: ./run/checkpoint-final/ — a HuggingFace-shape directory ready to publish.

Track 4

Publish to HuggingFace and Ollama

One command, two destinations. The fine-tune lands at huggingface.co/<org>/<name> AND ollama.com/<user>/<name>.

infernet publish ./run/checkpoint-final \
    --hf InfernetProtocol/svelte5-coder \
    --ollama infernet/svelte5-coder \
    --quant q4_k_m

What this runs under the hood:

  1. huggingface-cli upload — pushes safetensors to HF
  2. convert_hf_to_gguf.py — converts to f16 GGUF (needs llama.cpp at ~/llama.cpp)
  3. llama-quantize — quantizes to Q4_K_M (or your --quant)
  4. Auto-generated Modelfile with the ChatML template
  5. ollama create + ollama push
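
For orientation, here is a dry-run sketch of roughly equivalent manual commands for steps 1, 2, 3, and 5. It only builds and prints the command lines and does not execute anything. The flags shown belong to the underlying tools and are not confirmed to be what infernet passes. The working directory, file names, and the publish_plan helper are hypothetical, and the plan assumes a Modelfile (step 4) already exists in the working directory.

```python
import shlex
from pathlib import Path


def publish_plan(checkpoint, hf_repo, ollama_name, quant="q4_k_m",
                 llama_cpp="~/llama.cpp", workdir="./publish"):
    """Return the command lines for a manual publish, without running them."""
    ckpt = Path(checkpoint)
    llama = Path(llama_cpp).expanduser()
    work = Path(workdir)                       # hypothetical scratch directory
    f16 = work / "model-f16.gguf"              # hypothetical file names
    quantized = work / f"model-{quant}.gguf"
    return [
        # 1. push the safetensors checkpoint to HuggingFace
        ["huggingface-cli", "upload", hf_repo, str(ckpt)],
        # 2. convert to an f16 GGUF with llama.cpp's converter
        ["python", str(llama / "convert_hf_to_gguf.py"), str(ckpt),
         "--outfile", str(f16), "--outtype", "f16"],
        # 3. quantize (binary path assumes the cmake build shown under prereqs)
        [str(llama / "build" / "bin" / "llama-quantize"),
         str(f16), str(quantized), quant.upper()],
        # 5. build the Ollama model from the Modelfile, then push it
        ["ollama", "create", ollama_name, "-f", str(work / "Modelfile")],
        ["ollama", "push", ollama_name],
    ]


for cmd in publish_plan("./run/checkpoint-final",
                        "InfernetProtocol/svelte5-coder",
                        "infernet/svelte5-coder"):
    print(shlex.join(cmd))
```

Printing the plan first makes it easy to see which external tool a failed publish stopped at.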

One-time prereqs:

# llama.cpp for the GGUF convert
git clone https://github.com/ggml-org/llama.cpp ~/llama.cpp
cd ~/llama.cpp && cmake -B build && cmake --build build -j

# HF token with write scope
export HUGGINGFACE_TOKEN=hf_...

# Ollama signin (one time)
ollama signin

After publish, anyone with Ollama installed can pull your model:

ollama pull infernet/svelte5-coder
ollama run infernet/svelte5-coder "How do runes work in Svelte 5?"

Variants

  • --skip-hf — Ollama only
  • --skip-ollama — HuggingFace only
  • --modelfile-only — generate the Modelfile + GGUF locally, don't push anywhere