Infernet Protocol

Getting started

Pick your path

Four ways to use Infernet Protocol. Each track is a copy-paste sequence; most flows finish in five minutes. Need depth? See the full docs or the book.

Track 1

Use the network — chat / completions

The public endpoint is OpenAI-compatible. If you've ever called OpenAI's API, you already know how to use it. Try it in 30 seconds:

curl

curl https://infernetprotocol.com/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "qwen2.5:7b",
        "messages": [{"role": "user", "content": "What is Bitcoin?"}],
        "stream": false
    }'

JavaScript / Node

import OpenAI from "openai";

const client = new OpenAI({
    baseURL: "https://infernetprotocol.com/v1",
    apiKey: process.env.INFERNET_API_KEY ?? "no-key-needed-for-playground"
});

const res = await client.chat.completions.create({
    model: "qwen2.5:7b",
    messages: [{ role: "user", content: "Explain Schnorr signatures in 3 lines." }]
});
console.log(res.choices[0].message.content);

Python

from openai import OpenAI

client = OpenAI(
    base_url="https://infernetprotocol.com/v1",
    api_key="no-key-needed-for-playground"
)

stream = client.chat.completions.create(
    model="qwen2.5:7b",
    messages=[{"role": "user", "content": "Who built Linux?"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)

The browser-friendly version is at /chat. For the full route + parameter reference see docs → API.

Track 2

Run a node and earn

One curl command installs the CLI; the commands below set up Ollama, drop a systemd unit, and register your node with the control plane. Works on any Linux GPU box (RunPod, Vast, bare metal, or your home tower).

curl -fsSL https://infernetprotocol.com/install.sh | sh

Then:

infernet init             # generates a Nostr keypair, picks defaults
infernet setup            # installs Ollama + a starter model + opens the firewall port
infernet register         # signs and announces your node to the network
infernet start            # starts the daemon (or use `infernet service enable`)

Your node is now live. The dashboard at /dashboard shows real-time status; configure where to send earnings:

infernet payout set --coin BTC --address bc1q...
infernet payout set --coin USDC --address 0x... --network arbitrum
infernet payout list

Quality-of-life commands:

infernet status                          # daemon health + last-seen
infernet model recommend --install-all   # auto-install the best models for your VRAM
infernet uncensored                      # one-shot install of Hermes 3 / Dolphin
infernet logs -f                         # tail the daemon log
infernet upgrade                         # pull the latest CLI

Full operator guide: Chapter 2 of the book.

Track 3

Train a custom model

Same shape as rockypod/svelte-coder: pick a topic, crawl the web, fine-tune, ship. Run it on a single GPU you own, or fan the job out across every node you control.

1. Crawl a query into a dataset

infernet train data \
    --query "svelte 5 framework documentation" \
    --domains svelte.dev,kit.svelte.dev,github.com \
    --num 30 \
    --out ./data/svelte5.jsonl

Needs VALUESERP_API_KEY in env or under integrations.valueserp.api_key in your config.
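
Either export VALUESERP_API_KEY before running the command, or put the key in your CLI config. The YAML below simply mirrors the dotted path above; the exact config file location depends on your install:

integrations:
  valueserp:
    api_key: "YOUR_VALUESERP_KEY"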

2. Scaffold a config

infernet train init --output ./run

Edit run/infernet.train.yml — at minimum set:

name: svelte5-coder
base_model: unsloth/Qwen2.5-Coder-7B-Instruct
method: qlora
runtime: unsloth
workload_class: C1            # C1 local · C2 sweep · C3 federated

input:
  dataset: ./data/svelte5.jsonl
  format: chatml

training:
  epochs: 3
  learning_rate: 2.0e-4
  batch_size: 4
  max_seq_len: 4096

lora:
  rank: 16
  alpha: 32
  target_modules: [q_proj, k_proj, v_proj, o_proj]

resources:
  min_vram_gb: 24
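
The input.format: chatml setting expects chat-style rows: one JSON object per line with a messages array. If you bring your own dataset instead of the crawled one, a row generally looks like the line below (the field names follow the common ChatML convention used by most trainers; treat them as an assumption and double-check against the training docs):

{"messages": [{"role": "user", "content": "How do I declare reactive state in Svelte 5?"}, {"role": "assistant", "content": "Use the $state rune: let count = $state(0);"}]}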

3a. Train locally (single GPU)

infernet train run --local --config ./run/infernet.train.yml

Needs Python 3.10+ with unsloth, datasets, trl. Install once:

pip install unsloth datasets trl
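
For a sense of what that config turns into on the GPU, here is the standard Unsloth QLoRA recipe with the values from infernet.train.yml plugged in. It is a sketch, not the runner's actual code; the chat-template rendering step is an assumption about how the runner formats data, and trl argument names drift between versions:

# Roughly what `infernet train run --local` boils down to for this config.
# Sketch only: hyperparameters come from infernet.train.yml above, but the
# real runner may differ.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-Coder-7B-Instruct",   # base_model
    max_seq_length=4096,                               # training.max_seq_len
    load_in_4bit=True,                                 # method: qlora
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                              # lora.rank
    lora_alpha=32,                                     # lora.alpha
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Render each chatml row into a single "text" string with the model's chat template.
ds = load_dataset("json", data_files="./data/svelte5.jsonl", split="train")
ds = ds.map(lambda row: {"text": tokenizer.apply_chat_template(row["messages"], tokenize=False)})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=ds,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        num_train_epochs=3,                            # training.epochs
        learning_rate=2e-4,                            # training.learning_rate
        per_device_train_batch_size=4,                 # training.batch_size
        output_dir="./run/checkpoint-final",
    ),
)
trainer.train()
model.save_pretrained("./run/checkpoint-final")
tokenizer.save_pretrained("./run/checkpoint-final")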

3b. Train on the open network (federated LoRA)

Pay any opted-in operator on the network — not just your own nodes — to train shards. Your local infernet daemon hosts the shards directly over its existing reachable port; no S3, no HuggingFace dataset, no IPFS, no third-party storage anywhere.

infernet train run --open-market \
    --config ./run/infernet.train.yml \
    --budget 5.00 \
    --max-nodes 8

What happens: the CLI splits the dataset into 8 shards under ~/.infernet/training-runs/<run_id>/shards/ and posts a job with your daemon's URL. Operators across the network with INFERNET_ACCEPT_TRAINING=1 poll the market every 60s, race-claim shards, fetch directly from your daemon, run Unsloth, and PUT the resulting adapter back. You FedAvg the 8 adapters when all shards report.
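
The final merge is plain FedAvg over the returned adapters. A minimal sketch, assuming each shard comes back as a LoRA adapter in safetensors form (the adapters/ directory name below is an assumption, not the daemon's actual layout):

# FedAvg merge: element-wise mean of matching tensors across all shard adapters.
# Shards are an even split of the dataset, so an unweighted mean is fine;
# weight by shard size if your split is uneven.
from pathlib import Path

import torch
from safetensors.torch import load_file, save_file

adapter_dir = Path("~/.infernet/training-runs/<run_id>/adapters").expanduser()
shards = [load_file(str(p)) for p in sorted(adapter_dir.glob("*.safetensors"))]

merged = {
    name: torch.stack([s[name].float() for s in shards]).mean(dim=0)
    for name in shards[0]
}
save_file(merged, str(adapter_dir / "fedavg-adapter.safetensors"))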

If your machine is behind NAT, run cloudflared tunnel --url http://localhost:8080 and set INFERNET_DAEMON_ENDPOINT to the cloudflared URL. Status: experimental — single-GPU local mode is the well-trodden path right now.

Output: ./run/checkpoint-final/, a HuggingFace-style model directory ready to publish.

Track 4

Publish to HuggingFace and Ollama

One command, two destinations. The fine-tune lands at huggingface.co/<org>/<name> AND ollama.com/<user>/<name>.

infernet publish ./run/checkpoint-final \
    --hf InfernetProtocol/svelte5-coder \
    --ollama infernet/svelte5-coder \
    --quant q4_k_m

What this runs under the hood:

  1. huggingface-cli upload — pushes safetensors to HF
  2. convert_hf_to_gguf.py — converts to f16 GGUF (needs llama.cpp at ~/llama.cpp)
  3. llama-quantize — quantizes to Q4_K_M (or your --quant)
  4. Auto-generated Modelfile with the ChatML template (see the sketch after this list)
  5. ollama create + ollama push
  5. ollama create + ollama push
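
For step 4, the generated Modelfile is typically something like the sketch below; the GGUF filename is illustrative, and the FROM / TEMPLATE / PARAMETER lines follow standard Ollama Modelfile syntax with the ChatML wrapper:

FROM ./svelte5-coder-q4_k_m.gguf
TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
PARAMETER stop <|im_start|>
PARAMETER stop <|im_end|>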

One-time prereqs:

# llama.cpp for the GGUF convert
git clone https://github.com/ggml-org/llama.cpp ~/llama.cpp
cd ~/llama.cpp && cmake -B build && cmake --build build -j

# HF token with write scope
export HUGGINGFACE_TOKEN=hf_...

# Ollama signin (one time)
ollama signin

After publish, anyone with Ollama installed can pull your model:

ollama pull infernet/svelte5-coder
ollama run infernet/svelte5-coder "How do runes work in Svelte 5?"

Variants

  • --skip-hf — Ollama only
  • --skip-ollama — HuggingFace only
  • --modelfile-only — generate the Modelfile + GGUF locally, don't push anywhere