§ P.02 · Hub

Content-addressed model hub.
Uploaders own the attribution.

The Hub gives every model a permanent, cryptographic address. Upload any ONNX, GGUF, or safetensors file and Plumb hashes the raw bytes with keccak256: identical files always produce the same hash, and finding two different files with the same hash is computationally infeasible, so one model cannot pass for another. The hash is the only identifier the rest of the system needs.
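To make the content-addressing idea concrete, here is a minimal pure-Python sketch of keccak256 over raw bytes. It is an illustration, not the gateway's implementation: the function names (keccak256, model_address) are hypothetical, and the code is written without third-party packages because Python's hashlib.sha3_256 uses the later SHA-3 padding and therefore does not produce keccak256 digests.

example · python · content address from raw bytes
```python
MASK64 = (1 << 64) - 1
RATE_BYTES = 136  # 1088-bit rate, 512-bit capacity (Keccak-256)


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Keccak-f[1600] permutation on a 5x5 grid of 64-bit lanes (lanes[x][y])."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column parity into its neighbouring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho + pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: round constant generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original Keccak padding (0x01 ... 0x80)."""
    pad_len = RATE_BYTES - (len(data) % RATE_BYTES)
    padded = bytearray(data) + bytearray(pad_len)
    padded[len(data)] ^= 0x01
    padded[-1] ^= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), RATE_BYTES):
        block = padded[offset:offset + RATE_BYTES]
        for i in range(RATE_BYTES // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    # The 32-byte digest is the first four lanes of the state.
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def model_address(model_bytes: bytes) -> str:
    """Hypothetical helper: 0x-prefixed hex form of the hash."""
    return "0x" + keccak256(model_bytes).hex()
```

Hashing the same file twice yields the same address, and changing a single byte yields a different one. For large models a native keccak256 implementation would be the practical choice; this version only shows what is being computed.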

A second step publishes the (hash, uploader) tuple to HubRegistry on Base Sepolia. Once registered, the uploader's address is the canonical attribution target for any inference run against that model — ready for revenue routing or provenance queries without trusting the gateway's database.

H.01

Upload + register (two endpoints, one flow)

/hub/upload accepts the raw model bytes, authenticated with a session token. The response includes the keccak256 hash and the storage URL. /hub/models/:hash/register (admin-gated) puts the (hash, uploader) tuple on-chain. Registration can happen immediately or later, and it is idempotent: re-registering the same hash is safe.

example · curl · hub upload → register
# 1. upload raw bytes
curl -X POST https://api.plumbtech.xyz/hub/upload \
  -H "Authorization: Bearer $PLUMB_SESSION" \
  -H "Content-Type: application/octet-stream" \
  -H "X-Plumb-Model-Name: my-classifier" \
  -H "X-Plumb-Model-Framework: onnx" \
  --data-binary @my-classifier.onnx

# → {"hash":"0x7577…","sizeBytes":1023,"storageUrl":"plumb://0x7577…","framework":"onnx","name":"my-classifier"}

# 2. register on-chain (operator call — gateway submits the tx)
curl -X POST https://api.plumbtech.xyz/hub/models/0x7577…/register \
  -H "Authorization: Bearer $PLUMB_ADMIN_TOKEN"
# → {"hash":"0x7577…","txHash":"0xaf3…","metadataURI":"plumb://0x7577…","prevVersion":"0x000…"}
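
The same two calls can be built from Python. The sketch below mirrors the curl example using only the standard library; the helper names (build_upload_request, build_register_request, send) are hypothetical and are not the Python SDK's API. It constructs the requests without sending them, so the headers and URLs can be inspected first.

example · python · hub upload → register
```python
import json
import urllib.request

BASE_URL = "https://api.plumbtech.xyz"


def build_upload_request(model_bytes: bytes, session_token: str,
                         name: str, framework: str) -> urllib.request.Request:
    """Step 1: POST the raw model bytes to /hub/upload."""
    return urllib.request.Request(
        f"{BASE_URL}/hub/upload",
        data=model_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {session_token}",
            "Content-Type": "application/octet-stream",
            "X-Plumb-Model-Name": name,
            "X-Plumb-Model-Framework": framework,
        },
    )


def build_register_request(model_hash: str, admin_token: str) -> urllib.request.Request:
    """Step 2: ask the gateway to register the hash on-chain (admin token)."""
    return urllib.request.Request(
        f"{BASE_URL}/hub/models/{model_hash}/register",
        method="POST",
        headers={"Authorization": f"Bearer {admin_token}"},
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A caller would pass the "hash" field from the upload response to build_register_request, for example send(build_register_request(send(upload_request)["hash"], admin_token)).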

H.02

Inference against a Hub model

Once a model is registered, Plumb runs it locally via onnxruntime (for ONNX) or through the provider shim (for GGUF/safetensors via llama.cpp). The /hub/inference endpoint accepts the hash and input tensors; the response is dims + data, signed with the same receipt discipline as chat completions.

An LRU session cache keeps the most-used models warm in memory — cold load is 100-500ms for small models, repeat calls are sub-millisecond. There's a hard 30-second timeout per inference so an adversarial or stuck model can't stall the worker.
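
The warm/cold behaviour described above can be sketched as a small LRU cache keyed by model hash. This is an illustration under assumptions: SessionCache and its loader callback are hypothetical names, and the loader stands in for whatever actually builds a runtime session from the stored bytes.

example · python · LRU session cache
```python
from collections import OrderedDict
from typing import Callable, Generic, TypeVar

S = TypeVar("S")


class SessionCache(Generic[S]):
    """Keeps at most `capacity` loaded sessions, evicting the least recently used."""

    def __init__(self, capacity: int, loader: Callable[[str], S]) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._loader = loader
        self._sessions: "OrderedDict[str, S]" = OrderedDict()

    def get(self, model_hash: str) -> S:
        if model_hash in self._sessions:
            # Warm hit: mark as most recently used and return immediately.
            self._sessions.move_to_end(model_hash)
            return self._sessions[model_hash]
        # Cold load: the expensive path (reading and initialising the model).
        session = self._loader(model_hash)
        self._sessions[model_hash] = session
        if len(self._sessions) > self._capacity:
            # Evict the entry that has gone longest without being used.
            self._sessions.popitem(last=False)
        return session
```

With a capacity of 2, requesting models a, b, a, c evicts b (a was touched more recently), so a later request for b pays the cold-load cost again.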

H.03

Versions, not tags

Unlike Docker tags, Hub versions form a Merkle DAG: every registration carries a prevVersion field pointing at the prior registration's hash (or bytes32(0) for the first version). Off-chain consumers walk the chain to reconstruct history; no tag is mutable and no version can be rewritten in place.
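
Walking the chain off-chain amounts to following prevVersion pointers until bytes32(0). The sketch below assumes the registrations have already been fetched into a plain mapping from hash to prevVersion; walk_versions and that mapping are hypothetical, and how the data is read from HubRegistry is out of scope here.

example · python · walk version history
```python
from typing import Dict, List

ZERO_HASH = "0x" + "00" * 32  # bytes32(0): marks the first version


def walk_versions(latest: str, prev_version_of: Dict[str, str]) -> List[str]:
    """Return the history from `latest` back to the first version, newest first."""
    history: List[str] = []
    seen = set()
    current = latest
    while current != ZERO_HASH:
        if current in seen:
            # Should be impossible with content hashes; guards against bad input.
            raise ValueError(f"cycle detected at {current}")
        if current not in prev_version_of:
            raise KeyError(f"no registration found for {current}")
        seen.add(current)
        history.append(current)
        current = prev_version_of[current]
    return history
```

Because each entry points only backwards, publishing a new version never alters the history that older hashes resolve to.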

H.04

Size limits and storage backends

Default upload cap is 100 MB, configurable via PLUMB_HUB_MAX_UPLOAD_BYTES. Dev stores on the local filesystem at PLUMB_HUB_STORAGE_DIR; prod can swap to S3-compatible storage by implementing the StorageBackend interface. Uploads are DB-first: the gateway claims the row before writing bytes, so a crash between the two leaves no orphan files.
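
The DB-first ordering can be sketched as follows. Everything except the PLUMB_HUB_MAX_UPLOAD_BYTES variable is an assumption made for illustration: the table layout, the pending/stored status values, the store_upload helper, and reading "100 MB" as 100 MiB. SQLite stands in for the gateway's database.

example · python · DB-first upload
```python
import os
import sqlite3
from pathlib import Path

DEFAULT_MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumption: 100 MB read as MiB


def max_upload_bytes() -> int:
    """Upload cap, overridable through PLUMB_HUB_MAX_UPLOAD_BYTES."""
    return int(os.environ.get("PLUMB_HUB_MAX_UPLOAD_BYTES", DEFAULT_MAX_UPLOAD_BYTES))


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per stored model, keyed by content hash.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS models ("
        "hash TEXT PRIMARY KEY, size_bytes INTEGER NOT NULL, status TEXT NOT NULL)"
    )


def store_upload(conn: sqlite3.Connection, storage_dir: Path, model_hash: str,
                 data: bytes, write=Path.write_bytes) -> Path:
    if len(data) > max_upload_bytes():
        raise ValueError("upload exceeds PLUMB_HUB_MAX_UPLOAD_BYTES")
    # 1. Claim the row first, so every file on disk always has an owning row.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO models (hash, size_bytes, status) "
            "VALUES (?, ?, 'pending')",
            (model_hash, len(data)),
        )
    # 2. Write the bytes. A crash here leaves a 'pending' row, never a stray file.
    path = storage_dir / model_hash
    write(path, data)
    # 3. Mark the row as complete once the bytes are on disk.
    with conn:
        conn.execute("UPDATE models SET status = 'stored' WHERE hash = ?", (model_hash,))
    return path
```

A row left in the pending state is easy to find and retry or clean up, whereas a file written before its row would be invisible to the database.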

GUIDE

HubRegistry addresses + ABI
Python SDK hub client
Hub inference receipts