§ P.04 · Memory

Persistent memory.
Opt-in. Per-wallet. Deletable.

Memory in Plumb is three things at once: an extraction pipeline, a vector store, and a composed profile. All three are scoped to your session wallet — no cross-user mixing, no hidden global embedding. Opt in per turn, retrieve what's stored, delete anything, at any time.

M.01

Opt in per turn, not per account

Pass extra_body={ plumb_memory: true } on any chat-completion request and the gateway enqueues a BullMQ job on the memsync worker. The job runs an extractor model over the turn and gets back a list of { kind, content, confidence } facts. It then embeds each fact with your configured embedding model and inserts it into the memories table, which stores the embedding in a pgvector column.
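As a rough illustration of the step from extracted facts to stored rows, the sketch below maps { kind, content, confidence } facts to row objects that carry an embedding. It is not the memsync worker's code: the Fact and MemoryRow classes, the facts_to_rows helper, the wallet field, the "skill" kind label, and the toy_embed stand-in are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Fact:
    """One extractor output item: { kind, content, confidence }."""
    kind: str          # "skill" below is a made-up kind label
    content: str
    confidence: float


@dataclass(frozen=True)
class MemoryRow:
    """Hypothetical shape of a stored row; the real schema may differ."""
    wallet: str
    kind: str
    content: str
    confidence: float
    embedding: List[float]


def facts_to_rows(
    wallet: str,
    facts: Sequence[Fact],
    embed: Callable[[str], Sequence[float]],
) -> List[MemoryRow]:
    """Embed each fact and scope the resulting row to a single wallet."""
    return [
        MemoryRow(wallet, f.kind, f.content, f.confidence, list(embed(f.content)))
        for f in facts
    ]


def toy_embed(text: str) -> List[float]:
    # Stand-in for a real embedding model: character and space counts only.
    return [float(len(text)), float(text.count(" "))]


rows = facts_to_rows("0xabc", [Fact("skill", "Go developer", 0.9)], toy_embed)
```

Every row carries the wallet it was written for, which is what keeps retrieval and deletion scoped to one session wallet in this sketch.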

Turns without the flag write nothing. There is no default-on memory, no global fact mining. If you never send plumb_memory: true, your memories row count stays at zero.

M.02

Retrieval on later turns

On subsequent turns with memory enabled, the top-K nearest memories by cosine distance are injected into the system prompt before the upstream call. Retrieval is a millisecond-scale pgvector HNSW lookup. It makes no external embedding API call: the query embedding is computed once and reused.
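The ranking and injection described above can be sketched in a few lines. This version scans every memory with a brute-force cosine distance instead of using an HNSW index, so it shows the ordering but not the performance. The function names and the "Known about this user" prompt wording are assumptions made for the example, not the gateway's actual format.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; a zero vector is treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    memories: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[str]:
    """Return the contents of the k memories closest to the query embedding."""
    ranked = sorted(memories, key=lambda m: cosine_distance(query, m[1]))
    return [content for content, _ in ranked[:k]]


def inject(system_prompt: str, recalled: Sequence[str]) -> str:
    """Append recalled memories to the system prompt (hypothetical wording)."""
    if not recalled:
        return system_prompt
    bullets = "\n".join(f"- {c}" for c in recalled)
    return f"{system_prompt}\n\nKnown about this user:\n{bullets}"


memories = [
    ("Go developer", [1.0, 0.0]),
    ("Learning React", [0.9, 0.1]),
    ("Likes tea", [0.0, 1.0]),
]
recalled = top_k([1.0, 0.0], memories, k=2)
prompt = inject("You are a helpful assistant.", recalled)
```

An HNSW index returns approximate nearest neighbours, so a production lookup can occasionally differ from this exact scan.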

M.03

The composed profile

/memsync/profile returns a denormalized JSON view of who the user is — composed deterministically from the memory rows by kind and confidence. Same inputs produce the same profile; the operator can't hide fields or tweak the composition without shipping new code.
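One way to get "same inputs, same profile" is to make every ordering decision explicit. The sketch below groups rows by kind, sorts within each group by confidence and then by content, and serializes with sorted keys. The compose_profile name, the 0.5 confidence threshold, and the output shape are assumptions for illustration; the real /memsync/profile response may look different.

```python
import json
from collections import defaultdict
from typing import Dict, List, Sequence


def compose_profile(rows: Sequence[dict], min_confidence: float = 0.5) -> str:
    """Build a JSON profile that does not depend on the order rows arrive in."""
    by_kind: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        if row["confidence"] >= min_confidence:  # threshold is an assumption
            by_kind[row["kind"]].append(row)

    profile = {}
    for kind in sorted(by_kind):
        # Highest confidence first; ties are broken by content so order is total.
        ordered = sorted(by_kind[kind], key=lambda r: (-r["confidence"], r["content"]))
        profile[kind] = [r["content"] for r in ordered]
    return json.dumps(profile, sort_keys=True)


rows = [
    {"kind": "skill", "content": "Go", "confidence": 0.9},
    {"kind": "goal", "content": "Learn React", "confidence": 0.8},
    {"kind": "skill", "content": "Maybe Rust", "confidence": 0.2},
]
profile_json = compose_profile(rows)
```

Reversing the input rows yields the same string, which is the property the paragraph above describes.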

M.04

Enable memory on a turn

example · python · chat with plumb_memory=true

import os

from plumb_sdk import Client

c = Client(base_url="https://api.plumbtech.xyz", session_token=os.environ["PLUMB_SESSION"])

# Turn 1 -- extractor runs in background, rows written after response.
c.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "I'm a Go dev learning React for a side project"}],
    extra_body={"plumb_memory": True},
)

# Turn 2 -- retrieval injects prior memories into system prompt automatically.
resp = c.chat.completions.create(
    model="anthropic/claude-sonnet-4.5",
    messages=[{"role": "user", "content": "what should I focus on first?"}],
    extra_body={"plumb_memory": True},
)

# Inspect the composed profile -- same inputs, same output.
profile = c.memsync.profile()
print(profile.summary)  # e.g., "Go dev learning React, prefers X..."

M.05

Delete anything

Every memory has a stable UUID. DELETE /memsync/memories/:id is a hard delete — no soft delete, no archived table, no tombstone that can be re-hydrated. The console has a per-memory delete button with a confirmation; the Python SDK exposes client.memsync.delete(id).
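For callers not using the SDK, the delete can be expressed as a plain HTTP request. The sketch below only builds the request object and does not send it. It assumes the route is served from the same base URL as the chat example and that the session token goes in a Bearer Authorization header; neither is confirmed here, and build_delete_request is a hypothetical helper.

```python
import urllib.request
import uuid

BASE_URL = "https://api.plumbtech.xyz"  # assumed to host /memsync as well


def build_delete_request(memory_id: str, session_token: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE for one memory row."""
    # uuid.UUID raises ValueError on malformed ids, so bad input fails early.
    canonical = str(uuid.UUID(memory_id))
    return urllib.request.Request(
        f"{BASE_URL}/memsync/memories/{canonical}",
        method="DELETE",
        # Bearer auth is an assumption about how the session token is passed.
        headers={"Authorization": f"Bearer {session_token}"},
    )


req = build_delete_request("123e4567-e89b-12d3-a456-426614174000", "example-token")
```

Passing req to urllib.request.urlopen would issue the call. Because the delete is hard, there is no undo step to sketch.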

GUIDE