Baseten
Configure jambonz to use Baseten’s hosted open-weight model catalog or a dedicated Baseten deployment.
Baseten hosts open-weight models (DeepSeek, GLM, Kimi, MiniMax, Nemotron, GPT-OSS, …) behind an OpenAI-compatible API. Two surfaces are supported:
- Model APIs (default) — https://inference.baseten.co/v1. A shared, curated catalog of hosted models. GET /v1/models returns the live list.
- Bridge / Direct — https://bridge.baseten.co/v1/direct. Routes to a dedicated deployment you’ve spun up in your Baseten account.
The wire is OpenAI-compatible (chat completions, streaming, tools), so the integration is straightforward.
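Because the wire is OpenAI-compatible, a plain chat-completions request works against either base URL. A minimal sketch (the model id here is a placeholder — list the ids your key can actually reach via GET /v1/models):

```typescript
// Sketch: an OpenAI-compatible chat-completions call against Baseten's
// Model APIs surface. Swap BASE_URL for the Bridge endpoint to hit a
// dedicated deployment; the request shape is identical.
const BASE_URL = "https://inference.baseten.co/v1";

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  stream: boolean;
}

// Pure helper so the request shape is easy to inspect and test.
function buildChatRequest(model: string, userText: string): ChatRequest {
  return {
    model, // placeholder — use an id returned by GET /v1/models
    messages: [{ role: "user", content: userText }],
    stream: true, // jambonz-style usage streams tokens as they arrive
  };
}

async function chat(apiKey: string, req: ChatRequest): Promise<Response> {
  return fetch(`${BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(req),
  });
}
```

The same bearer-token auth works on both surfaces; only the base URL changes.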
Get credentials
- Sign in at https://app.baseten.co.
- Open Settings → API Keys.
- Create a key and copy the value.
Configure in jambonz
In the portal: Account → LLM Services → + Add LLM Service → Baseten. Two fields matter:
- API key — the key from Baseten’s dashboard.
- Base URL — defaults to https://inference.baseten.co/v1 (Model APIs, the shared catalog). Set it to https://bridge.baseten.co/v1/direct to route to a dedicated deployment, or to your own proxy endpoint.
Click Test to verify.
Use in an agent verb
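A minimal sketch of a webhook response using the llm verb. The field names follow jambonz’s llm verb as used with other OpenAI-compatible vendors; the model id and hook paths below are placeholders, and the base URL is not set here — it lives on the portal credential:

```typescript
// Sketch: a jambonz llm verb pointing at Baseten. vendor stays 'baseten'
// for both the Model APIs catalog and a Bridge deployment.
interface LlmVerb {
  verb: "llm";
  vendor: string;
  model: string;
  actionHook: string; // called when the LLM session ends (placeholder path)
  eventHook: string; // receives session events (placeholder path)
}

function basetenLlmVerb(model: string): LlmVerb {
  return {
    verb: "llm",
    vendor: "baseten",
    model, // e.g. an id from GET /v1/models, or your deployment's model id
    actionHook: "/llm-action",
    eventHook: "/llm-event",
  };
}
```

Your application returns this object (typically alongside other verbs) from its webhook; jambonz resolves the credential — including the base URL — from the LLM service you configured in the portal.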
Available models
The Model APIs catalog is curated — check GET /v1/models with your key for the live list of model ids.
For dedicated deployments via the Bridge endpoint, the model value is whatever model id your deployment serves.
Quirks & errors
Two endpoints, one vendor: keep the vendor: 'baseten' constant — only the Base URL (in the portal credential) differs between the shared Model APIs catalog and a dedicated Bridge deployment. The same auth shape works for both.
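The point above reduces to a single switch — everything else (vendor string, bearer auth, request shape) is identical across the two surfaces:

```typescript
// Sketch: the only per-surface difference is the Base URL stored on the
// jambonz credential. The surface type here is illustrative, not a
// jambonz API.
type Surface = "model-apis" | "bridge";

function baseUrlFor(surface: Surface): string {
  return surface === "bridge"
    ? "https://bridge.baseten.co/v1/direct" // dedicated deployment
    : "https://inference.baseten.co/v1"; // shared Model APIs catalog
}
```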
Tool-calling capability varies by model. The wire format is OpenAI-compatible, but whether a given Baseten-hosted model actually executes tools depends on the underlying model’s training. DeepSeek V4 and GLM 5 generally handle tool calls well; smaller models may be inconsistent. Test thoroughly.
401 — key revoked or copied with whitespace. Regenerate at app.baseten.co/settings/api_keys.
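A defensive trim before saving the credential guards against the whitespace case. A sketch — cleanApiKey is an illustrative helper, not part of jambonz or Baseten:

```typescript
// Sketch: a key pasted with a trailing newline is byte-different from the
// real key, so Baseten rejects it with 401. Trim before storing.
function cleanApiKey(raw: string): string {
  const key = raw.trim();
  if (key.length === 0) {
    throw new Error("empty API key");
  }
  return key;
}
```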