Baseten
Configure jambonz to use Baseten’s hosted open-weight model catalog or a dedicated Baseten deployment.
Configure jambonz to use Baseten’s hosted open-weight model catalog or a dedicated Baseten deployment.
Baseten hosts open-weight models (DeepSeek, GLM, Kimi, MiniMax, Nemotron, GPT-OSS, …) behind an OpenAI-compatible API. Two surfaces are supported:
https://inference.baseten.co/v1. A shared, curated catalog of hosted models. GET /v1/models returns the live list.https://bridge.baseten.co/v1/direct. Routes to a dedicated deployment you’ve spun up in your Baseten account.The wire is OpenAI-compatible (chat completions, streaming, tools), so the integration is straightforward.
In the portal: Account → LLM Services → + Add LLM Service → Baseten.
The key from Baseten’s dashboard.
Defaults to https://inference.baseten.co/v1 (Model APIs — the shared catalog). Set to https://bridge.baseten.co/v1/direct to route to a dedicated deployment, or to your own proxy endpoint.
Click Test to verify.
The current Model APIs catalog includes (curated subset — check /v1/models for the live list):
For dedicated deployments via the Bridge endpoint, the model value is whatever model id your deployment serves.
Two endpoints, one vendor: keep the vendor: 'baseten' constant — only the Base URL (in the portal credential) differs between the shared Model APIs catalog and a dedicated Bridge deployment. The same auth shape works for both.
Tool-calling capability varies by model. The wire format is OpenAI-compatible, but whether a given Baseten-hosted model actually executes tools depends on the underlying model’s training. DeepSeek V4 and GLM 5 generally do tools well; smaller models may be inconsistent. Test thoroughly.
401 — key revoked or copied with whitespace. Regenerate at app.baseten.co/settings/api_keys.