Baseten

Configure jambonz to use Baseten’s hosted open-weight model catalog or a dedicated Baseten deployment.

Baseten hosts open-weight models (DeepSeek, GLM, Kimi, MiniMax, Nemotron, GPT-OSS, …) behind an OpenAI-compatible API. Two surfaces are supported:

  • Model APIs (default) — https://inference.baseten.co/v1. A shared, curated catalog of hosted models. GET /v1/models returns the live list.
  • Bridge / Direct — https://bridge.baseten.co/v1/direct. Routes to a dedicated deployment you’ve spun up in your Baseten account.

The wire is OpenAI-compatible (chat completions, streaming, tools), so the integration is straightforward.
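Because the wire format is plain OpenAI chat completions, any OpenAI-style client works by pointing it at the Baseten base URL. A minimal sketch using Node’s built-in fetch — the model id and prompt are illustrative, and BASETEN_API_KEY is assumed to hold a real key:

```javascript
// Build an OpenAI-style chat-completions request against Baseten.
// Only the base URL distinguishes Model APIs from a Bridge deployment.
function buildChatRequest(baseUrl, apiKey, model, messages) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      // stream: true gives SSE chunks, as with any OpenAI-compatible API
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}

// Illustrative usage (needs a real key and network access):
const { url, options } = buildChatRequest(
  'https://inference.baseten.co/v1',
  process.env.BASETEN_API_KEY ?? 'sk-example',
  'deepseek-ai/DeepSeek-V3.1',
  [{ role: 'user', content: 'Say hello.' }]
);
// const res = await fetch(url, options);
```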

Get credentials

  1. Sign in at https://app.baseten.co.
  2. Open Settings → API Keys.
  3. Create a key and copy the value.

Configure in jambonz

In the portal: Account → LLM Services → + Add LLM Service → Baseten.

API Key
string · Required

The key from Baseten’s dashboard.

Base URL
string · Optional

Defaults to https://inference.baseten.co/v1 (Model APIs — the shared catalog). Set to https://bridge.baseten.co/v1/direct to route to a dedicated deployment, or to your own proxy endpoint.

Click Test to verify.
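You can also smoke-test a key outside the portal: a GET to /v1/models with the key in the Authorization header returns the catalog visible to that key (a 401 means the key is bad). A sketch, assuming Node 18+ for built-in fetch:

```javascript
// Normalize a trailing slash so '.../v1' and '.../v1/' behave alike.
function modelsUrl(baseUrl) {
  return `${baseUrl.replace(/\/+$/, '')}/models`;
}

// List the model ids visible to this credential. The response is the
// OpenAI-style shape: { data: [{ id, ... }, ...] }.
async function listModels(baseUrl, apiKey) {
  const res = await fetch(modelsUrl(baseUrl), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`Baseten returned ${res.status}`);
  const { data } = await res.json();
  return data.map((m) => m.id);
}

// listModels('https://inference.baseten.co/v1', process.env.BASETEN_API_KEY)
//   .then((ids) => console.log(ids));
```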

Use in an agent verb

session.agent({
  llm: {
    vendor: 'baseten',
    model: 'deepseek-ai/DeepSeek-V3.1',
    llmOptions: {
      systemPrompt: 'You are a helpful voice assistant.',
    },
  },
  stt: { vendor: 'deepgram', language: 'en-US' },
  tts: { vendor: 'cartesia', voice: 'sonic-english' },
  turnDetection: 'krisp',
  bargeIn: { enable: true },
  actionHook: '/agent-complete',
}).send();

Available models

The current Model APIs catalog includes (curated subset — check /v1/models for the live list):

Model id                        Notes
deepseek-ai/DeepSeek-V3.1       DeepSeek V3.1, 163k context, tool-capable
deepseek-ai/DeepSeek-V4-Pro     Newer DeepSeek flagship
zai-org/GLM-4.7                 Zhipu GLM 4.7
zai-org/GLM-5                   Zhipu GLM 5
moonshotai/Kimi-K2.5            Moonshot Kimi K2.5
moonshotai/Kimi-K2.6            Moonshot Kimi K2.6
MiniMaxAI/MiniMax-M2.5          MiniMax M2.5
nvidia/Nemotron-120B-A12B       NVIDIA Nemotron Super
openai/gpt-oss-120b             OpenAI’s open-weight 120B model

For dedicated deployments via the Bridge endpoint, the model value is whatever model id your deployment serves.
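Concretely, the agent verb is unchanged for a Bridge deployment — only the model id differs (and the portal credential’s Base URL points at bridge.baseten.co). A sketch, where 'my-finetuned-model' is a hypothetical placeholder for the id your deployment serves:

```javascript
// Same vendor, same verb; the credential's Base URL is set to
// https://bridge.baseten.co/v1/direct in the portal.
session.agent({
  llm: {
    vendor: 'baseten',
    model: 'my-finetuned-model', // hypothetical: use your deployment's model id
    llmOptions: {
      systemPrompt: 'You are a helpful voice assistant.',
    },
  },
  stt: { vendor: 'deepgram', language: 'en-US' },
  tts: { vendor: 'cartesia', voice: 'sonic-english' },
  actionHook: '/agent-complete',
}).send();
```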

Quirks & errors

Two endpoints, one vendor: keep the vendor: 'baseten' constant — only the Base URL (in the portal credential) differs between the shared Model APIs catalog and a dedicated Bridge deployment. The same auth shape works for both.

Tool-calling capability varies by model. The wire format is OpenAI-compatible, but whether a given Baseten-hosted model actually executes tools depends on the underlying model’s training. DeepSeek V4 and GLM 5 generally do tools well; smaller models may be inconsistent. Test thoroughly.

401 — the key was revoked or copied with leading/trailing whitespace. Regenerate it at app.baseten.co/settings/api_keys.