Baseten

Configure jambonz to use Baseten’s hosted open-weight model catalog or a dedicated Baseten deployment.

Baseten hosts open-weight models (DeepSeek, GLM, Kimi, MiniMax, Nemotron, GPT-OSS, …) behind an OpenAI-compatible API. Two surfaces are supported:

  • Model APIs (default) — https://inference.baseten.co/v1. A shared, curated catalog of hosted models. GET /v1/models returns the live list.
  • Bridge / Direct — https://bridge.baseten.co/v1/direct. Routes to a dedicated deployment you’ve spun up in your Baseten account.

The wire format is OpenAI-compatible (chat completions, streaming, tools), so existing OpenAI-style request and response handling carries over.
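Because the Model APIs surface follows the OpenAI wire format, the live catalog can be read with a plain HTTP GET. The sketch below is an illustration only and not part of jambonz. The helper names (buildModelsRequest, extractModelIds, listBasetenModels) are hypothetical, and it assumes that the response uses the OpenAI list shape ({ data: [{ id }] }) and that the key is accepted as a Bearer token, the way OpenAI-compatible clients send it.

```javascript
// Default surface named on this page: the shared Model APIs catalog.
const MODEL_APIS_BASE_URL = 'https://inference.baseten.co/v1';

// Build the URL and fetch options for GET /v1/models (pure, no network).
function buildModelsRequest(apiKey, baseUrl = MODEL_APIS_BASE_URL) {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/models`,
    options: {
      method: 'GET',
      // Assumption: Bearer auth, as sent by OpenAI-compatible clients.
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// Pull model ids out of an OpenAI-style list response: { data: [{ id }, ...] }.
function extractModelIds(payload) {
  const items = Array.isArray(payload && payload.data) ? payload.data : [];
  return items.map((m) => m.id).filter((id) => typeof id === 'string');
}

// Fetch the live catalog. Requires Node 18+ for the global fetch.
async function listBasetenModels(apiKey) {
  const { url, options } = buildModelsRequest(apiKey);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`GET ${url} failed with HTTP ${res.status}`);
  return extractModelIds(await res.json());
}
```

A call such as listBasetenModels(process.env.BASETEN_API_KEY).then(console.log) would print the ids, where BASETEN_API_KEY is a hypothetical environment variable holding the key.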

Get credentials

  1. Sign in at https://app.baseten.co.
  2. Open Settings → API Keys (https://app.baseten.co/settings/api_keys).
  3. Create a key and copy the value.

Configure in jambonz

In the portal: Account → LLM Services → + Add LLM Service → Baseten.

API Key (string, required)

The key from Baseten’s dashboard.

Base URL (string, optional)

Defaults to https://inference.baseten.co/v1 (Model APIs — the shared catalog). Set to https://bridge.baseten.co/v1/direct to route to a dedicated deployment, or to your own proxy endpoint.

Click Test to verify.

Use in an agent verb

session.agent({
  llm: {
    vendor: 'baseten',
    model: 'deepseek-ai/DeepSeek-V3.1',
    llmOptions: {
      systemPrompt: 'You are a helpful voice assistant.',
    },
  },
  stt: { vendor: 'deepgram', language: 'en-US' },
  tts: { vendor: 'cartesia', voice: 'sonic-english' },
  turnDetection: 'krisp',
  bargeIn: { enable: true },
  actionHook: '/agent-complete',
}).send();

Available models

The Model APIs catalog currently includes the models below. This is a curated subset; check GET /v1/models for the live list.

| Model id | Notes |
| --- | --- |
| deepseek-ai/DeepSeek-V3.1 | DeepSeek V3.1, 163k context, tool-capable |
| deepseek-ai/DeepSeek-V4-Pro | Newer DeepSeek flagship |
| zai-org/GLM-4.7 | Zhipu GLM 4.7 |
| zai-org/GLM-5 | Zhipu GLM 5 |
| moonshotai/Kimi-K2.5 | Moonshot Kimi K2.5 |
| moonshotai/Kimi-K2.6 | Moonshot Kimi K2.6 |
| MiniMaxAI/MiniMax-M2.5 | MiniMax M2.5 |
| nvidia/Nemotron-120B-A12B | NVIDIA Nemotron Super |
| openai/gpt-oss-120b | OpenAI’s open-weight 120B model |

For dedicated deployments via the Bridge endpoint, the model value is whatever model id your deployment serves.

Quirks & errors

Two endpoints, one vendor. Use vendor: 'baseten' for both surfaces; only the Base URL in the portal credential differs between the shared Model APIs catalog and a dedicated Bridge deployment. The same API-key authentication works for both.
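As a rough illustration of that split, the sketch below keeps the two base URLs from this page in one place and builds the llm block separately. The helper names (resolveBaseUrl, buildLlmConfig) and the surface keys ('model-apis', 'bridge') are hypothetical conveniences, not jambonz or Baseten identifiers; the Base URL itself is still entered in the portal credential, not in the verb.

```javascript
// The two base URLs named on this page, keyed by a hypothetical surface name.
const BASETEN_BASE_URLS = {
  'model-apis': 'https://inference.baseten.co/v1',
  bridge: 'https://bridge.baseten.co/v1/direct',
};

// Map a surface name to the Base URL to enter in the portal credential.
function resolveBaseUrl(surface = 'model-apis') {
  if (!Object.prototype.hasOwnProperty.call(BASETEN_BASE_URLS, surface)) {
    throw new Error(`Unknown Baseten surface: ${surface}`);
  }
  return BASETEN_BASE_URLS[surface];
}

// The llm block is the same for both surfaces; only the model id changes.
function buildLlmConfig(model, systemPrompt) {
  return { vendor: 'baseten', model, llmOptions: { systemPrompt } };
}
```

For a dedicated deployment, the model argument would be whatever id that deployment serves, while the vendor value stays unchanged.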

Tool-calling capability varies by model. The wire format is OpenAI-compatible, but whether a given Baseten-hosted model reliably emits tool calls depends on the underlying model’s training. DeepSeek V4 and GLM 5 generally handle tools well; smaller models may be inconsistent. Test tool calling against the specific model before relying on it.
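One way to check a candidate model is to send a single chat-completions request that offers a tool and see whether the reply contains a tool call. The sketch below assumes the standard OpenAI request and response shapes (tools in the request, choices[0].message.tool_calls in the response) and Bearer auth. The get_weather tool and the helper names are hypothetical and exist only for this probe.

```javascript
// Hypothetical tool definition in the OpenAI "tools" format.
const WEATHER_TOOL = {
  type: 'function',
  function: {
    name: 'get_weather',
    description: 'Look up the current weather for a city.',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
};

// Build a chat-completions body that should provoke a tool call.
function buildToolProbeBody(model) {
  return {
    model,
    messages: [{ role: 'user', content: 'What is the weather in Paris right now?' }],
    tools: [WEATHER_TOOL],
  };
}

// Return the names of the tools the model asked to call (empty if none).
function requestedToolNames(completion) {
  const message = completion?.choices?.[0]?.message;
  const calls = Array.isArray(message?.tool_calls) ? message.tool_calls : [];
  return calls.map((c) => c.function?.name).filter(Boolean);
}

// Send the probe. Requires Node 18+ for the global fetch.
async function probeToolCalling(apiKey, model, baseUrl = 'https://inference.baseten.co/v1') {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: 'POST',
    // Assumption: Bearer auth, as sent by OpenAI-compatible clients.
    headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
    body: JSON.stringify(buildToolProbeBody(model)),
  });
  if (!res.ok) throw new Error(`Probe failed with HTTP ${res.status}`);
  return requestedToolNames(await res.json());
}
```

An empty result from probeToolCalling suggests the model answered in plain text instead of calling the tool. A single probe is only a smoke test, so repeating it with a few different prompts gives a better picture.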

401 Unauthorized: the key was revoked or was copied with stray whitespace. Regenerate it at https://app.baseten.co/settings/api_keys.
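Whitespace problems can be caught before the key is ever saved. The sketch below is a hypothetical helper (normalizeApiKey is not a jambonz or Baseten function) that trims a pasted value and rejects one that still contains whitespace. It makes no assumption about the key's format beyond that.

```javascript
// Hypothetical helper: clean up a pasted API key before storing it.
function normalizeApiKey(raw) {
  if (typeof raw !== 'string') throw new TypeError('API key must be a string');
  const key = raw.trim(); // drops leading and trailing spaces, tabs, newlines
  if (key.length === 0) throw new Error('API key is empty');
  if (/\s/.test(key)) throw new Error('API key contains whitespace inside the value');
  return key;
}
```

Running the pasted value through a check like this before entering it in the portal rules out the whitespace cause, which leaves a revoked key as the likely reason for a 401.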