Llm
Parameters
- model: Name of the LLM model.
- vendor: Name of the LLM vendor.
- actionHook: Webhook that will be called when the LLM session ends.
- auth: Object containing authentication credentials; format according to the model.
- connectOptions: Object containing information such as the URI to connect to.
- eventHook: Webhook that will be called when a requested LLM event happens (e.g., transcript).
- events: Array of event names listing the events requested (wildcards allowed).
- llmOptions: Object containing instructions for the LLM; format dependent on the LLM model.
- toolHook: Webhook that will be called when the LLM wants to call a function.
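As a rough illustration of how these parameters fit together, the sketch below assembles an llm verb as a plain object. The property names follow jambonz's usual verb conventions and should be checked against the parameter reference. The webhook paths, the model string, and the helper name buildLlmVerb are hypothetical placeholders, not part of jambonz.

```javascript
// Sketch only: webhook paths and the model name are hypothetical placeholders.
function buildLlmVerb({ vendor, model, auth, llmOptions = {}, events = [] }) {
  return {
    verb: 'llm',
    vendor,
    model,
    auth,
    actionHook: '/llm/done',   // called when the LLM session ends
    eventHook: '/llm/event',   // called for each requested event
    toolHook: '/llm/tool',     // called when the LLM wants to call a function
    events,                    // event names the application wants to receive
    llmOptions,                // vendor-specific instructions
  };
}

const verb = buildLlmVerb({
  vendor: 'assemblyai',
  model: 'example-model',      // placeholder, not a real model name
  auth: { api_key: process.env.ASSEMBLYAI_API_KEY },
  llmOptions: {},
});

console.log(JSON.stringify([verb], null, 2));
```

Keeping the verb construction in one helper makes it easy to swap vendors while leaving the webhook wiring unchanged.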
The following LLMs are currently supported:
- OpenAI Realtime API
- Deepgram Voice Agent
- Ultravox
- ElevenLabs
- Google Gemini Live API
- AssemblyAI Voice Agent
Google Gemini Live
Set vendor: 'google' and supply a Gemini Live model (for example models/gemini-2.0-flash-live-001 or models/gemini-3.1-flash-live-preview). llmOptions.setup is forwarded verbatim to Google’s BidiGenerateContentSetup message after the websocket connects.
Google-specific llmOptions fields
The BidiGenerateContentSetup object sent to Gemini right after the websocket connects. The model field is populated automatically from the verb’s model parameter. generationConfig.responseModalities is forced to audio.
Optional proactive greeting. When set, jambonz sends a text message to Gemini immediately after setup so the agent speaks first without waiting for the caller to speak. Accepts either a string or an object with a text field. The value is an instruction to the model, not the literal words — for example "Greet the caller warmly" rather than "Hello, how can I help?".
Implemented using realtimeInput.text so it works on both the 2.0 Live models and gemini-3.1-flash-live-preview. (On 3.1, clientContent is reserved for seeding history and does not trigger a model response, which is why realtimeInput.text is used.)
Enable session resumption. Pass {} to opt in, or { handle: "..." } to resume a previous session. Resumption handles are delivered back to the application via llm_event sessionResumptionUpdate messages.
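To make the handling of the setup object concrete, here is a minimal sketch of the documented behaviour: the model field is taken from the verb's model parameter and generationConfig.responseModalities is forced to audio. It is an illustration, not jambonz's actual implementation. The function name effectiveGeminiSetup is hypothetical, and the 'AUDIO' casing and the systemInstruction shape are assumptions based on Google's Live API.

```javascript
// Illustrative only: mirrors the documented behaviour, not jambonz's source.
function effectiveGeminiSetup(model, setup = {}) {
  return {
    ...setup,
    // model is populated from the verb's model parameter
    model,
    generationConfig: {
      ...(setup.generationConfig || {}),
      // audio is forced regardless of what the application supplied
      responseModalities: ['AUDIO'],
    },
  };
}

const setup = {
  systemInstruction: { parts: [{ text: 'You are a helpful phone agent.' }] },
  generationConfig: { temperature: 0.7, responseModalities: ['TEXT'] },
};

const sent = effectiveGeminiSetup('models/gemini-2.0-flash-live-001', setup);
console.log(JSON.stringify(sent, null, 2));
```

In this sketch the application's other generationConfig values, such as temperature, survive untouched; only the response modality is replaced.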
AssemblyAI Voice Agent
Set vendor: 'assemblyai' and supply your AssemblyAI API key via auth.api_key. llmOptions is the AssemblyAI Voice Agent session payload; it has no jambonz-specific fields and is passed through as supplied. jambonz places it in a {type: 'session.update', session: <llmOptions>} message and sends that as the first client message after the websocket connects to wss://agents.assemblyai.com/v1/ws. See the AssemblyAI Voice Agent product page and Voice Agent API docs for an overview.
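The shape of that first client message can be sketched as follows. This is an illustration of the envelope described above, not jambonz code. The function name buildSessionUpdate and the voice ID are hypothetical, and the audio format override is not modelled here.

```javascript
// Sketch of the first client message: llmOptions becomes the session payload.
function buildSessionUpdate(llmOptions = {}) {
  return { type: 'session.update', session: llmOptions };
}

// An empty llmOptions object still produces a valid envelope.
const minimal = buildSessionUpdate({});
console.log(JSON.stringify(minimal));
// prints {"type":"session.update","session":{}}

// 'example-voice-id' is a placeholder, not a real AssemblyAI voice.
const withVoice = buildSessionUpdate({ output: { voice: 'example-voice-id' } });
console.log(JSON.stringify(withVoice));
```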
The audio format is not configurable. AssemblyAI Voice Agent only accepts audio/pcm at 24 kHz, so jambonz always uses that format and overrides any session.input.format or session.output.format set by the application. jambonz resamples to/from the channel’s native rate automatically.
AssemblyAI-specific auth fields
Your AssemblyAI API key. Sent as Authorization: Bearer <api_key> on the WebSocket handshake.
AssemblyAI-specific llmOptions fields
AssemblyAI’s protocol requires a session.update message, but every field inside is optional — pass llmOptions: {} to start with all server defaults.
System prompt for the agent.
Initial greeting the agent will speak when the session opens.
Output audio configuration. Supports voice — see the AssemblyAI voices reference for available IDs. The format sub-field is overridden by jambonz.
Input audio configuration. Supports keyterms (array of biasing terms) and turn_detection (vad_threshold, min_silence, max_silence, interrupt_response). The format sub-field is overridden by jambonz.
Array of tool definitions. Each entry needs a name, description, and parameters (JSON Schema). AssemblyAI also requires type: "function" on each entry; jambonz fills it in if omitted.
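The type auto-fill for tool definitions can be pictured with the small sketch below. It illustrates the documented behaviour only. The function name withToolType and the get_weather tool are hypothetical.

```javascript
// Illustrative only: add type: 'function' when a tool entry omits it.
function withToolType(tools = []) {
  // The spread comes last, so an explicit type on the entry is kept.
  return tools.map((tool) => ({ type: 'function', ...tool }));
}

// Hypothetical tool definition with a JSON Schema for its parameters.
const tools = [
  {
    name: 'get_weather',
    description: 'Look up the current weather for a city.',
    parameters: {
      type: 'object',
      properties: { city: { type: 'string' } },
      required: ['city'],
    },
  },
];

const normalized = withToolType(tools);
console.log(JSON.stringify(normalized, null, 2));
```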
Tool calls
The agent invokes a tool by emitting a tool.call server event. jambonz routes it to the application’s toolHook with {name, args, tool_call_id}. The application replies via session.sendToolOutput(tool_call_id, {type: 'tool.result', tool_call_id, result}). The result should be a string; if the application passes a non-string value, jambonz JSON-stringifies it automatically.
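A toolHook handler following the flow above might look like this minimal sketch. It uses only the payload fields and the session.sendToolOutput call described in this section. The tool registry, the get_weather implementation, and the function name handleToolCall are hypothetical, and how the handler is registered for the toolHook depends on the application framework.

```javascript
// Hypothetical local tool implementations, keyed by tool name.
const toolImpls = {
  get_weather: async ({ city }) => ({ city, temp_c: 21 }),
};

// evt is the toolHook payload: {name, args, tool_call_id}.
async function handleToolCall(session, evt) {
  const { name, args, tool_call_id } = evt;
  let result;
  try {
    const impl = toolImpls[name];
    if (!impl) throw new Error(`unknown tool: ${name}`);
    result = await impl(args);
  } catch (err) {
    // Report failures to the agent instead of leaving the call unanswered.
    result = { error: err.message };
  }
  session.sendToolOutput(tool_call_id, {
    type: 'tool.result',
    tool_call_id,
    // Stringify here so the payload is already in the expected form.
    result: typeof result === 'string' ? result : JSON.stringify(result),
  });
}
```

Returning an error object for unknown or failing tools is a design choice in this sketch; it gives the agent something to react to rather than waiting on a reply that never arrives.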
Example Applications
Please check out the following example applications:
- for OpenAI
- for Deepgram
- for Ultravox
- for ElevenLabs
- for Google Gemini
- for AssemblyAI