AssemblyAI Voice Agent
The jambonz application referenced in this article can be found here.
This example jambonz application connects to the AssemblyAI Voice Agent API and illustrates how to build a voice-AI application using jambonz and AssemblyAI. It uses the Open-Meteo REST API so that the agent can answer callers’ questions about the weather in a specified location.
Authentication
You’ll need an AssemblyAI API key with Voice Agent access. Configure it in the portal as a jambonz application environment variable named ASSEMBLYAI_API_KEY (not via process.env).
The example application declares this variable via the SDK’s envVars option on createEndpoint, and reads it at call time from session.data.env_vars.ASSEMBLYAI_API_KEY. See Application Environment Variables in the Node.js SDK guide for the declaration pattern.
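As a minimal sketch of the read side only, the helper below pulls the key from the path named above and fails early when it is missing. The helper name getAssemblyAiKey and the error text are illustrative and not part of the SDK. The declaration side (envVars on createEndpoint) is not shown; follow the SDK guide for that.

```javascript
// Read the AssemblyAI key from the environment variables that jambonz
// attaches to the session (session.data.env_vars, per the text above).
// getAssemblyAiKey is a hypothetical helper name, not an SDK function.
function getAssemblyAiKey(session) {
  const envVars = (session && session.data && session.data.env_vars) || {};
  const apiKey = envVars.ASSEMBLYAI_API_KEY;
  if (typeof apiKey !== 'string' || apiKey.trim() === '') {
    // Fail before starting the llm verb rather than letting authentication
    // fail later with a less obvious error.
    throw new Error('ASSEMBLYAI_API_KEY is not set for this jambonz application');
  }
  return apiKey.trim();
}
```

Calling this at the start of the call handler surfaces a missing portal setting immediately, before any connection to AssemblyAI is attempted.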
Configuring the Assistant
AssemblyAI’s protocol requires a session.update message to be sent before the agent will accept audio. The jambonz llm verb sends this automatically using whatever you pass in llmOptions — system prompt, greeting, output voice, input biasing, turn-detection thresholds, and tools.
llmOptions is the AssemblyAI session.update.session payload, passed through verbatim.
Audio format is not configurable. AssemblyAI Voice Agent only supports audio/pcm at 24 kHz, which jambonz uses unconditionally. The input.format and output.format keys are overridden by jambonz before the message is sent to AssemblyAI. jambonz resamples to/from the channel’s native rate automatically.
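Because those two keys are silently replaced, it can help to flag them during development. The following is a rough sketch of such a check. The function name findOverriddenAudioKeys is hypothetical, and it assumes only that the keys live at input.format and output.format inside llmOptions, as described above.

```javascript
// Return the dotted paths of audio-format keys that jambonz will override.
// findOverriddenAudioKeys is an illustrative helper, not part of jambonz.
function findOverriddenAudioKeys(llmOptions) {
  const ignored = [];
  for (const side of ['input', 'output']) {
    const section = llmOptions && llmOptions[side];
    if (section && typeof section === 'object' && 'format' in section) {
      ignored.push(`${side}.format`);
    }
  }
  return ignored;
}

// Example: a format set on the input side is reported as ignored.
const ignoredKeys = findOverriddenAudioKeys({input: {format: 'anything'}});
if (ignoredKeys.length > 0) {
  console.warn(`These llmOptions keys are overridden by jambonz: ${ignoredKeys.join(', ')}`);
}
```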
For the full list of session fields (voices, keyterm biasing, turn-detection knobs, etc.), refer to the AssemblyAI events reference.
Tool calls
The example application registers a getWeather tool that the agent can invoke to answer weather questions. Each tool entry must use AssemblyAI’s flat format.
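The exact schema is defined by AssemblyAI, so treat the following as a sketch under one assumption: that "flat" means name, description, and parameters sit at the top level of each entry rather than under a nested function key, as in some other vendors' formats. The helper toFlatTool, the type value, and the getWeather parameter schema are all illustrative; check them against the AssemblyAI tool calling reference before relying on them.

```javascript
// Convert a nested tool definition ({type, function: {name, ...}}) into a
// flat one ({type, name, description, parameters}). Entries that are already
// flat pass through unchanged in shape.
// ASSUMPTION: this flat shape is a guess at AssemblyAI's format.
function toFlatTool(tool) {
  if (!tool || typeof tool !== 'object') {
    throw new TypeError('tool must be an object');
  }
  const source = tool.function && typeof tool.function === 'object' ? tool.function : tool;
  if (typeof source.name !== 'string' || source.name === '') {
    throw new Error('tool is missing a name');
  }
  return {
    type: tool.type || 'function',
    name: source.name,
    description: source.description || '',
    parameters: source.parameters || {type: 'object', properties: {}},
  };
}

// Hypothetical getWeather definition written in the nested style, then flattened.
const getWeatherTool = toFlatTool({
  type: 'function',
  function: {
    name: 'getWeather',
    description: 'Get the current weather for a location',
    parameters: {
      type: 'object',
      properties: {
        location: {type: 'string', description: 'City or place name'},
      },
      required: ['location'],
    },
  },
});
```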
When the agent decides to invoke a tool, jambonz fires a tool.call event and routes it to the configured toolHook. The handler replies via session.sendToolOutput(tool_call_id, {type: 'tool.result', tool_call_id, result}). The result field should be a string the model can read — jambonz JSON-stringifies non-string values automatically.
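A handler along these lines might look like the sketch below. The sendToolOutput call and payload shape come from the description above; the event field names (name, args, tool_call_id), the handleToolCall and buildToolResult helpers, and the injected lookupWeather function are assumptions made for illustration.

```javascript
// Build the payload passed as the second argument to session.sendToolOutput.
// Stringifying here is optional (jambonz does it for non-string values), but
// doing it explicitly makes the text the model receives predictable.
function buildToolResult(toolCallId, result) {
  const text = typeof result === 'string' ? result : JSON.stringify(result);
  return {type: 'tool.result', tool_call_id: toolCallId, result: text};
}

// Hypothetical toolHook handler. ASSUMPTIONS: the event carries name, args and
// tool_call_id, and lookupWeather is an app-supplied async function that
// queries the weather service.
async function handleToolCall(session, evt, lookupWeather) {
  const {name, args, tool_call_id} = evt;
  let result;
  try {
    if (name !== 'getWeather') throw new Error(`unknown tool: ${name}`);
    result = await lookupWeather(args);
  } catch (err) {
    // Report failures to the model as readable text so the call is always answered.
    result = `Tool call failed: ${err.message}`;
  }
  session.sendToolOutput(tool_call_id, buildToolResult(tool_call_id, result));
}
```

Passing lookupWeather in as an argument keeps the handler easy to test with a stub in place of the real HTTP request.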
See AssemblyAI tool calling for the underlying protocol.
actionHook properties
Like many jambonz verbs, the llm verb sends an actionHook with a final status when the verb completes. The payload includes a completion_reason property indicating why the session ended. Possible values are:
- normal conversation end
- connection failure
- disconnect from remote end
- server error
- client error calling function
- client error calling mcp function
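One way an application might act on this value is sketched below. The six strings are the values listed above; the grouping into ok, app_error, and service_error outcomes is an illustrative choice for this sketch and not something jambonz defines, and summarizeCompletion is a hypothetical helper name.

```javascript
// The completion_reason values listed in the documentation above.
const COMPLETION_REASONS = new Set([
  'normal conversation end',
  'connection failure',
  'disconnect from remote end',
  'server error',
  'client error calling function',
  'client error calling mcp function',
]);

// Map an actionHook payload to a coarse outcome. The outcome labels are
// this sketch's own convention, not part of the jambonz payload.
function summarizeCompletion(payload) {
  const reason = payload && payload.completion_reason;
  if (!COMPLETION_REASONS.has(reason)) return {reason, outcome: 'unknown'};
  if (reason === 'normal conversation end') return {reason, outcome: 'ok'};
  if (reason.startsWith('client error')) return {reason, outcome: 'app_error'};
  // connection failure, disconnect from remote end, server error
  return {reason, outcome: 'service_error'};
}
```

An actionHook handler could log the outcome and, for example, play a fallback message or transfer the call when the outcome is not ok.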