AssemblyAI Voice Agent

Using jambonz to connect custom telephony to AssemblyAI's Voice Agent API
The jambonz application referenced in this article can be found here.

This example jambonz application connects to the AssemblyAI Voice Agent API and shows how to build a voice-AI application with jambonz and AssemblyAI. It uses the Open-Meteo REST API so the agent can answer callers’ questions about the weather in a specified location.

Authentication

You’ll need an AssemblyAI API key with Voice Agent access. Configure it as a jambonz application environment variable in the portal (not via process.env):

| Variable | Required | Description |
| --- | --- | --- |
| ASSEMBLYAI_API_KEY | yes | AssemblyAI API key (sent as Authorization: Bearer … on the voice-agent websocket). Mark as obscured. |

The example application declares this variable via the SDK’s envVars option on createEndpoint, and reads it at call time from session.data.env_vars.ASSEMBLYAI_API_KEY. See Application Environment Variables in the Node.js SDK guide for the declaration pattern.
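As a minimal sketch of the read side, the helper below pulls the key out of the per-call session data and fails early if it is missing. The helper name getAssemblyAiKey and the stand-in session object are hypothetical and exist only for illustration; the only detail taken from this article is the session.data.env_vars.ASSEMBLYAI_API_KEY path.

```javascript
// Hypothetical helper: read the API key from the per-call session data.
// Assumes jambonz populates session.data.env_vars as described above.
function getAssemblyAiKey(session) {
  const envVars = (session && session.data && session.data.env_vars) || {};
  const apiKey = envVars.ASSEMBLYAI_API_KEY;
  if (typeof apiKey !== 'string' || apiKey.length === 0) {
    throw new Error('ASSEMBLYAI_API_KEY is not set for this application');
  }
  return apiKey;
}

// Stand-in for a real session object, for illustration only.
const fakeSession = { data: { env_vars: { ASSEMBLYAI_API_KEY: 'example-key' } } };
const apiKey = getAssemblyAiKey(fakeSession);
```

Failing early here tends to give a clearer error than letting the websocket connection to AssemblyAI be rejected later for a missing credential.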

Configuring the Assistant

AssemblyAI’s protocol requires a session.update message to be sent before the agent will accept audio. The jambonz llm verb sends this automatically using whatever you pass in llmOptions — system prompt, greeting, output voice, input biasing, turn-detection thresholds, and tools.

llmOptions is the AssemblyAI session.update.session payload passed through verbatim:

llmOptions: {
  system_prompt: 'You are a helpful voice agent. Help callers get the weather for a city they ask about.',
  greeting: 'Hello, how can I help you today?',
  output: { voice: 'ivy' },
  input: {
    keyterms: ['weather', 'temperature', 'celsius', 'fahrenheit'],
    turn_detection: {
      vad_threshold: 0.5,
      min_silence: 1000,
      max_silence: 3000,
      interrupt_response: true
    }
  },
  tools: [ /* ... */ ]
}

Audio format is not configurable. AssemblyAI Voice Agent only supports audio/pcm at 24 kHz, which jambonz uses unconditionally. The input.format and output.format keys are overridden by jambonz before the message is sent to AssemblyAI. jambonz resamples to/from the channel’s native rate automatically.
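To make the pass-through concrete, the sketch below approximates the shape of the session.update message built from llmOptions. It is an illustration only, not the jambonz implementation: the function name toSessionUpdate is hypothetical, the top-level type field is assumed to follow the same convention as the tool.result message, and the format values that jambonz injects are left out because their exact shape is not given in this article.

```javascript
// Illustrative only: approximates the message shape described above.
// This is not the actual jambonz code.
function toSessionUpdate(llmOptions) {
  // Copy so the caller's object is left untouched.
  const session = {
    ...llmOptions,
    input: { ...(llmOptions.input || {}) },
    output: { ...(llmOptions.output || {}) },
  };
  // Caller-supplied format keys have no effect; jambonz sets its own values.
  // They are dropped here rather than guessed at.
  delete session.input.format;
  delete session.output.format;
  return { type: 'session.update', session };
}

const message = toSessionUpdate({
  system_prompt: 'You are a helpful voice agent.',
  greeting: 'Hello, how can I help you today?',
  output: { voice: 'ivy', format: 'ignored' },
  input: { keyterms: ['weather'] },
});
console.log(JSON.stringify(message, null, 2));
```

In practice this means there is no need to set input.format or output.format in llmOptions at all.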

For the full list of session fields (voices, keyterm biasing, turn-detection knobs, etc.), refer to the AssemblyAI events reference.

Tool calls

The example application registers a getWeather tool that the agent can invoke to answer weather questions. Each tool entry must use AssemblyAI’s flat format:

{
  type: 'function',
  name: 'getWeather',
  description: 'Get current weather for a given city',
  parameters: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name' },
      scale: { type: 'string', enum: ['celsius', 'fahrenheit'] }
    },
    required: ['location']
  }
}

When the agent decides to invoke a tool, jambonz fires a tool.call event and routes it to the configured toolHook. The handler replies via session.sendToolOutput(tool_call_id, {type: 'tool.result', tool_call_id, result}). The result field should be a string the model can read — jambonz JSON-stringifies non-string values automatically.
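The flow above could be handled along these lines. This is a sketch under stated assumptions rather than the example application's actual code: the event field names (name, args, tool_call_id), the assumption that args arrives as an already-parsed object, the function names, and the injectable fetchImpl parameter are all hypothetical. The Open-Meteo geocoding and forecast endpoints are the public ones; the global fetch default requires Node.js 18 or later.

```javascript
// Sketch of a toolHook handler for getWeather.
// Event field names and the fetchImpl parameter are assumptions.
const GEOCODE_URL = 'https://geocoding-api.open-meteo.com/v1/search';
const FORECAST_URL = 'https://api.open-meteo.com/v1/forecast';

async function getWeather({ location, scale = 'celsius' }, fetchImpl = fetch) {
  // Resolve the city name to coordinates.
  const geoRes = await fetchImpl(
    `${GEOCODE_URL}?name=${encodeURIComponent(location)}&count=1`);
  const geo = await geoRes.json();
  if (!geo.results || geo.results.length === 0) {
    return `No location found for ${location}`;
  }
  const { latitude, longitude, name } = geo.results[0];

  // Fetch the current temperature in the requested unit.
  const wxRes = await fetchImpl(
    `${FORECAST_URL}?latitude=${latitude}&longitude=${longitude}` +
    `&current=temperature_2m&temperature_unit=${scale}`);
  const wx = await wxRes.json();
  return `Current temperature in ${name}: ${wx.current.temperature_2m} degrees ${scale}`;
}

async function onToolCall(session, evt, fetchImpl = fetch) {
  const { name, args, tool_call_id } = evt;
  let result;
  try {
    result = name === 'getWeather'
      ? await getWeather(args, fetchImpl)
      : `Unknown tool: ${name}`;
  } catch (err) {
    // Always answer the tool call so the agent is not left waiting.
    result = `Weather lookup failed: ${err.message}`;
  }
  // Reply in the format described above; result is a plain string.
  session.sendToolOutput(tool_call_id, { type: 'tool.result', tool_call_id, result });
}
```

Returning a short sentence instead of raw JSON keeps the result easy for the model to read aloud, and catching errors ensures every tool call receives a reply.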

See AssemblyAI tool calling for the underlying protocol.

actionHook properties

Like many jambonz verbs, the llm verb sends an actionHook with a final status when the verb completes. The payload includes a completion_reason property indicating why the session ended. Possible values are:

  • normal conversation end
  • connection failure
  • disconnect from remote end
  • server error
  • client error calling function
  • client error calling mcp function
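An actionHook handler might branch on completion_reason roughly as follows. This is a sketch that assumes the values arrive verbatim as the strings listed above; the function name classifyCompletion and the grouping of reasons into failures and non-failures are illustrative choices, not part of jambonz.

```javascript
// completion_reason values from the list above, grouped for illustration.
// Which reasons count as failures is a judgment call made for this sketch.
const NORMAL_REASONS = new Set([
  'normal conversation end',
  'disconnect from remote end',
]);
const FAILURE_REASONS = new Set([
  'connection failure',
  'server error',
  'client error calling function',
  'client error calling mcp function',
]);

function classifyCompletion(payload) {
  const reason = payload && payload.completion_reason;
  return {
    reason,
    // True only for reasons that appear in the list above.
    known: NORMAL_REASONS.has(reason) || FAILURE_REASONS.has(reason),
    failed: FAILURE_REASONS.has(reason),
  };
}

const outcome = classifyCompletion({ completion_reason: 'server error' });
if (outcome.failed) {
  console.error(`llm session ended abnormally: ${outcome.reason}`);
}
```

Flagging unknown values separately lets the application log a reason it does not recognize instead of silently treating it as success.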

Resources

  • AssemblyAI Voice Agent API — product page
  • AssemblyAI Voice Agent API — documentation