Node.js SDK
Build jambonz voice applications with @jambonz/sdk
The @jambonz/sdk package is the recommended way to build jambonz voice applications in Node.js/TypeScript. It supports both webhook and WebSocket transports, and includes a REST API client for mid-call control and chainable verb methods for building call flows.
Source code: github.com/jambonz/node-sdk
API reference: jambonz.github.io/node-sdk
This SDK replaces the older @jambonz/node-client (webhook) and @jambonz/node-client-ws (WebSocket) packages, which are now deprecated. The new SDK provides a unified package with a consistent API across both transports.
Which transport should I use?
The WebSocket transport is recommended for most applications. It provides a persistent bidirectional connection that enables TTS streaming, mid-call updates, inject commands, real-time event handling, and voice AI features like the agent and llm verbs. The webhook transport is simpler but limited: use it for straightforward call-routing scenarios (e.g., dial, basic IVR menus) where you don’t need real-time interaction.
Installation
Install the package from npm: npm install @jambonz/sdk
Imports
The SDK provides three subpath exports:
Webhook Transport
Use WebhookResponse to build verb arrays in response to HTTP webhooks. Methods are chainable and the response is serialized to JSON.
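To make the "chainable methods, serialized to JSON" idea concrete, here is a minimal self-contained sketch. VerbBuilder is a hypothetical stand-in written for this illustration, not the SDK's WebhookResponse. It only assumes the jambonz convention that a call flow is a JSON array of objects with a verb property.

```typescript
// Stand-in for illustration only; this is NOT the SDK's WebhookResponse.
type Verb = { verb: string } & Record<string, unknown>;

class VerbBuilder {
  private readonly verbs: Verb[] = [];

  // Each verb method appends one verb object and returns `this` for chaining.
  say(opts: { text: string }): this {
    this.verbs.push({ verb: 'say', ...opts });
    return this;
  }

  pause(opts: { length: number }): this {
    this.verbs.push({ verb: 'pause', ...opts });
    return this;
  }

  hangup(): this {
    this.verbs.push({ verb: 'hangup' });
    return this;
  }

  // JSON.stringify calls toJSON, so the builder serializes to a plain verb array.
  toJSON(): Verb[] {
    return [...this.verbs];
  }
}

// Build a short call flow by chaining verbs in the order they should run.
const callFlow = new VerbBuilder()
  .say({ text: 'Thanks for calling.' })
  .pause({ length: 1 })
  .hangup();

// This string is what an HTTP webhook handler would return as its response body.
const payload = JSON.stringify(callFlow);
```

Returning this from every verb method is what makes the chain possible; the real class presumably follows the same idea, but check the API reference for its exact methods and options.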
WebSocket Transport
Use createEndpoint to build real-time WebSocket applications. This is the recommended transport for voice AI agents, as it enables bidirectional communication, event streaming, and mid-call updates.
.send() vs .reply()
.send(): Use once for the initial verb array in response to session:new.
.reply(): Use for all subsequent responses to actionHook events.
This distinction is important: .send() starts the call flow, while .reply() continues it in response to events.
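The ordering rule can be sketched with a small stand-in that only records what would be sent. SketchSession, its outbox, and the exceptions it throws are assumptions made for this illustration; the SDK's Session may handle misuse differently. The /menu hook path is likewise hypothetical.

```typescript
// Stand-in for illustration only; NOT the SDK's Session class.
type OutboundMessage = { via: 'send' | 'reply'; verbs: Record<string, unknown>[] };

class SketchSession {
  readonly outbox: OutboundMessage[] = [];
  private queued: Record<string, unknown>[] = [];
  private started = false;

  say(opts: { text: string }): this {
    this.queued.push({ verb: 'say', ...opts });
    return this;
  }

  gather(opts: { actionHook: string; input: string[]; numDigits?: number }): this {
    this.queued.push({ verb: 'gather', ...opts });
    return this;
  }

  hangup(): this {
    this.queued.push({ verb: 'hangup' });
    return this;
  }

  // Initial verb array: allowed exactly once, when the session starts.
  send(): void {
    if (this.started) throw new Error('send() was already used; call reply() instead');
    this.started = true;
    this.flush('send');
  }

  // Every later verb array answers an actionHook event.
  reply(): void {
    if (!this.started) throw new Error('reply() called before send()');
    this.flush('reply');
  }

  private flush(via: 'send' | 'reply'): void {
    this.outbox.push({ via, verbs: this.queued });
    this.queued = [];
  }
}

const session = new SketchSession();

// 1. On session:new, start the call flow with send().
session
  .gather({ actionHook: '/menu', input: ['digits'], numDigits: 1 })
  .send();

// 2. When the '/menu' actionHook event arrives, continue the flow with reply().
session.say({ text: 'Connecting you now.' }).hangup().reply();

// Records the order in which the two verb arrays went out.
const order = session.outbox.map((m) => m.via).join(',');
```

The guard in send() exists only to make the rule visible in the sketch: one send() per session, then reply() for each actionHook event.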
Application Environment Variables
You can declare environment variables that are configurable in the jambonz portal UI:
Audio Streams
When using the listen verb, makeService.audio() lets you handle both call control and audio on the same server:
The AudioStream object provides sendAudio(), playAudio(), killAudio(), disconnect(), sendMark(), and clearMarks() methods.
REST API Client
Use JambonzClient for outbound calls and mid-call control:
Verb Methods
Both WebhookResponse and the WebSocket Session support the same chainable verb methods:
.say() .play() .gather() .dial() .llm() .agent() .conference() .enqueue() .dequeue() .hangup() .pause() .redirect() .config() .tag() .dtmf() .listen() .transcribe() .message() .dub() .alert() .answer() .leave() .sipDecline() .sipRefer() .sipRequest()
All methods accept the same options as the corresponding verb JSON schemas and are chainable.
TTS Token Streaming
The WebSocket Session provides methods for incremental TTS token streaming, enabling the lowest-latency voice AI experiences. This is used when you’re streaming tokens from an LLM and want them spoken as they arrive.
The isTtsPaused property indicates whether TTS streaming is paused due to backpressure.
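One way to respect backpressure is to hold tokens locally while isTtsPaused is true and forward them once streaming resumes. The sketch below assumes a hypothetical TtsSink interface: isTtsPaused comes from the text above, while sendTokens is a placeholder name for whichever Session method forwards tokens. TokenBuffer is likewise a helper invented for this example.

```typescript
// Hypothetical shape: only isTtsPaused is taken from the documentation above.
interface TtsSink {
  isTtsPaused: boolean;
  sendTokens(text: string): void; // placeholder name, not a confirmed SDK method
}

// Buffers LLM tokens while TTS streaming is paused, then forwards them in order.
class TokenBuffer {
  private pending: string[] = [];
  private readonly sink: TtsSink;

  constructor(sink: TtsSink) {
    this.sink = sink;
  }

  // Queue a token and try to forward it right away.
  push(token: string): void {
    this.pending.push(token);
    this.drain();
  }

  // Call again when the session signals that streaming has resumed.
  drain(): void {
    if (this.sink.isTtsPaused || this.pending.length === 0) return;
    this.sink.sendTokens(this.pending.join(''));
    this.pending = [];
  }
}

// Demo with a fake sink that records what would have been spoken.
const spoken: string[] = [];
const sink: TtsSink = {
  isTtsPaused: false,
  sendTokens: (text) => {
    spoken.push(text);
  },
};

const buffer = new TokenBuffer(sink);
buffer.push('Hello');      // forwarded immediately
sink.isTtsPaused = true;   // backpressure begins
buffer.push(', ');         // held
buffer.push('world');      // held
sink.isTtsPaused = false;  // backpressure ends
buffer.drain();            // forwards the held tokens as one chunk
```

In a real application, drain() would be wired to whichever resume notification the Session emits; that event name is not given here, so the demo calls it by hand.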
TTS Streaming Events
LLM and Agent Updates
The Session provides methods for interacting with active LLM and agent conversations.
Tool Output
When the LLM requests a tool/function call (via the toolHook), respond with the result:
Agent Updates
Send mid-conversation updates to an active agent:
LLM Updates
Send updates to an active llm verb:
Inject Commands
Inject commands execute immediately on an active call without affecting the verb stack. They are useful for mid-call control actions like muting, recording, or whispering to one party on a bridged call.
The optional callSid parameter on inject methods targets a specific call leg on a bridged call. Omit it to target the current call.
Session Properties
Session Events
AI-Assisted Development
The @jambonz/mcp-schema-server package is an MCP server that gives AI coding assistants deep knowledge of jambonz APIs, verb schemas, and SDK patterns. Set it up so your AI can generate correct jambonz code automatically.
Remote server (simplest):
Local via npx:
For Cursor, VS Code, and other editors, see the setup instructions in the repository.
A complementary Agent Skill provides procedural knowledge about jambonz patterns and best practices:
Examples
See the examples directory for runnable demos:
For agent verb examples, see the agent examples.