Voice Agents

The agent verb orchestrates a complete voice AI agent by wiring together three separate components — STT, LLM, and TTS — with integrated turn detection. Unlike the llm verb (which connects to speech-to-speech APIs where a single vendor handles everything), the agent verb lets you mix and match: for example, Deepgram for STT, Anthropic for the LLM, and Cartesia for TTS.

The agent verb manages the full conversational turn cycle:

User speaks → STT produces a transcript
Turn detection decides the user is done speaking
Transcript is sent to the LLM
LLM response tokens stream to TTS
TTS audio plays back to the caller
If the user barges in, TTS stops and a new turn begins

Looking for runnable examples? The jambonz/v10-examples repository has working demos for every feature described in this guide — basic usage, tool calling, MCP servers, CRM injection, persona switching, supervisor overrides, and more. Clone it and run any example end-to-end in minutes.

Basic Setup

The llm property is the only required field. STT and TTS will use your application’s default speech credentials if not specified.

Below is a minimal voice agent using the Node.js SDK and the application defaults for STT and TTS.

const http = require('node:http');
const { createEndpoint } = require('@jambonz/sdk/websocket');

const envVars = {
  OPENAI_MODEL: {
    type: 'string',
    description: 'OpenAI model to use',
    default: 'gpt-4.1-mini',
  },
  SYSTEM_PROMPT: {
    type: 'string',
    description: 'System prompt for the voice agent',
    uiHint: 'textarea',
    default: [
      'You are a helpful voice AI assistant.',
      'The user is interacting with you via voice,',
      'even if you perceive the conversation as text.',
      'You eagerly assist users with their questions',
      'by providing information from your extensive knowledge.',
      'Your responses are concise, to the point,',
      'and use natural spoken English with proper punctuation.',
      'Never use markdown, bullet points, numbered lists,',
      'emojis, asterisks, or any special formatting.',
      'You are curious, friendly, and have a sense of humor.',
      'When the conversation begins,',
      'greet the user in a helpful and friendly manner.',
    ].join(' '),
  },
};

const port = parseInt(process.env.PORT || '3000', 10);
const server = http.createServer();
const makeService = createEndpoint({ server, port, envVars });
const svc = makeService({ path: '/' });

svc.on('session:new', (session) => {
  console.log('session:new received', JSON.stringify({
    call_sid: session.data.call_sid,
    direction: session.data.direction,
    from: session.data.from,
    to: session.data.to,
    env_vars: session.data.env_vars,
  }, null, 2));

  try {
    const model = session.data.env_vars?.OPENAI_MODEL || envVars.OPENAI_MODEL.default;
    const systemPrompt = session.data.env_vars?.SYSTEM_PROMPT || envVars.SYSTEM_PROMPT.default;
    console.log('using model:', model);

    session.on('/agent-event', (evt) => {
      console.log('agent-event received:', evt.type);
      if (evt.type === 'turn_end') {
        const { transcript, response, interrupted, latency } = evt;
        console.log('turn_end', JSON.stringify({ transcript, response, interrupted, latency }, null, 2));
      }
    });

    session.on('/agent-complete', () => {
      console.log('agent-complete received, sending hangup');
      session.hangup().reply();
    });

    console.log('sending agent verb...');
    session
      .agent({
        llm: {
          vendor: 'openai',
          model,
          llmOptions: {
            messages: [{ role: 'system', content: systemPrompt }],
          },
        },
        turnDetection: 'krisp',
        earlyGeneration: true,
        bargeIn: { enable: true },
        eventHook: '/agent-event',
        actionHook: '/agent-complete',
      })
      .send();
    console.log('agent verb sent');
  } catch (err) {
    console.error('Error in session:new handler:', err);
  }
});

svc.on('error', (err) => {
  console.error('service error:', err);
});

console.log(`voice agent listening on port ${port}`);

Supported LLM Vendors

The jambonz portal lets you “bring your own LLM” in a similar fashion to speech credentials. Configure credentials in Account → LLM Services, then reference the vendor in the agent verb.

11 LLM vendors are supported: Anthropic, AWS Bedrock, Azure OpenAI, Baseten, DeepSeek, Google AI Studio, Groq, HuggingFace, OpenAI, and Vertex AI (Gemini and Partner Models). See Bring Your Own LLM for per-vendor setup, model recommendations, and known issues, or the agent verb reference for the full vendor id + example-models table.

The agent verb normalizes message formats and tool schemas across vendors automatically. You write tools in OpenAI format and the agent verb adapts them for each vendor.

Authentication

By default, the agent verb uses the LLM credentials configured in the jambonz portal (Account → LLM Services). You can also pass credentials directly:

llm: {
  vendor: 'openai',
  model: 'gpt-4.1-mini',
  auth: { apiKey: process.env.OPENAI_API_KEY },
  // ...
}

For AWS Bedrock, pass accessKeyId, secretAccessKey, and region in the auth object.
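
As a rough sketch of the Bedrock case, the helper below assembles an llm block from environment variables. Only the three auth field names come from this guide. The vendor id 'bedrock', the placeholder model id, the BEDROCK_MODEL variable, and the helper name are assumptions for illustration, so check the agent verb reference for the exact vendor id and model names.

```javascript
// Hypothetical helper: builds the llm block for AWS Bedrock from env vars.
// The vendor id and model id below are assumptions; only the auth field
// names (accessKeyId, secretAccessKey, region) come from this guide.
function buildBedrockLlm(env, systemPrompt) {
  const required = ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_REGION'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    vendor: 'bedrock', // assumed vendor id
    model: env.BEDROCK_MODEL || 'example-model-id', // placeholder, not a real model id
    auth: {
      accessKeyId: env.AWS_ACCESS_KEY_ID,
      secretAccessKey: env.AWS_SECRET_ACCESS_KEY,
      region: env.AWS_REGION,
    },
    llmOptions: {
      messages: [{ role: 'system', content: systemPrompt }],
    },
  };
}

// Usage (inside a session handler):
// session.agent({ llm: buildBedrockLlm(process.env, 'You are a helpful assistant.') }).send();
```

Failing fast on missing variables keeps a misconfigured deployment from surfacing as an authentication error in the middle of a call.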

Turn Detection

The turnDetection property controls how the agent verb decides the user has finished speaking. We currently support only two modes — STT-based detection and Krisp’s turn detection model.

Self-Hosted Licensing

Krisp turn detection requires a separate Krisp API license on self-hosted deployments. See Krisp configuration for details. This does not apply to jambonz.cloud customers.

STT-based detection (default)

{ "turnDetection": "stt" }

Uses the STT vendor’s native end-of-utterance signal. For most vendors this is silence-based. Some vendors have smarter built-in turn detection:

deepgramflux — Acoustic + semantic turn detection (Deepgram’s “Flux” model)
assemblyai — Native turn-taking with the u3-rt-pro model
speechmatics — Built-in turn detection

These vendors always use their native detection regardless of the turnDetection setting.

Krisp turn detection

{
  "turnDetection": {
    "mode": "krisp",
    "threshold": 0.5
  }
}

Uses the Krisp acoustic end-of-turn model, which analyzes speech patterns rather than just silence. Good for natural conversation where users pause mid-thought.

threshold — Confidence threshold from 0.0 to 1.0. Lower values trigger earlier turn transitions (more aggressive). Default: 0.5.
model — Optional Krisp model name override.

The shorthand "turnDetection": "krisp" uses default settings.
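
One way to keep the shorthand and the detailed form consistent in application code is a small builder that validates the threshold range before the verb is sent. The helper name and its validation behavior are illustrative assumptions, not part of the SDK.

```javascript
// Hypothetical helper: returns a turnDetection value for the agent verb.
// With no arguments it returns the shorthand string, so Krisp defaults apply.
function krispTurnDetection(threshold, model) {
  if (threshold === undefined && model === undefined) return 'krisp';

  const cfg = { mode: 'krisp' };
  if (threshold !== undefined) {
    const valid = typeof threshold === 'number' && threshold >= 0 && threshold <= 1;
    if (!valid) {
      throw new RangeError('threshold must be a number between 0.0 and 1.0');
    }
    cfg.threshold = threshold;
  }
  if (model !== undefined) cfg.model = model; // optional Krisp model name override
  return cfg;
}

// Usage:
// session.agent({ turnDetection: krispTurnDetection(0.3), /* ... */ }).send();
```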

Krisp Licensing for Self-Hosted Systems

Krisp turn detection requires a Krisp API license key on self-hosted jambonz deployments. This license is not included with the jambonz software license and must be obtained separately from Krisp.

Contact support@jambonz.org for information on obtaining a Krisp license for your self-hosted deployment.

This requirement does not apply to jambonz.cloud customers — Krisp features are included with all jambonz.cloud plans.

Early Generation (Speculative Preflight)

Early generation speculatively sends the transcript to the LLM before end-of-turn is confirmed. If the final transcript still matches the speculative one when the turn ends, the buffered tokens are released immediately, so the LLM's response time overlaps with end-of-turn detection instead of adding to it. If the user keeps talking and the transcript changes, the speculative response is discarded.

There are two ways early generation is triggered:

Krisp turn detection — Set earlyGeneration: true to opt in. Krisp emits an early signal that triggers the speculative LLM prompt before final end-of-turn confirmation.
Deepgram Flux — Early generation happens automatically. Flux emits a native EagerEndOfTurn event that triggers preflight regardless of the earlyGeneration setting.

For other STT vendors with native turn-taking (assemblyai, speechmatics), early generation is not available.

session.agent({
  turnDetection: 'krisp',
  earlyGeneration: true,
  // ...
}).send();

The turn_end event includes preflight metrics so you can track hit rates:

hit — speculative transcript matched final, tokens released immediately
miss — transcript changed, speculative response discarded
pending — preflight was still in progress when the turn ended
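
To track the hit rate over a call, the three outcomes can be tallied from turn_end events. This is a minimal sketch that assumes the result arrives at evt.latency.preflight.result, as in the turn_end payload shown later in this guide. The createPreflightStats name is hypothetical.

```javascript
// Tallies preflight outcomes (hit / miss / pending) from turn_end events.
function createPreflightStats() {
  const counts = { hit: 0, miss: 0, pending: 0 };
  return {
    // Feed every eventHook event in; anything that is not a turn_end
    // with a recognized preflight result is ignored.
    record(evt) {
      if (!evt || evt.type !== 'turn_end') return;
      const result = evt.latency && evt.latency.preflight && evt.latency.preflight.result;
      if (typeof result === 'string' && Object.prototype.hasOwnProperty.call(counts, result)) {
        counts[result] += 1;
      }
    },
    // Returns the counts plus the hit rate (null until a preflight is seen).
    summary() {
      const total = counts.hit + counts.miss + counts.pending;
      return { ...counts, hitRate: total === 0 ? null : counts.hit / total };
    },
  };
}

// Usage:
// const stats = createPreflightStats();
// session.on('/agent-event', (evt) => stats.record(evt));
// session.on('/agent-complete', () => console.log(stats.summary()));
```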

Barge-in Configuration

By default, users can interrupt the assistant while it’s speaking. The bargeIn object controls this behavior:

{
  "bargeIn": {
    "enable": true,
    "minSpeechDuration": 0.5,
    "sticky": false
  }
}

enable — Allow interruptions. Default: true.
minSpeechDuration — Seconds of speech required to confirm an interruption. Prevents brief noises (coughs, background sounds) from cutting off the assistant. Default: 0.5.
sticky — If true, once the user interrupts, the assistant does not resume speaking the interrupted response. Default: false.

Tuning tips:

Lower minSpeechDuration (e.g., 0.2) for more responsive barge-in
Higher minSpeechDuration (e.g., 1.0) for noisy environments where false triggers are common
Set enable: false for scenarios where the assistant must complete its message (e.g., legal disclaimers)
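
The tuning tips above could be captured as named presets so that each application picks a profile rather than a raw number. The preset names and the helper are hypothetical. The values are the ones suggested in the tips.

```javascript
// Hypothetical presets that map the tuning tips to bargeIn objects.
const BARGE_IN_PRESETS = {
  responsive: { enable: true, minSpeechDuration: 0.2 }, // quicker interruptions
  standard: { enable: true, minSpeechDuration: 0.5 },   // documented default
  noisy: { enable: true, minSpeechDuration: 1.0 },      // fewer false triggers
  uninterruptible: { enable: false },                   // e.g. legal disclaimers
};

// Returns a copy of the preset so callers can adjust it without
// modifying the shared table.
function bargeInFor(preset) {
  if (!Object.prototype.hasOwnProperty.call(BARGE_IN_PRESETS, preset)) {
    throw new Error(`Unknown barge-in preset: ${preset}`);
  }
  return { ...BARGE_IN_PRESETS[preset] };
}

// Usage:
// session.agent({ bargeIn: bargeInFor('noisy'), /* ... */ }).send();
```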

No Response Timeout

The noResponseTimeout property handles the case where the user goes silent after the assistant finishes speaking.

{ "noResponseTimeout": 12 }

When the timeout fires, the LLM is prompted with a system cue: “The user has not responded. Briefly check if they are still there or ask if they need help.” This generates a natural follow-up rather than leaving dead air.

Defaults to 12 seconds. Set to 0 to disable. The timer is cancelled if the user starts speaking.

This also covers the “missed speech” case: when VAD detects speech but STT returns no transcript, the no-response timer handles the re-prompt.
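
The timer behavior described above can be pictured with a small model: the timer starts when the assistant finishes speaking, is cancelled when the user starts speaking, and never starts when the timeout is 0. This is an illustrative sketch only, not the jambonz implementation, and the class and method names are invented for the example.

```javascript
// Illustrative model of the no-response timer; not jambonz source code.
class NoResponseTimer {
  constructor(seconds, onTimeout) {
    this.ms = seconds * 1000;
    this.onTimeout = onTimeout; // e.g. re-prompt the LLM with the system cue
    this.handle = null;
  }

  // Call when the assistant finishes speaking.
  assistantFinished() {
    this.cancel();
    if (this.ms <= 0) return; // a timeout of 0 disables the timer
    this.handle = setTimeout(() => {
      this.handle = null;
      this.onTimeout();
    }, this.ms);
  }

  // Call when the user starts speaking.
  userStartedSpeaking() {
    this.cancel();
  }

  cancel() {
    if (this.handle !== null) {
      clearTimeout(this.handle);
      this.handle = null;
    }
  }
}
```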

Greeting

By default (greeting: true), the agent verb prompts the LLM to generate an initial greeting before the user speaks. Set greeting: false if you want the agent to wait silently for the user to speak first.

Tool/Function Calling

The agent verb supports LLM tool/function calling, allowing your agent to perform actions like looking up data, calling APIs, or transferring calls. There are two ways to provide tools:

Roll your own — define a JSON schema, list it in llmOptions.tools, and handle the tool call yourself in a toolHook handler. Use this for tools specific to your application (CRM lookups, business logic, proprietary APIs).
Use pre-built tools from @jambonz/tools — drop in ready-made tools (web search, weather, Wikipedia, calculator, datetime) without writing schemas or handlers. Use this for common utility tools.

You can mix both approaches in the same agent — they share the same toolHook path.

Rolling your own tools

Use this approach for tools that are specific to your application. You supply the schema and handle the execution yourself.

Defining the tool schema

Define tools in llm.llmOptions.tools using the standard function-calling format:

const weatherTool = {
  name: 'get_weather',
  description: 'Get the current temperature and wind speed for a location.',
  parameters: {
    type: 'object',
    properties: {
      location: { type: 'string', description: 'City name, e.g. "Portland"' },
      scale: { type: 'string', enum: ['celsius', 'fahrenheit'] },
    },
    required: ['location'],
  },
};

session.agent({
  llm: {
    vendor: 'openai',
    model: 'gpt-4.1-mini',
    llmOptions: {
      messages: [{ role: 'system', content: 'You are a weather assistant.' }],
      tools: [weatherTool],
    },
  },
  toolHook: '/tool-call',
  // ...
}).send();

The agent verb normalizes tool schemas across LLM vendors. You always define tools in the same format regardless of whether you’re using OpenAI, Anthropic, Google, or Bedrock.
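
To give a feel for what that normalization involves, the sketch below converts one vendor-neutral tool definition into the shapes that two vendors' public APIs expect. It is purely illustrative: the agent verb does this for you, the function names are invented, and this is not the jambonz source code.

```javascript
// A tool in the vendor-neutral shape used throughout this guide.
const weatherTool = {
  name: 'get_weather',
  description: 'Get the current temperature and wind speed for a location.',
  parameters: {
    type: 'object',
    properties: { location: { type: 'string' } },
    required: ['location'],
  },
};

// OpenAI Chat Completions wraps each tool in { type: 'function', function: {...} }.
function toOpenAiTool(tool) {
  return {
    type: 'function',
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.parameters,
    },
  };
}

// Anthropic's Messages API carries the JSON Schema in an input_schema field.
function toAnthropicTool(tool) {
  return {
    name: tool.name,
    description: tool.description,
    input_schema: tool.parameters,
  };
}

console.log(JSON.stringify(toOpenAiTool(weatherTool), null, 2));
console.log(JSON.stringify(toAnthropicTool(weatherTool), null, 2));
```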

Handling tool calls (WebSocket)

Tool calls arrive as events on the toolHook path with tool_call_id, name, and arguments (already parsed as an object). Respond with session.sendToolOutput():

session.on('/tool-call', async (evt) => {
  const { tool_call_id, name, arguments: args } = evt;

  if (name === 'get_weather') {
    try {
      // Resolve the city name to coordinates
      const geoRes = await fetch(
        `https://geocoding-api.open-meteo.com/v1/search?name=${encodeURIComponent(args.location)}&count=1`
      );
      const geoData = await geoRes.json();
      if (!geoData.results?.length) {
        session.sendToolOutput(tool_call_id, `No location found for "${args.location}"`);
        return;
      }
      const { latitude, longitude } = geoData.results[0];

      // Honor the optional scale argument declared in the tool schema
      const fahrenheit = args.scale === 'fahrenheit';
      const wxRes = await fetch(
        `https://api.open-meteo.com/v1/forecast?latitude=${latitude}&longitude=${longitude}` +
        `&current=temperature_2m,wind_speed_10m&temperature_unit=${fahrenheit ? 'fahrenheit' : 'celsius'}`
      );
      const weather = await wxRes.json();
      session.sendToolOutput(tool_call_id,
        `Temperature: ${weather.current.temperature_2m}°${fahrenheit ? 'F' : 'C'}, ` +
        `Wind: ${weather.current.wind_speed_10m} km/h`
      );
    } catch (err) {
      session.sendToolOutput(tool_call_id, `Error: ${err.message}`);
    }
    return;
  }

  session.sendToolOutput(tool_call_id, `Unknown tool: ${name}`);
});

Handling tool calls (Webhook)

In webhook mode, the tool call arrives as an HTTP POST to the toolHook URL. Return the tool result as JSON in the response body.
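
A minimal sketch of such an endpoint using Node's built-in http module follows. It assumes the POST body carries the same name and arguments fields as the WebSocket event and that a plain JSON object is an acceptable result. Neither detail is spelled out in this guide, so confirm the exact request and response shapes in the agent verb reference. The runTool stub, the /tool-call path, and the port are placeholders.

```javascript
const http = require('node:http');

// Hypothetical tool implementation; replace the stub with a real lookup.
async function runTool(name, args) {
  if (name === 'get_weather') {
    return { location: args.location, note: 'stubbed result for illustration' };
  }
  return { error: `Unknown tool: ${name}` };
}

// Reads the POST body, runs the requested tool, and returns the result as JSON.
const server = http.createServer((req, res) => {
  if (req.method !== 'POST' || req.url !== '/tool-call') {
    res.statusCode = 404;
    res.end();
    return;
  }
  let body = '';
  req.on('data', (chunk) => { body += chunk; });
  req.on('end', async () => {
    try {
      // Assumed request fields: name and arguments, as in the WebSocket event
      const { name, arguments: args } = JSON.parse(body);
      const result = await runTool(name, args || {});
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(result));
    } catch (err) {
      res.writeHead(400, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: err.message }));
    }
  });
});

// Start the server when running this for real:
// server.listen(3000);
```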

Using pre-built tools from `@jambonz/tools`

For common utility tools — web search, weather, Wikipedia, calculator, datetime — the @jambonz/tools package lets you skip the schema definition and handler code entirely. Each tool bundles a JSON Schema (for the LLM) and an execute() function (for your application) that are wired into your session with a single call.

@jambonz/tools is open source (MIT-licensed) and we actively welcome community contributions. If you’ve built a useful tool — a CRM lookup, a scheduling integration, a knowledge-base query — please consider opening a PR so other jambonz developers can use it. See the contributing guidelines in the repo README.

$ npm install @jambonz/tools

Available tools:

| Tool | Factory | API key | Description |
| --- | --- | --- | --- |
| Web Search | `createTavilySearch` | Tavily | Search the web for current info |
| Weather | `createWeather` | none | Current weather for any location (Open-Meteo) |
| Wikipedia | `createWikipedia` | none | Factual summaries |
| Calculator | `createCalculator` | none | Safe math expression evaluator |
| Date & Time | `createDateTime` | none | Current date/time for any timezone |

registerTools() wires the tools into your session — it listens on the toolHook path, dispatches each incoming tool call to the matching execute() function, and sends the result back via sendToolOutput():

import { createTavilySearch, createWeather, createCalculator, registerTools } from '@jambonz/tools';

const search = createTavilySearch({ apiKey: process.env.TAVILY_API_KEY });
const weather = createWeather({ scale: 'fahrenheit' });
const calc = createCalculator();
const tools = [search, weather, calc];

svc.on('session:new', (session) => {
  registerTools(session, '/tool-call', tools);

  session.agent({
    stt: { vendor: 'deepgram', language: 'multi' },
    tts: { vendor: 'cartesia', voice: '9626c31c-bec5-4cca-baa8-f8ba9e84c8bc' },
    llm: {
      vendor: 'openai',
      model: 'gpt-4.1-mini',
      llmOptions: {
        messages: [{
          role: 'system',
          content: 'You are a helpful voice assistant with web search, weather, and math tools. ' +
            'Keep responses concise and conversational.',
        }],
        tools: tools.map((t) => t.schema),
      },
    },
    toolHook: '/tool-call',
    actionHook: '/agent-complete',
  }).send();
});

registerTools() also accepts a logger option and returns errors to the LLM if a tool throws or is called with an unknown name.

Combining both approaches

You can mix pre-built tools from @jambonz/tools with your own custom tools in the same agent. Include the schemas from both in llmOptions.tools, use registerTools() for the pre-built ones, and attach your own toolHook handler for the custom ones. The two dispatch paths run side by side — registerTools() only handles tool calls whose name matches one it was given, so custom calls fall through to your handler.

const myTool = {
  name: 'lookup_order',
  description: 'Look up an order by ID',
  parameters: {
    type: 'object',
    properties: { order_id: { type: 'string' } },
    required: ['order_id'],
  },
};

// pre-built tools
registerTools(session, '/tool-call', [search, weather]);

// custom tool handler: runs alongside registerTools
// (db stands in for your application's own data layer)
session.on('/tool-call', async (evt) => {
  if (evt.name === 'lookup_order') {
    try {
      const order = await db.orders.find(evt.arguments.order_id);
      session.sendToolOutput(evt.tool_call_id, JSON.stringify(order));
    } catch (err) {
      // Always answer the tool call, even when the lookup fails
      session.sendToolOutput(evt.tool_call_id, `Error: ${err.message}`);
    }
  }
});

session.agent({
  llm: {
    vendor: 'openai',
    model: 'gpt-4.1-mini',
    llmOptions: {
      tools: [search.schema, weather.schema, myTool],
    },
  },
  toolHook: '/tool-call',
  // ...
}).send();

You can also inject pre-built tools mid-conversation using updateAgent:

session.updateAgent({
  type: 'update_tools',
  tools: [search.schema, weather.schema],
});

MCP Server Integration

Instead of (or in addition to) defining tools inline, you can connect to external MCP servers. The agent verb connects to each server at startup via SSE or Streamable HTTP transport, discovers available tools, and makes them callable by the LLM.

session
  .agent({
    llm: {
      vendor: 'openai',
      model: 'gpt-4.1',
      llmOptions: {
        messages: [{
          role: 'system',
          content: 'You are a sports assistant. Use available tools to answer questions about live scores.',
        }],
      },
    },
    stt: { vendor: 'deepgram', language: 'en-US' },
    tts: { vendor: 'cartesia', voice: 'sonic-english' },
    mcpServers: [
      { url: 'https://livescoremcp.com/sse' },
    ],
    actionHook: '/agent-complete',
  })
  .send();

A caller can simply ask “what football matches are on right now?” and the LLM will use the tools discovered from the MCP server to fetch real-time data — no need to define tool schemas in llmOptions.tools.

If an MCP server requires authentication:

{
  "mcpServers": [
    {
      "url": "https://mcp.tavily.com/mcp/?tavilyApiKey=your-key",
      "auth": { "apiKey": "your-key" }
    }
  ]
}

Tool dispatch priority: When the LLM requests a tool call, MCP servers are checked first. If the tool name matches one discovered from an MCP server, the call is dispatched there. Otherwise, it falls through to the toolHook webhook. You can use both together.
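
The dispatch order can be modeled in a few lines. This is an illustrative sketch of the rule described above, not the jambonz source code. The function name is invented, and the 'unhandled' outcome for a call with no matching MCP tool and no toolHook is an assumption.

```javascript
// Illustrative model of tool dispatch priority: MCP servers first,
// then the toolHook. Returns a label for where the call would go.
function routeToolCall(name, mcpToolNames, hasToolHook) {
  if (mcpToolNames.has(name)) return 'mcp';  // discovered from an MCP server
  if (hasToolHook) return 'toolHook';        // falls through to your handler
  return 'unhandled';                        // assumption: nothing is configured to handle it
}

// Example: one tool discovered from an MCP server, one defined inline.
const discovered = new Set(['get_live_scores']);
console.log(routeToolCall('get_live_scores', discovered, true)); // mcp
console.log(routeToolCall('lookup_order', discovered, true));    // toolHook
```

One practical consequence of this ordering is that an inline tool sharing a name with an MCP-discovered tool would never reach your toolHook handler, so keeping tool names unique across both sources avoids surprises.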

Mid-conversation Updates

The agent verb supports asynchronous updates while a conversation is in progress, allowing you to change the agent’s behavior, inject context, modify tools, or trigger responses — without interrupting the verb stack.

Updates are sent via WebSocket (session.updateAgent(data)) or REST API.

update_instructions

Replace the LLM system prompt mid-conversation. Useful for persona switching or topic transitions.

// After identifying the caller's intent, switch to a specialist persona
session.updateAgent({
  type: 'update_instructions',
  instructions: 'You are now a billing support agent. Help the caller with invoice questions.',
});

inject_context

Append messages to the LLM conversation history. System messages are routed to the system prompt for vendors that don’t support inline system messages (Bedrock, Anthropic, Google).

// Inject CRM data after identifying the caller
session.updateAgent({
  type: 'inject_context',
  messages: [
    {
      role: 'user',
      content: 'CRM context: Customer name: Sarah Mitchell. Account tier: Gold. ' +
        'Open support ticket: delayed delivery on the smart home hub.',
    },
  ],
});

update_tools

Replace the tool set available to the LLM. The new tools take effect on the next turn.

// Add web search capability after the user requests it
session.updateAgent({
  type: 'update_tools',
  tools: [
    {
      name: 'web_search',
      description: 'Search the web for current information',
      parameters: {
        type: 'object',
        properties: { query: { type: 'string' } },
        required: ['query'],
      },
    },
  ],
});

generate_reply

Prompt the LLM to generate a new response. If the agent verb is idle, the prompt executes immediately. If busy, the request is queued.

Use interrupt: true to cancel the current response and generate immediately — useful for supervisor overrides or urgent notifications.

// Supervisor whisper: interrupt with urgent info
session.updateAgent({
  type: 'generate_reply',
  interrupt: true,
  user_input: 'URGENT: Tell the customer about the flash sale: 50% off all items for the next hour.',
});

// Gentle prompt with one-shot instructions
session.updateAgent({
  type: 'generate_reply',
  user_input: 'Customer is asking about refunds',
  instructions: 'Be empathetic and offer a 20% discount before processing a refund.',
});

Event Handling

The eventHook receives real-time events during the conversation. In WebSocket mode, listen with session.on():

session.on('/agent-event', (evt) => {
  switch (evt.type) {
    case 'user_transcript':
      console.log('User said:', evt.transcript);
      break;
    case 'agent_response':
      console.log('Agent replied:', evt.response);
      break;
    case 'user_interruption':
      console.log('User interrupted');
      break;
    case 'turn_end':
      console.log('Turn complete:', {
        transcript: evt.transcript,
        response: evt.response,
        latency: evt.latency,
      });
      break;
  }
});

Event Types

| Event | Description | Key fields |
| --- | --- | --- |
| `user_transcript` | User speech recognized | `transcript` |
| `agent_response` | Assistant reply text | `response` |
| `user_interruption` | User barged in | none |
| `turn_end` | End-of-turn summary | `transcript`, `confidence`, `response`, `interrupted`, `latency`, `tool_calls` |
| `history_summarized` | Conversation summarized | `turn`, `messages_dropped`, `messages_kept`, `summary` |

turn_end Payload

The turn_end event is the most useful for observability. Example payload:

{
  "type": "turn_end",
  "transcript": "What's the weather in Portland?",
  "confidence": 0.998,
  "response": "The temperature in Portland is 52°F with wind at 12 km/h.",
  "interrupted": false,
  "latency": {
    "stt_ms": 320,
    "eot_ms": 180,
    "llm_ms": 890,
    "tool_ms": 420,
    "tts_ms": 210,
    "preflight": {
      "result": "hit",
      "tokens": 12
    }
  },
  "tool_calls": [
    { "name": "get_weather", "rtt_ms": 420 }
  ]
}

Latency Optimization

The turn_end latency breakdown helps you identify bottlenecks and optimize response time.

| Field | What it measures | How to optimize |
| --- | --- | --- |
| `stt_ms` | STT processing time | Choose low-latency STT vendors (Deepgram). Use `hints` to improve accuracy. |
| `eot_ms` | End-of-turn detection wait | Tune Krisp `threshold` (lower = faster). Use vendors with native turn-taking. |
| `llm_ms` | Pure LLM thinking time (tool RTT subtracted) | Use faster models (e.g., `gpt-4.1-mini`). Keep system prompts concise. Enable `earlyGeneration`. |
| `tool_ms` | Total time in tool calls | Optimize tool endpoint latency. Use caching where appropriate. |
| `tts_ms` | TTS engine latency (text → first audio) | Choose streaming-capable TTS (Cartesia, ElevenLabs, Deepgram). |
| `preflight` | Speculative preflight result | Enable `earlyGeneration` with Krisp. Monitor hit rate: high miss rates may indicate the threshold is too aggressive. |
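
A quick way to act on this breakdown is to log the slowest stage of each turn. The sketch below picks the largest of the five millisecond fields from a turn_end latency object. The slowestStage helper is hypothetical and simply skips any field that is absent.

```javascript
// The five timing fields reported in turn_end.latency.
const STAGES = ['stt_ms', 'eot_ms', 'llm_ms', 'tool_ms', 'tts_ms'];

// Returns { stage, ms } for the slowest stage, or null if no timings are present.
function slowestStage(latency) {
  let worst = null;
  for (const stage of STAGES) {
    const ms = latency[stage];
    if (typeof ms !== 'number') continue; // stage not reported for this turn
    if (worst === null || ms > worst.ms) worst = { stage, ms };
  }
  return worst;
}

// Using the latency values from the example payload above:
const example = { stt_ms: 320, eot_ms: 180, llm_ms: 890, tool_ms: 420, tts_ms: 210 };
console.log(slowestStage(example)); // { stage: 'llm_ms', ms: 890 }
```

Aggregating the result over many turns, rather than reacting to a single one, gives a steadier picture of where to spend optimization effort.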

Conversation History Summarization

For long conversations that might exceed the LLM’s context window, the agent verb can automatically summarize older turns.

Set the JAMBONES_PIPELINE_SUMMARIZE_TURNS environment variable to control how often summarization runs. Values 1–7 are clamped to 8. Set to 0 to disable (default).

When summarization triggers:

The LLM generates a concise summary of the older conversation turns
The summary is appended to the system prompt as a “Conversation context” section
The summarized turns are dropped from conversation history
Half the configured number of turns are kept in full fidelity

A history_summarized event is sent to the eventHook:

{
  "type": "history_summarized",
  "turn": 8,
  "messages_dropped": 5,
  "messages_kept": 6,
  "summary": "The user is a software developer looking for a MacBook Pro..."
}
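
The clamping rule for JAMBONES_PIPELINE_SUMMARIZE_TURNS can be written out as a small function. This is a sketch of the documented behavior, not the jambonz source. Treating negative or non-numeric values as disabled, and rounding the kept half down, are assumptions.

```javascript
// Returns the effective summarization interval in turns; 0 means disabled.
function effectiveSummarizeTurns(raw) {
  const n = Number.parseInt(raw ?? '0', 10);
  if (!Number.isFinite(n) || n <= 0) return 0; // unset, 0, or invalid: disabled (assumption for invalid)
  return Math.max(n, 8);                       // values 1-7 are clamped to 8
}

// Turns kept in full fidelity: half the configured number (rounding down is an assumption).
function turnsKeptInFull(effectiveTurns) {
  return Math.floor(effectiveTurns / 2);
}

const turns = effectiveSummarizeTurns(process.env.JAMBONES_PIPELINE_SUMMARIZE_TURNS);
console.log(turns === 0
  ? 'summarization disabled'
  : `summarize every ${turns} turns, keeping ${turnsKeptInFull(turns)} in full`);
```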

Noise Isolation

The noiseIsolation property enables server-side noise cancellation on call audio, improving STT accuracy in noisy environments.

Two vendors are available:

"krisp" — Krisp’s proprietary noise cancellation. Listen to audio samples to hear the model in action.
"rnnoise" — Open-source RNNoise-based noise cancellation. No API key required.

Krisp noise isolation has the same licensing requirement as Krisp turn detection — see the warning above for self-hosted licensing details.

Shorthand:

{ "noiseIsolation": "krisp" }

Detailed configuration:

{
  "noiseIsolation": {
    "mode": "krisp",
    "level": 80,
    "direction": "read"
  }
}

level — Suppression level 0–100. Higher values are more aggressive. Default: 100.
direction — "read" filters caller audio (default), "write" filters outbound audio.

Error Recovery

The agent verb handles errors gracefully to keep the conversation going:

LLM errors with tools — If the LLM fails and tools were included, the agent verb retries the same prompt without tools. This handles models that don’t support tool use in certain configurations.
Speculative preflight errors — When a speculative prompt fails, the preflight is discarded and a fresh prompt is issued normally.
Recovery to idle — On unrecoverable LLM errors, the agent verb ends the turn and transitions to idle so the user can continue speaking. The no-response timer is not started after an error to avoid retry loops.
STT reconnection — The agent verb automatically reconnects the STT stream if the connection drops.

When the agent verb encounters an unrecoverable error, it invokes the actionHook with a completion_reason indicating the failure.

Example Applications

The agent examples repository contains runnable demos for each feature:

| Example | What it demonstrates |
| --- | --- |
| `deepgram-cartesia` | Basic agent with Deepgram STT + Cartesia TTS |
| `deepgramflux-elevenlabs` | Deepgram Flux (native turn detection) + ElevenLabs TTS |
| `speechmatics-rime` | Speechmatics STT + Rime TTS |
| `using-tools` | Tool calling with weather lookup |
| `web-search` | Web search via Tavily tool |
| `using-mcp-server` | MCP server for live sports scores |
| `tavily-mcp` | Web search via Tavily MCP server |
| `crm-injection` | Live CRM context injection via `inject_context` |
| `persona-switch` | Mid-conversation persona change via `update_instructions` |
| `supervisor-interrupt` | Urgent message injection via `generate_reply` with interrupt |
| `dynamic-tools` | Mid-conversation tool injection via `update_tools` |