Major release
New Features
- Redesigned and simplified the Carrier page in the portal, adding support for additional carrier authentication features.
- Add support for STT Latency metrics
- Add support for Deepgram Flux STT.
- Add support for Gladia STT.
- Add support for Soundhound STT.
- Add support for AssemblyAI v3 STT.
- Add support for Deepgram EU hosted endpoint.
- Add support for additional ElevenLabs hosted endpoints.
- Add support for Cartesia Sonic-3 streaming TTS model.
- Add support for Resemble TTS.
- Add additional languages and phrases to voicemail greeting file.
- Config verb can now be used to disable tts caching for the entire call.
- New distributeDtmf property added to conference verb to enable DTMF distribution to all conference members.
- Add support for tel scheme in referTo property of dial verb.
- Add new Alert verb
- Added CLI commands for managing feature server drainage, allowing administrators to manually take feature servers out of the rotation gracefully. Commands include fs active, fs drain, fs drained, fs undrain, and fs list.
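As a rough illustration of the new distributeDtmf property, the sketch below builds a conference verb payload as a Python dictionary and serializes it to JSON, the format a jambonz application returns from a webhook. The property name comes from the note above; treating it as a boolean flag is an assumption, and the conference name and the helper build_conference_verb are hypothetical. Check the conference verb documentation for the exact semantics.

```python
import json


def build_conference_verb(name: str, distribute_dtmf: bool = False) -> dict:
    """Build a conference verb payload (illustrative helper, not part of jambonz)."""
    verb = {"verb": "conference", "name": name}
    if distribute_dtmf:
        # Property named in the release note; assumed here to be a boolean flag.
        verb["distributeDtmf"] = True
    return verb


# A jambonz application responds with a JSON array of verbs.
app_response = [build_conference_verb("weekly-standup", distribute_dtmf=True)]
print(json.dumps(app_response, indent=2))
```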
Bug fixes
- A significant number of stability and performance improvements have been made in this release to improve overall system reliability.
- PR 1421 Fixed a timing issue in the gather verb where the timeout timer would not properly start after a bargein event occurred. This could cause calls to hang indefinitely waiting for user input instead of timing out as expected. The fix also removed an unnecessary “playDone” event emission that was no longer being used.
- PR 1415 Fixed an issue where whitespace-only tokens were being sent to the media server during TTS streaming. When whitespace was trimmed, incomplete commands lacking the required parameters would result in errors. The fix validates tokens before transmission and holds whitespace-only content for the next processing cycle.
- PR 1391 Fixed an issue where customerData was being lost when calls were transferred between feature servers. The fix ensures that customer context and metadata are preserved throughout the transfer process, allowing important customer information to remain intact during call routing.
- PR 1395 Resolved an issue where query string parameters were being inappropriately URL-encoded when they appeared as part of a filename in HTTP requests for audio file retrieval. This was causing URLs to become malformed and preventing audio files from being retrieved correctly.
- PR 1393 Fixed a race condition where the system would fail to send the final status callback or close the WebSocket application connection when a caller canceled during app JSON fetching. The fix ensures that CallSession properly cleans up resources when a call is canceled during the app-fetching phase.
- PR 1386 Fixed a timing issue where the continuous ASR (automatic speech recognition) timer was being initiated immediately upon starting to listen during background gather operations. The fix prevents the timer from starting prematurely in background gather scenarios, avoiding unintended behavior.
- PR 1383 Resolved a bug where transferring outbound conference participants between feature servers would fail. The system was checking the original call direction from Redis to determine whether to answer transfer requests, which caused failures for outbound calls. The fix ensures that transferred calls are always answered regardless of their original call direction.
- PR 1369 Improved error handling for TTS synthesis failures by ensuring that errors occurring during the initial TTS request are properly propagated as “SpeechCredentialError” events. Previously, TTS errors would fail silently without proper logging or call handling. The fix enables proper error reporting and allows applications to handle TTS failures appropriately.
- PR 1372 Added an event handler to properly respond when the Deepgram speech recognition service terminates its connection unexpectedly with an error condition. This ensures the system can gracefully handle remote closure events from Deepgram.
- PR 1366 Fixed an issue where the synthesized-audio verb was not properly sending status events when using text-to-speech streaming functionality. The fix ensures that status events are correctly generated and transmitted when TTS streaming is enabled, allowing clients to properly track the completion or status of audio synthesis operations.
- PR 1351 Fixed a timeout handling issue in the gather verb where the main timeout and ASR timeout were not being properly cleared when an interdigit timeout was triggered or when DTMF input took priority. This prevents conflicting timer behaviors during gather operations.
- PR 1359 Added exception handling around req.cancel() calls to prevent unhandled errors during request cancellation. This was particularly important for REST-based outdial operations where timing issues could cause exceptions to be thrown without proper catching, potentially causing crashes.
- PR 1358 Updated the speech_util dependency to version 0.2.23, bringing improvements to the speech processing pipeline.
- PR 1357 Fixed a bug where the singleDialer component was not properly initializing ConfirmCallSession with the necessary temporary file references. This was preventing proper cleanup and file management later in the call lifecycle, potentially causing issues with confirmation hook processing.
- PR 1356 Resolved an issue where ConfirmCallSession within placeCall lacked access to the tmpFiles variable, preventing proper cleanup of temporary files after operations completed. This fix prevents temporary file accumulation and resource leaks in the call session confirmation workflow.
- PR 1354 Corrected a misleading error log message that displayed “invalid command since id is missing” when a request actually lacked the tokens field. The log message now accurately reflects when the tokens field is missing, improving debugging clarity for developers.
- PR 1353 Added exception handling to prevent crashes when a SIP REFER message is received after a dial task has already concluded. This fix allows the system to gracefully manage this timing-related edge case rather than terminating unexpectedly.
- PR 1352 Fixed a security issue where the TTS streaming functionality was inadvertently exposing sensitive speech service credentials in logs or output. The fix ensures that authentication information remains protected and isn’t accidentally exposed through logs or debug output.
- PR 1349 Fixed an issue where Least Cost Routing (LCR) was being ignored in certain scenarios. The change prevents the dial and createCall functions from attempting to automatically select a carrier trunk when LCR is configured but no specific trunk is specified. This fix only affects accounts with active LCR configurations.
- PR 1344 Fixed an issue where the punctuation setting in the gather recognizer object was not functioning correctly when using Microsoft as the speech recognition vendor. When developers set “punctuation: false” in the recognizer configuration, the system was not removing punctuation marks from the recognized speech output as expected.
- PR 1331 Resolved a race condition in playback handling where the say task could receive stop events from previous cached file playbacks, causing improper playback state management. The fix shifts responsibility for generating playback IDs to the feature server and tracks the current playback ID to ensure only events corresponding to the current playback operation are processed.
- PR 1312 Fixed an issue where the timeout timer wasn’t being initiated when users barge in on a speech prompt by pressing DTMF digits. The fix ensures the timer starts automatically when DTMF input occurs during playback, while preserving existing behavior for normal scenarios where the timer starts at the end of say/play operations.
- PR 1320 Extended notification functionality for text-to-speech audio handling to send synthesized-audio notifications regardless of whether content originates from cache or vendor generation. Previously, the system only sent notifications when audio was freshly generated. The fix also returns an identifier that allows correlation between “say” verbs and their corresponding synthesized-audio events, enabling better traceability.
- PR 1315 Fixed a bug where the task.kill parameter was not being properly passed to the call state component, ensuring proper task termination handling.
- PR 1308 Enabled passing through options from the recognizer object in an AMD (answering machine detection) verb to the speech-to-text service. This allows users to leverage service-specific features such as custom models (e.g., custom Deepgram models) without modifying the core implementation.
- PR 1301 Fixed an issue where temporary audio files in a ConfirmCallSession were not being cleaned up when calls ended, causing resource leaks. The fix ensures that ConfirmCallSession uses the tmpFiles set of the parent CallSession to store references to created files, allowing the parent CallSession’s cleanup function to properly remove temporary files when the call terminates.
- PR 1300 Enabled the ability to pause and resume background listening functionality using silence or blank audio, enhancing the system’s ability to handle audio input states more flexibly during background processing operations.
- PR 1293 Fixed an issue with audio file caching where URLs containing query string parameters with periods (valid characters) were incorrectly parsed. The fix URL-encodes periods within query string parameters to %2E, allowing the system to correctly identify the file extension and enabling proper caching and playback of media files.
- PR 1290 Fixed failures when using Whisper with Play functionality by allowing a whisper to accept a single object verb (specifically “play”) without triggering unnecessary fetching operations. The fix also disables URL-based verb fetching during whisper operations to prevent failures.
- PR 1278 Implemented control mechanisms for forwarding the P-Asserted-Identity (PAI) header, enabling more granular control over how PAI information is propagated through the system.
- PR 1283 Fixed an issue where the stopTranscription method was incorrectly delaying transcription stops during gather verb operations when the JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS environment variable was enabled. The delayed shutdown prevented proper input capture in subsequent gather operations when transitioning between recognizers. The fix adds a gracefulShutdown: false parameter to stop transcription immediately without applying the configured delay.
- PR 1276 Fixed an issue where the gather task’s nested sayTask would not emit a playDone event when operating in streaming mode, preventing the transcribe task from starting and blocking the timeout timer. This fix ensures proper event emission during streaming playback, allowing the gather operation to proceed normally through its lifecycle stages and enabling timeout mechanisms to function when listenDuringPrompt is enabled.
- PR 1286 Fixed TTS cache issues including error handling gaps when TTS fails (where playback-start event doesn’t occur but playback-stop still fires), and concurrency race conditions with playback IDs. The fix implements atomic operations for ID generation and properly handles scenarios where playback-start lacks an ID.
- PR 1282 Fixed issues with LCC (Live Call Control) dial functionality when relative URLs are provided as action hooks. The PR also updated speech-utils to version 0.2.15 with configurable tmp folder location.
- PR 1279 Fixed a race condition related to cached audio playback where playback stop events from previous audio commands could incorrectly interfere with current playback operations. The solution validates that the playback ID in the “playback-stopped” event matches the ID from the corresponding “playback-start” event. Also improved TTS caching to respect the disableTtsCache setting.
- PR 1259 Fixed an issue where transcriptions were not being received when calls were terminated. The feature server was sending stopTranscription commands too quickly and destroying endpoints prematurely before transcription could be processed. The fix implements graceful shutdown for endpoints when JAMBONES_TRANSCRIBE_EP_DESTROY_DELAY_MS is enabled, delaying stopTranscription and endpoint destruction until transcription is received or timeout occurs. Excludes ASR fallback operations, paused transcription states, and AMD stop operations from the delay.
- PR 1271 Fixed an issue where the system would attempt to process missing or undefined data when a referHook function in a dial operation fails to return any payload. The fix now skips subsequent operations when no response is received, preventing errors from trying to work with empty or null values.
- PR 1269 Fixed TTS response code handling to ensure that a response code of 0 triggers task failure. The fix addresses compatibility with different TTS vendors (like Azure and Deepgram) that return different error codes, and improves error alerting when TTS errors occur by sending appropriate jambonz:error messages to webhooks rather than dropping calls.
- PR 1264 Fixed REFER (call transfer) handling in the dial functionality to ensure proper cleanup and termination of the dial task in the parent session when a REFER request is received on a parent call leg after the child call has been transferred. The fix includes a reversion of a problematic prior change to dial.js.
- PR 492 Fixed API authorization for account-level API keys to access SIP gateways and VoIP carriers. Previously, account-level API keys were unable to read or create these resources through the API endpoints. The fix grants proper permissions and automatically populates the service provider SID when accounts create carriers.
- PR 494 Fixed excessive CPU utilization during call recording caused by inefficient buffer handling in S3 multipart uploads. The previous implementation used Buffer.concat on every chunk, creating O(n²) complexity. The fix optimizes memory operations by accumulating chunks in an array and performing a single concatenation per 5 MB part, reducing complexity to O(n) and stabilizing request latency under concurrent load.
- PR 500 Fixed an issue where the system was unable to retrieve the list of available voices for Deepgram models.
- PR 505 Added the ability to completely disable rate limiting by setting the DISABLE_RATE_LIMITS environment variable to ‘true’ or ‘1’. This optimization is useful for deployments that handle rate limiting at a higher infrastructure level, eliminating unnecessary processing overhead from API-level rate limit calculations.
- PR 208 Increased DTMF signal volume levels in the session border controller to improve DTMF tone detection and reliability.
- PR 183 Fixed SIP reinvite handling to properly remove video SDP (Session Description Protocol) information during call renegotiation. This ensures that video codec and capability data is correctly filtered when calls are reinvited.
- PR 189 Fixed the isPrivateVoipNetwork function to correctly identify private network addresses in SIP URIs regardless of whether a trailing semicolon is present. Previously, URIs without a semicolon would incorrectly return false even when representing valid private network addresses.
- PR 113 Fixed an issue where the system was redirecting client calls to other SBCs using public IP addresses instead of private ones. The fix stores the private SIP address in Redis during client registration, enabling subsequent operations to route calls through private network paths.
- PR 110 Enhanced the database status API response to include expires value and timestamp fields for carrier information. The changes provide additional metadata about credential expiration and when status was recorded, along with improved logging for better visibility into the registration flow.
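The buffer-handling change described for PR 494 can be sketched as follows. The original fix concerns Buffer.concat in Node.js; this is a Python analogue using bytes, written only to show the idea of appending chunks to a list and joining them once per part instead of re-concatenating on every chunk. The class name PartBuffer and its methods are hypothetical and are not taken from the jambonz code.

```python
PART_SIZE = 5 * 1024 * 1024  # 5 MB, the part size mentioned in the note


class PartBuffer:
    """Accumulate chunks in a list and join them once per part."""

    def __init__(self, part_size: int = PART_SIZE) -> None:
        self.part_size = part_size
        self._chunks: list[bytes] = []
        self._buffered = 0

    def add(self, chunk: bytes) -> list[bytes]:
        """Store a chunk; return any parts that are now complete."""
        self._chunks.append(chunk)  # O(1): earlier data is not copied
        self._buffered += len(chunk)
        parts = []
        while self._buffered >= self.part_size:
            data = b"".join(self._chunks)  # one concatenation per part
            parts.append(data[: self.part_size])
            rest = data[self.part_size:]
            self._chunks = [rest] if rest else []
            self._buffered = len(rest)
        return parts

    def flush(self) -> bytes:
        """Return the trailing bytes as the final (possibly short) part."""
        data = b"".join(self._chunks)
        self._chunks, self._buffered = [], 0
        return data
```

Each call to add would hand any completed parts to the uploader, and flush would be called once when the recording ends. Concatenating on every chunk copies all previously buffered bytes each time, which is where the quadratic cost comes from.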
SQL changes
Availability
- Available now on jambonz.cloud
- Available now with devops scripts for support subscription customers
Questions? Contact us at support@jambonz.org
Point release
New Features
- Adds support for Cartesia Speech to text Ink-Whisper model. You can now use Cartesia for both TTS and STT.
- Adds support for creating an Agent Call on Ultravox. To enable this feature, you must set the agent_id property in the ultravox llm verb as described here. This is an optional property, and if not set the Create Call API will be used instead (i.e. legacy behavior).
- The say verb now supports a TTS streaming mode where you can supply the full prompt at once in the text property.
- Adds additional support for Italian voicemail detection based on common operator messages.
- When registering with an outbound SIP trunk, use the account-level sip realm in the Contact header if provided.
Bug fixes
- error if app does not specify a speech synthesis voice issue.
- unhandled exception issue
- remove unnecessary logging PR
- embedded urls in createCall REST call createCall verb caused parsing issue PR
- unhandled exception in dial verb PR
- in certain dial scenarios, the A leg could be left connected after a successful REFER on the B leg PR
- remove video SDP when making outbound call PR
- fix issue with dub verb where loop: false caused the audio to incorrectly loop PR
- fix potential looping behavior in background sticky bargeIn task PR
- fix snyk warning in drachtio-fsmrf PR
- route logs for jambonz-api-server to the correct log file PR
- creating new application in the webapp does not save a TTS voice by default PR
- fix issue when wild cards or regex is used in phone number for multiple carriers PR
SQL changes
None.
Availability
- Available now on jambonz.cloud
- Available now with devops scripts for subscription customers
Questions? Contact us at support@jambonz.org
Major release
New Features
- Adds support for Google Gemini speech-to-speech LLM. See example application here. Speech-to-speech LLMs now supported include: Gemini, Ultravox, OpenAI, Deepgram, and elevenlabs.
- Added MCP client support to the llm verb. You can now specify an array of one or more MCP servers in the mcpServers property of the llm verb and jambonz will query those MCP servers and automatically create tools for the LLM to call based on the tools exposed by each of the MCP servers. For an example, see the google gemini sample app.
- Added support for application environment variables, which are special configuration variables that can be set in the jambonz portal for an application to customize the application behavior. This enables hosting of a single application that can then be customized for different customers without having to modify source code.
- Added support for Deepgram Aura-2 TTS model and voices
- Added support for Rime Arcana model
- Added support for PlayHT on-prem deployments.
- Added support for using outbound sip proxy when registering
- Added support for providing instructions to Whisper TTS
- Added new voice for nvidia TTS
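To make the mcpServers property more concrete, here is a minimal sketch of an llm verb payload that lists MCP servers. Only the verb name and the mcpServers property are taken from the notes above. The other keys (vendor, model, auth), the shape of each mcpServers entry (an object with a url key), the helper build_llm_verb, and all values are assumptions for illustration; consult the llm verb documentation for the actual schema.

```python
import json


def build_llm_verb(vendor: str, model: str, api_key: str,
                   mcp_server_urls: list[str]) -> dict:
    """Build an llm verb payload that lists MCP servers (illustrative helper)."""
    return {
        "verb": "llm",
        "vendor": vendor,            # assumed key
        "model": model,              # assumed key
        "auth": {"apiKey": api_key}, # assumed key and shape
        # mcpServers is the property named in the release note; the shape of
        # each entry (an object with a "url" key) is an assumption.
        "mcpServers": [{"url": url} for url in mcp_server_urls],
    }


# Placeholder values only; none of these identify a real vendor, model or server.
verb = build_llm_verb(
    vendor="example-vendor",
    model="example-model",
    api_key="YOUR_API_KEY",
    mcp_server_urls=["https://mcp.example.com/sse"],
)
print(json.dumps([verb], indent=2))
```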
Bug fixes
- Various stability fixes including for issues which caused intermittent Freeswitch crashes.
- Fixed an issue where a Deepgram gather could not time out on an empty transcription with continueAsr. PR
- Fixed an issue where the say verb could not fail over if tts_response-code != 2xx. PR
- Fixed microsoft stt max client buffer size error for transcribe verb. PR
- sip_decline now releases the callSession if a ws requestor is used. PR
- Send stop-playback event. PR
- Fixed an issue where the tts streaming buffer could not reset its timeout when lastUpdateTime is short. PR
- Fixed issue with Deepgram STT not returning transcript when last_word_end is -1. PR
- Fixed issue muting member in conference. PR
- Fixed API server crash when admin query voip-carrier. PR
- Fixed issue where we incorrectly saved an obscured API credential for recording, leading to failures authenticating. PR
- Fixed an issue where updateCall responding with 202 caused an error. PR
- Fixed an issue in the portal where the wrong recording bucket region was displayed. PR
SQL changes
Availability
- Available now on jambonz.cloud
- Available now with devops scripts for subscription customers
Questions? Contact us at support@jambonz.org
Point release
ElevenLabs conversational AI bug fixes, readonly portal users and stability improvements
- Fixes an issue where the initial client configuration message for ElevenLabs Conversational AI was improperly formatted. PR
- Adds support for speed and pronunciation_dictionary_locators for ElevenLabs TTS. PR
- Addresses memory allocation issue in freeswitch modules that could lead to intermittent crashes. (Fixed in freeswitch-modules@2.2.26).
- Adds support for throttling and disabling outbound registrations. Also adds support for disabling outbound REGISTERs or NOTIFYs based on specific failure codes returned from the far end trunk. PR, PR
- Fixes issue where confirm hook on a dial verb was not working over a websocket connection. PR
- Adds support for creating portal users with readonly access. PR
- Disables password managers (e.g. LastPass) on some forms where they were incorrectly auto-filling data, leading to confusion over why the form was not submitting. PR
- Fixes issue with failing re-INVITE due to unsupported codec. PR
- Allows hangup verb to be used in a siprec call. PR
- Fixes a scenario where two config verbs are used, the first with hints and the second without, and the transcribe verb then generates a runtime error. PR
- Rejects portal logins with a better error message if a user that signed up using OAuth tries to sign in using email and password. PR
- Allows a readonly portal user to change their password. PR
Point release
Add support for OpenAI Streaming STT and other improvements
- Adds support for OpenAI Speech-to-text. Please see related options here and review this article for a discussion of how to use the OpenAI STT prompt feature. PR, PR, and PR.
- Supports Cartesia sonic-2 and sonic-turbo models. PR
- Fixes issue with use of streaming say in gather verb. PR
- Better support for passing webrtc video calls. PR
- Fixes issue when using language detection feature with Deepgram. PR
- Fixes an issue showing incorrect speech synthesizer in applications view in the portal. PR
- Writes options ping failure alert once instead of repeatedly. PR
- Fixes issue where lengthy LLM prompts for ultravox, elevenlabs, and deepgram were being truncated.
Point release
Additional log visibility, improvements to AMD, and more
- Adds log viewer to jambonz portal (AWS only) to enable easier troubleshooting of calls. PR, Issue
- Improves answering machine detection by listening for strings of digits in addition to other heuristics. PR
- Adds support for username and password authentication to redis. PR
- Fixes crashing error with some media timeout scenarios. PR
- Adds support for pausing transcriptions on Listen and Transcribe verbs. PR
- When a session uses live call control and a session:adulting message is sent to the application, customer data is now included. PR
- Fixes an issue so that when a call is ended via the API (live call control), the call_terminated_by field is now ‘jambonz’. PR
- Filters the carrier list by account when creating a new phone number. PR
- Usability improvements when configuring a websocket-based application URL in the jambonz portal. PR
- Allows the Recent Calls API to return more than 25 calls at a time. PR
- Smooths outbound SIP registrations to avoid spikes. PR
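The notes do not say how outbound registrations are smoothed. One common approach, shown here purely as an assumption, is to spread registration start times evenly across a window instead of sending every REGISTER at the same moment. The function spread_registrations and the trunk names are hypothetical.

```python
def spread_registrations(trunks: list[str], window_secs: float) -> dict[str, float]:
    """Assign each trunk a start offset, evenly spaced across the window.

    Illustrative only: this is one possible smoothing strategy, not
    necessarily the one jambonz uses.
    """
    if not trunks:
        return {}
    step = window_secs / len(trunks)
    return {trunk: index * step for index, trunk in enumerate(trunks)}


# Four trunks registering over a 60-second window start 15 seconds apart.
offsets = spread_registrations(["trunk-a", "trunk-b", "trunk-c", "trunk-d"], 60)
for trunk, delay in offsets.items():
    print(f"{trunk}: send REGISTER after {delay:.0f}s")
```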
Point release
Audio Improvements with Bidirectional Streams, Ultravox Enhancements, AWS Autoscaling fixes and more
- Allows the url property in a listen verb to be a relative URL when used in a websocket application. This allows developers to create a single websocket app that handles both jambonz commands and bidirectional audio streams. See this realtime translation example that uses openAI and bidirectional audio streams, where the url property is a relative URL and the app handles both jambonz commands and the audio stream. PR, Issue
- Fixes an intermittent issue with crackling noise on bidirectional audio streams.
- When an application redirects to a new absolute URL, update the base requestor so that future relative URLs are resolved relative to the new URL. PR, Issue
- Fixes an issue where the final transcript in a conversation initiated with the dial verb was sometimes not collected if the caller hung up quickly after their final utterance. PR, Issue
- Adds support for sending an input_text_message to Ultravox.ai during a speech-to-speech session. This enables the application to dynamically direct the conversation through means other than the caller’s voice. PR
- Fixes an issue with intermittent failure to clean up media server resources after a call completes. PR, Issue
- Webapp no longer shows Messaging webhook, as SMPP is a deprecated feature for the time being (lack of customer demand). PR, Issue
- Fixes database upgrade script which had previously misnamed a column. PR, Issue
- Fixes an issue with AWS autoscaling where incorrect SNS topic name was used, leading to unnecessarily long scale-in durations. PR
- When sending a REFER over sips the Contact header should also use sips scheme. PR
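The redirect behavior described above (an absolute URL replaces the base, and later relative URLs resolve against the new base) can be sketched with Python's standard urllib.parse. The Requestor class here is a hypothetical stand-in for illustration, not the jambonz implementation, and the example.com hosts are placeholders.

```python
from urllib.parse import urljoin, urlparse


class Requestor:
    """Tracks the base URL used to resolve relative hook URLs (illustrative)."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def resolve(self, url: str) -> str:
        """Resolve url against the current base, updating the base if absolute."""
        resolved = urljoin(self.base_url, url)
        if urlparse(url).scheme:
            # Absolute URL: it becomes the base for later relative URLs.
            self.base_url = resolved
        return resolved


requestor = Requestor("https://app-a.example.com/hooks/main")
first = requestor.resolve("/status")                               # uses app-a
redirect = requestor.resolve("https://app-b.example.com/v2/start")  # switches base
second = requestor.resolve("/status")                              # now uses app-b
print(first, redirect, second, sep="\n")
```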
Point release
Conferencing Enhancements and Minor Fixes
- Adds support for receiving sip requests during a conference call. PR, Issue
- Sends new error message over websocket to application when an incoming request from the application is not valid. PR, Issue
- Fixes a typo with the variable name used to store the AWS SNS topic arn (only relevant for AWS deployments). PR
Point release
Improve Ultravox Integration
- Adds support for sending the Ultravox call identifier to the jambonz app so that it can be used for tracking and troubleshooting purposes. PR
- Update to drachtio-srf 5.0.2
Point release
Important STT Improvements for Deepgram and Speechmatics, Support for Outbound SIP Proxy and more
- Adds support for carriers that require us to send them calls through an outbound sip proxy. PR, PR, PR, Issue
- Rejects call attempts on hosted jambonz systems where the account has no active subscription. PR
- Improves Deepgram integration by ignoring the UtteranceEnd event from Deepgram when we have unprocessed words; in this scenario it is better to continue to wait for the unprocessed words to become finalized. PR, Issue
- Fixes issue where an exception was thrown when a new application URL is provided during a call and the request to that URL fails. PR
- Improves handling of errors when handling tts:tokens requests to stream text tokens from an LLM. PR
- Fixes timeout issues when working with speechmatics STT. PR
- Adds a response time metric when using tts streaming. PR
- Fixes an issue where when using the dialMusic property in a dial verb, the music could play endlessly. PR, Issue
- Allows the Deepgram nodelay property to be explicitly set. PR, Issue
- Enhances the createCall REST API to allow the caller to specify a sip proxy to send the INVITE through. PR
- Fixes issue where an incoming REGISTER with invalid sip uri in the From or To header causes an exception. PR, Issue
- Supports recording an incoming SIPREC call using the jambonz recording feature. PR
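The Deepgram UtteranceEnd change in this release can be illustrated with a small state tracker. This is a sketch under assumptions: it models Deepgram streaming messages as dictionaries with a type of "Results" (carrying is_final and a transcript) or "UtteranceEnd", and it treats any non-empty interim transcript as unprocessed words. The class UtteranceTracker is hypothetical and is not the jambonz code.

```python
class UtteranceTracker:
    """Decide whether an UtteranceEnd event should end the utterance."""

    def __init__(self) -> None:
        self.has_unprocessed_words = False

    def on_message(self, msg: dict) -> bool:
        """Return True when the utterance should be treated as complete."""
        if msg.get("type") == "Results":
            transcript = msg["channel"]["alternatives"][0]["transcript"]
            # Interim words stay pending until a final result arrives.
            self.has_unprocessed_words = bool(transcript) and not msg.get("is_final", False)
            return False
        if msg.get("type") == "UtteranceEnd":
            # Ignore the event while interim words are still waiting to be finalized.
            return not self.has_unprocessed_words
        return False


def results(transcript: str, is_final: bool) -> dict:
    """Build a minimal Results-style message for the example."""
    return {
        "type": "Results",
        "is_final": is_final,
        "channel": {"alternatives": [{"transcript": transcript}]},
    }


tracker = UtteranceTracker()
tracker.on_message(results("hello", is_final=False))
ignored = tracker.on_message({"type": "UtteranceEnd"})   # words still pending
tracker.on_message(results("hello", is_final=True))
accepted = tracker.on_message({"type": "UtteranceEnd"})  # nothing pending
```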