Synthesizer
A property that can be used in a say verb to override the application default TTS settings.
Parameters
Name of the TTS vendor, or ‘default’ to use the application default.
Name of the voice to use for the TTS, or ‘default’ to use the application default.
(Google specific) may be standard, neural, generative, or long-form.
(Google specific) may be MALE, FEMALE, or NEUTRAL.
Label associated with the TTS vendor.
Language code for the TTS, or ‘default’ to use the application default.
Vendor-specific options for the TTS, see below for supported properties.
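As a rough illustration of the parameters above, the following Python sketch builds a say verb payload whose synthesizer property overrides the application default TTS settings. The key names (`verb`, `text`, `vendor`, `language`, `voice`) and the example values are assumptions inferred from the descriptions above, so check them against the reference for your version before relying on them.

```python
import json


def build_say_verb(text, vendor="default", language="default", voice="default"):
    """Build a say verb dict with a synthesizer override.

    The key names are assumptions inferred from the parameter descriptions;
    'default' falls back to the application default for that setting.
    """
    return {
        "verb": "say",
        "text": text,
        "synthesizer": {
            "vendor": vendor,
            "language": language,
            "voice": voice,
        },
    }


if __name__ == "__main__":
    # Override vendor and language only; the voice stays at the application default.
    verb = build_say_verb("Hello, thanks for calling.", vendor="google", language="en-US")
    print(json.dumps([verb], indent=2))
```

Leaving a field at ‘default’ keeps the application setting for that field, so a payload only needs to spell out the values it actually wants to change.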
Vendor-specific options
cartesia
embedding or id (see Cartesia docs)
a voice embedding (see Cartesia docs)
specifies emotion (see Cartesia docs)
A number or named specifier (e.g. “slow”) (see Cartesia docs)
elevenlabs
Defines the stability for voice settings (see Elevenlabs docs)
Defines the similarity boost for voice settings. (see Elevenlabs docs)
Defines the use speaker boost for voice settings. This parameter is available on V2+ models (see Elevenlabs docs)
Defines the style for voice settings. This parameter is available on V2+ models. (see Elevenlabs docs)
Identifier of the model that will be used (see Elevenlabs docs)
playht
The voice engine used to synthesize the voice. (see Playht docs)
One of draft, low, medium, high, or premium (see Playht docs)
An integer number greater than or equal to 0. If equal to null or not provided, a random seed will be used. Useful to control the reproducibility of the generated audio. Assuming all other properties didn’t change, a fixed seed should always generate the exact same audio file (see Playht docs)
A floating point number between 0, inclusive, and 2, inclusive. If equal to null or not provided, the model’s default temperature will be used. The temperature parameter controls variance. Lower temperatures result in more predictable results, higher temperatures allow each run to vary more, so the voice may sound less like the baseline voice. (see Playht docs)
An emotion to be applied to the speech. Only supported when voice_engine is set to Play3.0-mini, PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine. (see Playht docs)
A number between 1 and 6. Use lower numbers to reduce how unique your chosen voice will be compared to other voices. Higher numbers will maximize its individuality. Only supported when voice_engine is set to Play3.0-mini, PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine. (see Playht docs)
A number between 1 and 30. Use lower numbers to reduce how strong your chosen emotion will be. Higher numbers will create a very emotional performance. Only supported when voice_engine is set to Play3.0-mini, PlayHT2.0 or PlayHT2.0-turbo, and voice uses that engine. (see Playht docs)
A number between 1 and 2. This number influences how closely the generated speech adheres to the input text. Use lower values to create more fluid speech, but with a higher chance of deviating from the input text. Higher numbers will make the generated speech more accurate to the input text, ensuring that the words spoken align closely with the provided text. Only supported when voice_engine is set to Play3.0-mini or PlayHT2.0, and voice uses that engine. (see Playht docs)
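The numeric ranges and engine restrictions listed above for playht can be checked before a request is sent. The sketch below is one possible way to do that in Python. Only `voice_engine` is named in the text; every other option name used here (`seed`, `temperature`, `emotion`, `voice_guidance`, `style_guidance`, `text_guidance`) is a hypothetical placeholder for the corresponding description and may not match the real option names.

```python
# Ranges and engine lists come from the descriptions above.
# Option names other than voice_engine are hypothetical placeholders.
RANGES = {
    "temperature": (0, 2),
    "voice_guidance": (1, 6),
    "style_guidance": (1, 30),
    "text_guidance": (1, 2),
}

ENGINE_SUPPORT = {
    "emotion": {"Play3.0-mini", "PlayHT2.0", "PlayHT2.0-turbo"},
    "voice_guidance": {"Play3.0-mini", "PlayHT2.0", "PlayHT2.0-turbo"},
    "style_guidance": {"Play3.0-mini", "PlayHT2.0", "PlayHT2.0-turbo"},
    "text_guidance": {"Play3.0-mini", "PlayHT2.0"},
}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_playht_options(options):
    """Return a list of problems found in the options; an empty list means OK."""
    problems = []

    seed = options.get("seed")
    if seed is not None and not (isinstance(seed, int) and not isinstance(seed, bool) and seed >= 0):
        problems.append("seed must be an integer >= 0, or omitted for a random seed")

    for name, (low, high) in RANGES.items():
        value = options.get(name)
        if value is None:
            continue  # not provided: nothing to check
        if not _is_number(value) or not low <= value <= high:
            problems.append(f"{name} must be a number between {low} and {high}")

    engine = options.get("voice_engine")
    for name, engines in ENGINE_SUPPORT.items():
        if options.get(name) is not None and engine not in engines:
            problems.append(f"{name} is not supported when voice_engine is {engine!r}")

    return problems
```

For example, `validate_playht_options({"voice_engine": "PlayHT2.0-turbo", "text_guidance": 1.5})` reports one problem, because the text-adherence setting is described as supported only on Play3.0-mini and PlayHT2.0.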
rimelabs
When set to true, adds pauses between words enclosed in angle brackets. The number inside the brackets specifies the pause duration in milliseconds. Example: “Hi. <200> I’d love to have a conversation with you.” adds a 200ms pause between the first and second sentences. (see Rimelabs docs)
When set to true, you can specify the phonemes for a word enclosed in curly brackets. (see Rimelabs docs)
Comma-separated list of speed values applied to words in square brackets. Values < 1.0 speed up speech, > 1.0 slow it down. Example: “This sentence is [really] [fast]” with inlineSpeedAlpha “0.5, 3” will make “really” fast and “fast” slow. (see Rimelabs docs)
Adjusts the speed of speech. Lower than 1.0 is faster than default. Higher than 1.0 is slower than default. (see Rimelabs docs)
Reduces the latency of response, at the cost of some possible mispronunciation of digits and abbreviations. (see Rimelabs docs)
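The inline markup described above for rimelabs (angle brackets for pauses, square brackets for per-word speed) can be inspected locally before the text is sent. The Python sketch below is only an illustration of how the markup lines up with the values. The helper names are hypothetical, and the rule that there is exactly one speed value per bracketed word is an assumption rather than something the text states.

```python
import re


def find_pauses_ms(text):
    """Return the pause durations, in milliseconds, written as <N> in the text."""
    return [int(ms) for ms in re.findall(r"<(\d+)>", text)]


def pair_inline_speeds(text, inline_speed_alpha):
    """Pair each [bracketed] word with its speed value, in order of appearance.

    Assumes one comma-separated value per bracketed word (an assumption,
    not a documented rule).
    """
    words = re.findall(r"\[([^\]]+)\]", text)
    speeds = [float(part) for part in inline_speed_alpha.split(",")]
    if len(words) != len(speeds):
        raise ValueError("expected one speed value per bracketed word")
    return list(zip(words, speeds))


def describe_speed(value):
    """Apply the stated rule: below 1.0 is faster, above 1.0 is slower."""
    if value < 1.0:
        return "faster"
    if value > 1.0:
        return "slower"
    return "default"


if __name__ == "__main__":
    print(find_pauses_ms("Hi. <200> I'd love to have a conversation with you."))
    for word, speed in pair_inline_speeds("This sentence is [really] [fast]", "0.5, 3"):
        print(word, speed, describe_speed(speed))
```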
verbio
The engine version to use. (see Verbio docs)
whisper
TTS model to use. (see Whisper docs)