Using OpenAI STT
Taking advantage of OpenAI’s prompt feature with jambonz
jambonz supports a wide range of speech recognition vendors, and when we add support for new speech vendor we try to support and expose all of their options so that you can fully utilize their capabilities.
OpenAI is rather unique in that it supports a prompt feature that allows you to pass in a custom prompt to help guide the recognizer.
This is something we have been asking STT vendors for a while.
In this article we explore the different ways to exploit the prompt feature of OpenAI STT.
To begin with, here are the possible options that you use with OpenAI STT:
In this article we want to explore the various ways to construct a prompt for OpenAI STT.
Providing hints
To start with the simplest method, if you provide hints and you are using ‘whisper-1’ as the model, then the hints will simply be used as the prompt.
The reason for this is that the ‘whisper-1’ model supports a limited number of tokens in the prompt so it is recommended to simply use the hints as the prompt. Note that this is the default behavior, but if you specify either ‘prompt’ or ‘promptTemplates’ then the prompt will be generated from those settings and not the hints.
Using the prompt setting
You can also simply provide the prompt using the prompt setting.
Using promptTemplates
A further option is to use the promptTemplates options. These give you the ability to provide a template that is interpolated with either the hints or the conversation history to create a final prompt.
or, using conversation history:
By default, the conversation history is limited to the last 4 turns you can adjust this as well.
Note that you can provide both hintsTemplate and conversationHistoryTemplate and the final prompt will concatenate the two interpolated strings.