Groq
Configure jambonz to use Llama and Gemma models on Groq’s LPU hardware for sub-100 ms time to first token (TTFT).
Groq runs open-weight Llama 3.x and Gemma models on custom LPU silicon and delivers roughly 5-10× the tokens/sec of GPU-backed providers. For real-time voice agents, that speed is the main reason to choose Groq: your agent feels noticeably more responsive than the same prompt on gpt-4o.
The tradeoff: Groq’s catalog is open-weight Llama / Gemma. Capable, but not GPT-5 / Claude-tier on hard reasoning. Pick Groq for “look up the order, read it back” voice agents; pick Anthropic / OpenAI / Bedrock for complex reasoning.
Get credentials
- Sign in at https://console.groq.com.
- Click API Keys in the sidebar (direct link: https://console.groq.com/keys).
- Click Create API Key, name it, and copy the gsk_... string.
Groq shows the key only once, at creation. Copy it before navigating away.
Configure in jambonz
In the portal: Account → LLM Services → + Add LLM Service → Groq.
API key: The gsk_... key from Groq’s console.
Base URL: Defaults to https://api.groq.com/openai/v1, Groq’s production endpoint. Override it only when routing through a proxy.
Click Test to verify.
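If you want to check a key outside the portal, the sketch below shows one way to do it. It assumes the base URL above exposes an OpenAI-compatible `GET /models` route that accepts a bearer token; it is not what the portal's Test button does internally. The function names and the `User-Agent` string are illustrative.

```python
import urllib.error
import urllib.request

DEFAULT_BASE_URL = "https://api.groq.com/openai/v1"


def build_models_request(api_key, base_url=DEFAULT_BASE_URL):
    """Build (but do not send) a GET request for the model list."""
    url = base_url.rstrip("/") + "/models"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumption: some proxies reject the default urllib User-Agent,
        # so an explicit one is sent. The value itself is arbitrary.
        "User-Agent": "groq-key-check/0.1",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


def check_key(api_key, base_url=DEFAULT_BASE_URL, timeout=10):
    """Return (ok, detail); ok is True when the endpoint answers HTTP 200."""
    req = build_models_request(api_key, base_url)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status == 200, f"HTTP {resp.status}"
    except urllib.error.HTTPError as err:
        # A 401 here usually means the key is mistyped or revoked.
        return False, f"HTTP {err.code}"
    except urllib.error.URLError as err:
        return False, f"network error: {err.reason}"
```

Calling `check_key("gsk_...")` with a real key performs the network request; `build_models_request` alone is side-effect free and useful for inspecting what would be sent.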
Use in an agent verb
Available Models
See Groq’s supported models page for the full live catalog (it churns frequently — preview models come and go). Common picks for voice agents:
Quirks & errors
Tool-calling reliability scales with model size. llama-3.3-70b-versatile handles multi-step tool calls cleanly. llama-3.1-8b-instant is faster and cheaper but occasionally fails to invoke tools when it should, or invokes them with malformed arguments. Test your specific tool definitions on both to find the right tradeoff.
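When comparing models, it can help to count how often a tool call arrives in a usable state. The sketch below is one way to check a single call, assuming OpenAI-style tool definitions and tool-call objects in which `function.arguments` is a JSON string. It only checks for a known tool name, parseable JSON, and required keys; it is not a full JSON Schema validator. The `lookup_order` tool in the usage note is hypothetical.

```python
import json


def validate_tool_call(tool_call, tools):
    """Return (args, error). error is None when the call looks usable."""
    fn = tool_call.get("function", {})
    name = fn.get("name")
    # Map each declared tool name to its parameter schema.
    schemas = {
        t["function"]["name"]: t["function"].get("parameters", {})
        for t in tools
    }
    if name not in schemas:
        return None, f"unknown tool: {name!r}"
    try:
        args = json.loads(fn.get("arguments") or "{}")
    except json.JSONDecodeError as err:
        return None, f"arguments are not valid JSON: {err.msg}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"
    missing = [k for k in schemas[name].get("required", []) if k not in args]
    if missing:
        return None, f"missing required arguments: {missing}"
    return args, None
```

For example, with a hypothetical `lookup_order` tool that requires `order_id`, a call whose arguments are `{"order_id": "A123"}` returns the parsed dict and no error, while truncated JSON or an empty object returns an error string you can log per model.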
Groq’s catalog churns frequently. Preview models come and go on a monthly cadence. The jambonz manifest ships a small curated set of stable models; for the full live catalog, see Groq’s models page or call listAvailableModels() programmatically.
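Because models come and go, a deployment check that confirms your configured model IDs still exist can catch a removal before a call does. The sketch below assumes the model list is returned in the OpenAI-compatible shape (`{"object": "list", "data": [{"id": ...}, ...]}`) and works on the raw response body, however you fetched it. The helper names are illustrative and unrelated to `listAvailableModels()`.

```python
import json


def model_ids(models_response_body):
    """Extract sorted model IDs from an OpenAI-style model list body."""
    payload = json.loads(models_response_body)
    return sorted(
        item["id"] for item in payload.get("data", []) if "id" in item
    )


def missing_models(wanted, models_response_body):
    """Return the wanted model IDs that the catalog no longer lists."""
    available = set(model_ids(models_response_body))
    return [m for m in wanted if m not in available]
```

An empty list from `missing_models` means every model you rely on is still in the catalog; anything else is a candidate for a configuration update.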
Rate limits are strict on the free tier. Groq enforces tight per-minute and per-day limits if you haven’t added a payment method. For production traffic, upgrade at console.groq.com/settings/billing.
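If you call Groq directly during testing, a small backoff helper keeps free-tier limits from turning into hard failures. The sketch below assumes rate-limited requests return HTTP 429 and may include a `Retry-After` header expressed in seconds; the defaults for `base` and `cap` are arbitrary choices, not Groq recommendations.

```python
import random


def retry_delay(attempt, retry_after=None, base=0.5, cap=30.0,
                jitter=random.random):
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            # Prefer the server's hint, clamped to [0, cap].
            return min(cap, max(0.0, float(retry_after)))
        except ValueError:
            # Not a plain number (e.g. an HTTP date): fall back to backoff.
            pass
    backoff = min(cap, base * (2 ** attempt))
    # Jitter spreads retries over [backoff / 2, backoff].
    return backoff * (0.5 + 0.5 * jitter())
```

On a live call a caller will not wait tens of seconds, so for in-call requests a much lower `cap` and a single retry are probably more appropriate than the defaults shown here.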
401 invalid_api_key — typo or revoked key. Regenerate at console.groq.com/keys.