Agents
An agent is the core unit in VOCALS. It defines how an AI voice assistant behaves on a call -- what it says, how it sounds, and which providers power it. Each phone number is assigned to one agent.
Creating an Agent
- Navigate to Agents in the dashboard.
- Click Create Agent.
- Give the agent a descriptive name (e.g., "Inbound Sales - English" or "Support Tier 1").
- Configure the settings described below.
- Click Save.
Agent Settings
System Prompt
The system prompt defines the agent's personality, instructions, and constraints. It is the most important setting, because it shapes how the agent behaves throughout the conversation.
Example:
You are a friendly customer support agent for Acme Corp. Your name is Sarah.
Your responsibilities:
- Answer questions about our products and pricing
- Help customers troubleshoot common issues
- Escalate complex technical problems to a human agent
Rules:
- Never discuss competitor products
- Always confirm the customer's name before making account changes
- Keep responses concise -- aim for 1-2 sentences per turn
Best practices for system prompts:
- Be specific about response length. Phone conversations need short replies. Instruct the agent to respond in 1-2 sentences unless the caller asks for detail.
- Define a persona. Give the agent a name, tone, and personality. Callers engage better with a consistent character.
- Set boundaries. Explicitly state what the agent should and should not discuss. List topics to escalate to a human.
- Include example phrases. If you want the agent to use specific greetings, confirmations, or closings, include them in the prompt.
- Handle edge cases. Tell the agent what to do when it does not know the answer, when the caller is upset, or when the conversation goes off-topic.
- Keep it structured. Use sections with headers, bullet points, and numbered steps. LLMs follow structured prompts more reliably.
Avoid overly long system prompts. Each token in the system prompt adds latency and cost to every LLM call. Aim for 200-500 words. If you need more, consider whether some information could be retrieved dynamically via function calling instead.
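The 200-500 word guideline can be checked before saving a prompt. The sketch below is a minimal illustration, not part of VOCALS: the function name `check_prompt_length` and its return labels are hypothetical. It counts whitespace-separated words, which is only a rough proxy for the token count that actually drives latency and cost.

```python
def check_prompt_length(prompt: str, min_words: int = 200, max_words: int = 500) -> str:
    """Classify a system prompt against the 200-500 word guideline.

    Hypothetical helper: returns "short", "ok", or "long".
    """
    # Whitespace-separated words; an approximation of LLM tokens, not an exact count.
    word_count = len(prompt.split())
    if word_count < min_words:
        return "short"
    if word_count > max_words:
        return "long"
    return "ok"
```

A "long" result is a cue to consider moving reference material out of the prompt and into function calling, as suggested above.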
Welcome Message
The first thing the agent says when a call connects. This plays as TTS audio before the agent starts listening.
Examples:
"Hello, thank you for calling Acme Corp. How can I help you today?"
"Hi there! This is Sarah from Acme support. What can I do for you?"
Leave this blank if you want the agent to wait for the caller to speak first (useful for outbound calls where the callee answers with "Hello?").
Language
The primary language for the conversation. This setting is passed to the STT provider to improve transcription accuracy. Set this to match the language your callers will speak.
Common values: en-US, en-GB, es-ES, es-MX, pt-BR, fr-FR, de-DE, zh-CN, ja-JP.
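All of the common values above share a language-region shape (two lowercase letters, a hyphen, two uppercase letters). The sketch below checks a value against that shape before it is saved. The helper name is hypothetical, and whether VOCALS or a given STT provider also accepts other forms (such as a bare `en`) is not stated here, so treat the pattern as an assumption.

```python
import re

# Assumed shape, taken from the examples above: "en-US", "pt-BR", "zh-CN", ...
LANGUAGE_REGION_TAG = re.compile(r"[a-z]{2}-[A-Z]{2}")


def is_language_region_tag(value: str) -> bool:
    """Return True if value looks like a language-region tag such as "en-US"."""
    return LANGUAGE_REGION_TAG.fullmatch(value) is not None
```

This only validates the format. It does not confirm that the selected STT provider supports the language.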
Barge-in Sensitivity
Controls how easily the caller can interrupt the agent while it is speaking. Barge-in (also called "interruption") stops the current TTS playback and processes what the caller said.
| Level | Behavior |
|---|---|
| Low | Caller must speak loudly or for a longer duration to interrupt. Reduces false triggers from background noise. |
| Medium | Balanced setting. Works well for most environments. |
| High | Agent stops speaking at the first sign of caller speech. Best for fast-paced conversations. |
If you notice the agent getting interrupted by background noise, lower the barge-in sensitivity. If callers complain the agent talks over them, raise it.
Interruptible
A boolean toggle that enables or disables barge-in entirely.
- Enabled (default): The caller can interrupt the agent mid-sentence.
- Disabled: The agent always finishes its current response before listening. Use this for prompts where the full message must be heard (e.g., legal disclaimers, compliance statements).
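To make the interaction between Barge-in Sensitivity and Interruptible concrete, here is a minimal sketch of the decision logic. It is not the VOCALS implementation: the millisecond thresholds, the `BargeInSettings` class, and `should_interrupt` are all hypothetical, and the sketch models sensitivity by speech duration only (not loudness).

```python
from dataclasses import dataclass

# Hypothetical minimum caller-speech durations (ms) per level; illustrative values only.
MIN_SPEECH_MS = {"low": 600, "medium": 300, "high": 100}


@dataclass(frozen=True)
class BargeInSettings:
    sensitivity: str = "medium"  # "low", "medium", or "high"
    interruptible: bool = True   # enabled by default, as described above


def should_interrupt(settings: BargeInSettings, agent_is_speaking: bool,
                     caller_speech_ms: int) -> bool:
    """Decide whether detected caller speech should stop TTS playback."""
    # With Interruptible disabled, the agent always finishes its response.
    if not agent_is_speaking or not settings.interruptible:
        return False
    # Lower sensitivity requires longer caller speech before interrupting.
    return caller_speech_ms >= MIN_SPEECH_MS[settings.sensitivity]
```

Under these assumed thresholds, 150 ms of caller speech interrupts an agent set to High but not one set to Low, and never interrupts an agent with Interruptible disabled.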
Max Call Duration
The maximum length of a call in seconds. When the limit is reached, the agent will say a configurable closing message and hang up.
- Default: 600 (10 minutes)
- Set lower for simple use cases (appointment confirmations, surveys) to control costs.
- Set higher for complex support calls that may need extended conversation.
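The duration limit amounts to a simple elapsed-time check. The sketch below assumes the value is in seconds with the default of 600. The function names are hypothetical and only illustrate the arithmetic.

```python
def seconds_remaining(elapsed_seconds: float, max_call_duration: int = 600) -> float:
    """Seconds left before the limit; never negative."""
    return max(0.0, max_call_duration - elapsed_seconds)


def limit_reached(elapsed_seconds: float, max_call_duration: int = 600) -> bool:
    """True once the call has run for the configured maximum."""
    return elapsed_seconds >= max_call_duration
```

For example, a survey agent capped at 180 seconds has 60 seconds remaining two minutes into a call.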
Silence Threshold
How many seconds of silence before the agent prompts the caller or ends the call.
- Prompt threshold: After this many seconds of silence, the agent says something like "Are you still there?" (configurable in the system prompt).
- Hangup threshold: After extended silence with no response to the prompt, the call ends.
The default silence threshold is 5 seconds. Adjust it based on your use case -- some callers need time to look up information, while others expect rapid back-and-forth.
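The two thresholds can be read as a small state decision: wait, prompt, or hang up. The sketch below is an assumption-laden illustration. The 5-second prompt threshold comes from the default above, while the 15-second hangup value, the `SilenceAction` enum, and the `silence_action` function are hypothetical. Silence is assumed to be measured from the last time the caller spoke.

```python
from enum import Enum


class SilenceAction(Enum):
    WAIT = "wait"
    PROMPT = "prompt"    # e.g. "Are you still there?"
    HANG_UP = "hang_up"


def silence_action(silent_seconds: float, already_prompted: bool = False,
                   prompt_threshold: float = 5.0,
                   hangup_threshold: float = 15.0) -> SilenceAction:
    """Pick the next action given how long the caller has been silent."""
    # Hang up only after the caller has been prompted and stayed silent.
    if already_prompted and silent_seconds >= hangup_threshold:
        return SilenceAction.HANG_UP
    if not already_prompted and silent_seconds >= prompt_threshold:
        return SilenceAction.PROMPT
    return SilenceAction.WAIT
```

Raising `prompt_threshold` suits callers who need time to look something up; lowering it suits rapid back-and-forth.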
Assigning Providers
Each agent needs one provider for each pipeline stage:
- In the agent configuration, find the Providers section.
- Select an STT Provider from your configured providers.
- Select an LLM Provider.
- Select a TTS Provider.
You can assign different providers to different agents. For example, your English sales agent might use Deepgram + GPT-4o + ElevenLabs, while your Chinese support agent uses Qwen + Kimi + ElevenLabs.
Changing providers on a live agent takes effect on the next call. Active calls continue using the providers that were configured when the call started.
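One way to picture why active calls are unaffected is that each call captures a snapshot of the agent's providers when it starts. The sketch below models that behavior with an immutable dataclass. The `Providers`, `Agent`, and `Call` classes are hypothetical and say nothing about how VOCALS stores this internally.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Providers:
    stt: str
    llm: str
    tts: str


class Agent:
    def __init__(self, name: str, providers: Providers) -> None:
        self.name = name
        self.providers = providers


class Call:
    def __init__(self, agent: Agent) -> None:
        # Snapshot taken at call start; later agent edits do not touch it.
        self.providers = agent.providers


agent = Agent("Inbound Sales - English", Providers("Deepgram", "GPT-4o", "ElevenLabs"))
active_call = Call(agent)

# Editing the live agent swaps in a new Providers object for future calls only.
agent.providers = replace(agent.providers, llm="Kimi")
next_call = Call(agent)
```

Here `active_call` still uses GPT-4o, while `next_call` picks up the new LLM provider.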
Cloning an Agent
To create a variant of an existing agent:
- Open the agent you want to clone.
- Click Clone Agent.
- Modify the settings as needed (e.g., change the language, swap the TTS voice, adjust the system prompt).
- Save with a new name.
This is useful when you need similar agents for different languages or departments.
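Conceptually, cloning copies every setting and then overrides a few. The sketch below shows that idea with a hypothetical `AgentConfig` dataclass and `clone_agent` helper. The field names are assumptions chosen to mirror the settings described above, not a VOCALS schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentConfig:
    name: str
    language: str
    system_prompt: str
    welcome_message: str = ""


def clone_agent(source: AgentConfig, new_name: str, **overrides) -> AgentConfig:
    """Copy all settings from source, then apply the new name and any overrides."""
    return replace(source, name=new_name, **overrides)


base = AgentConfig(
    name="Support Tier 1",
    language="en-US",
    system_prompt="You are a friendly customer support agent for Acme Corp.",
)
spanish = clone_agent(base, "Support Tier 1 - Spanish", language="es-MX")
```

The clone keeps the original system prompt until you edit it, and the source agent is left unchanged.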
Testing an Agent
Before assigning an agent to a production phone number, test it:
- Open the agent configuration.
- Click Test Call.
- Use the web-based dialer to have a conversation with the agent.
- Review the call transcript and audio playback in the call log.
This lets you iterate on the system prompt and provider settings without consuming Twilio minutes.