Agents
Voice agent
A real-time conversational agent that speaks and listens.
Voice agents chain speech-to-text (Whisper, Deepgram), an LLM, and text-to-speech (ElevenLabs, Cartesia, OpenAI's text-to-speech API) into a real-time loop. The hard parts are latency (under 500 ms end to end is the usual bar), interruption handling (the user starts talking while the agent is still speaking, so playback has to stop right away), and turn detection (deciding when the user has actually finished speaking). Frameworks like LiveKit Agents, Pipecat, and Vapi handle the orchestration; the choice of LLM and the prompt design are still your job.
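To make the latency and interruption points concrete, here is a minimal sketch in plain Python asyncio. It does not use the API of LiveKit Agents, Pipecat, or Vapi; every name in it (LATENCY_BUDGET_MS, within_budget, speak, respond_with_barge_in, demo) is hypothetical, the per-stage millisecond figures are illustrative rather than measured, and TTS playback is faked with a sleep.

```python
import asyncio
import contextlib

# Hypothetical per-stage latency budget in milliseconds.
# Illustrative numbers only, not measurements of any real provider.
LATENCY_BUDGET_MS = {
    "turn_detection": 150,
    "stt": 100,
    "llm_first_token": 150,
    "tts_first_audio": 80,
}


def within_budget(stages_ms, limit_ms=500):
    """Return True if the summed stage latencies stay under the end-to-end limit."""
    return sum(stages_ms.values()) < limit_ms


async def speak(sentences, spoken, seconds_per_sentence=0.1):
    """Stand-in for TTS playback: 'plays' one sentence at a time."""
    for sentence in sentences:
        await asyncio.sleep(seconds_per_sentence)
        spoken.append(sentence)


async def respond_with_barge_in(sentences, user_speech_started, spoken):
    """Play a response, cancelling playback as soon as the user starts talking."""
    playback = asyncio.create_task(speak(sentences, spoken))
    barge_in = asyncio.create_task(user_speech_started.wait())
    done, _ = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )
    if barge_in in done and not playback.done():
        # The user interrupted: stop speaking instead of finishing the response.
        playback.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await playback
        return "interrupted"
    # Playback finished first: stop waiting for an interruption.
    barge_in.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await barge_in
    return "completed"


async def demo(interrupt_after=None):
    """Run one agent response; optionally simulate the user speaking after N seconds."""
    spoken = []
    user_speech_started = asyncio.Event()
    if interrupt_after is not None:
        loop = asyncio.get_running_loop()
        loop.call_later(interrupt_after, user_speech_started.set)
    outcome = await respond_with_barge_in(
        ["One.", "Two.", "Three."], user_speech_started, spoken
    )
    return outcome, spoken


if __name__ == "__main__":
    print(within_budget(LATENCY_BUDGET_MS))
    print(asyncio.run(demo()))
    print(asyncio.run(demo(interrupt_after=0.15)))
```

In a real agent the event would be set by a voice activity or turn detection component, and cancelling playback would usually also mean cancelling the in-flight LLM and TTS requests and trimming the conversation history to what the user actually heard.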