Agents

Voice agent

A real-time conversational agent that speaks and listens.

Voice agents chain speech-to-text (Whisper, Deepgram), an LLM, and text-to-speech (ElevenLabs, Cartesia, OpenAI TTS) into a real-time loop. The hard parts are latency (under 500 ms end-to-end is the bar), interruption handling (the user starts talking mid-response, so playback has to stop), and turn detection (deciding when the user has finished speaking). Frameworks like LiveKit Agents, Pipecat, and Vapi handle the orchestration; the LLM and prompt design are still your job.
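To make turn detection and interruption handling concrete, here is a minimal sketch in plain Python with no real audio or vendor SDKs. Everything in it is an assumption for illustration: `end_of_turn`, `speak`, `respond`, and `demo` are hypothetical names, the 20 ms frame size and 600 ms silence threshold are arbitrary, and the per-frame speech flags stand in for the output of a real voice activity detector. Frameworks such as LiveKit Agents or Pipecat expose their own, different APIs for this.

```python
import asyncio
from typing import Iterable, List, Optional, Tuple


def end_of_turn(frames: Iterable[bool], frame_ms: int = 20,
                silence_ms: int = 600) -> Optional[int]:
    """Silence-based turn detection (assumed heuristic, not a framework API).

    `frames` holds one speech/no-speech flag per audio frame. Returns the index
    of the frame at which the user is considered done, or None if not yet.
    """
    needed = silence_ms // frame_ms  # consecutive silent frames required
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:  # leading silence before any speech is ignored
            silent_run += 1
            if silent_run >= needed:
                return i
    return None


async def speak(chunks: List[str], played: List[str]) -> None:
    """Stand-in for TTS playback: 'plays' one chunk every 50 ms."""
    for chunk in chunks:
        played.append(chunk)
        await asyncio.sleep(0.05)


async def respond(chunks: List[str],
                  user_speaking: asyncio.Event) -> Tuple[List[str], bool]:
    """Play a response, but stop as soon as the user starts talking."""
    played: List[str] = []
    playback = asyncio.create_task(speak(chunks, played))
    barge_in = asyncio.create_task(user_speaking.wait())
    done, pending = await asyncio.wait(
        {playback, barge_in}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # cancels playback on barge-in, or the waiter otherwise
    await asyncio.gather(*pending, return_exceptions=True)
    interrupted = barge_in in done and playback not in done
    return played, interrupted


async def demo() -> Tuple[List[str], bool]:
    """Simulate a user who starts talking 120 ms into a 500 ms response."""
    user_speaking = asyncio.Event()
    asyncio.get_running_loop().call_later(0.12, user_speaking.set)
    chunks = [f"chunk-{i}" for i in range(10)]
    return await respond(chunks, user_speaking)


if __name__ == "__main__":
    played, interrupted = asyncio.run(demo())
    print(f"played {len(played)} of 10 chunks, interrupted={interrupted}")
```

In a real agent the same cancellation would also need to stop the in-flight LLM and TTS requests, and the silence threshold trades responsiveness against cutting the user off mid-thought.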

