LiveKit Agents
Open-source framework for building realtime voice, video, and multimodal AI agents with LiveKit rooms, STT, LLMs, TTS, job scheduling, telephony, MCP tools, testing, and production deployment paths.
Open the source and read safety notes before installing.
Safety notes
- LiveKit Agents can join realtime rooms, hear users, speak back, call tools, exchange data with clients, and connect to telephony flows; treat it as production user-facing infrastructure.
- Telephony integrations can place or receive phone calls through LiveKit's SIP stack. Confirm consent, caller identity, recording rules, transfer behavior, emergency limitations, and local telecom requirements before enabling calling workflows.
- MCP support can attach external tools to a voice agent with little code, so restrict MCP servers, credentials, and tool scopes before allowing account, database, filesystem, browser, or infrastructure actions.
- Semantic turn detection, interruption handling, and realtime models can improve conversation quality, but they do not guarantee that an agent understood intent or handled sensitive situations correctly.
- Use the built-in tests, judges, staging rooms, rate limits, human escalation paths, and rollback plans before routing real customers or employees to a deployed agent.
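The MCP note above recommends restricting servers and tool scopes. One way to sketch that idea is an explicit allowlist applied before any tool reaches the agent. The following is a standard-library illustration only: `ToolSpec`, `ALLOWED_TOOLS`, `filter_tools`, and the server and tool names are hypothetical and are not part of the LiveKit or MCP APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one tool offered by an MCP server."""
    server: str
    name: str


# Hypothetical allowlist: only these (server, tool) pairs may reach the agent.
ALLOWED_TOOLS = {
    ("docs", "search_docs"),
    ("crm", "lookup_account"),
}


def filter_tools(offered: list[ToolSpec]) -> tuple[list[ToolSpec], list[ToolSpec]]:
    """Split offered tools into (allowed, rejected) using the allowlist."""
    allowed = [t for t in offered if (t.server, t.name) in ALLOWED_TOOLS]
    rejected = [t for t in offered if (t.server, t.name) not in ALLOWED_TOOLS]
    return allowed, rejected


offered = [
    ToolSpec("docs", "search_docs"),
    ToolSpec("crm", "lookup_account"),
    ToolSpec("crm", "delete_account"),  # high-impact action, not on the allowlist
    ToolSpec("shell", "run_command"),   # unknown server, rejected by default
]
allowed, rejected = filter_tools(offered)
```

Rejecting by default means a newly added server or tool stays unavailable until someone reviews it and adds it to the allowlist.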
Privacy notes
- Realtime audio, video, transcripts, chat messages, room metadata, participant identities, SIP call details, tool inputs, tool outputs, and generated replies may pass through LiveKit, model providers, plugin providers, MCP servers, and your own logs.
- Provider plugins for STT, LLM, TTS, realtime APIs, avatars, and telephony may send user data to separate third-party services with their own retention and privacy terms.
- Do not expose LIVEKIT_API_SECRET, provider keys, SIP credentials, room tokens, call recordings, or generated transcripts in prompts, public issues, committed configs, screenshots, or client bundles.
- If recording, transcribing, or storing conversations, define retention, deletion, access review, and user notification rules before launch.
- Use synthetic calls, demo rooms, and test identities when validating prompts, tools, provider plugins, and evals.
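To illustrate the point about keeping secrets out of logs and issues, here is a minimal sketch that masks known secret values before a message is written anywhere. It uses only the standard library; the `redact` helper and the `SECRET_ENV_VARS` tuple are hypothetical names, and a real deployment would likely cover provider keys and SIP credentials as well.

```python
import os
from typing import Mapping, Optional

# Variables whose values should never appear in logs (extend as needed).
SECRET_ENV_VARS = ("LIVEKIT_API_SECRET", "LIVEKIT_API_KEY")


def redact(text: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace any known secret value found in text with a placeholder."""
    env = os.environ if env is None else env
    for name in SECRET_ENV_VARS:
        value = env.get(name)
        if value:  # skip unset or empty variables
            text = text.replace(value, f"[REDACTED {name}]")
    return text


# Demo with fake values; never use real credentials in examples.
fake_env = {"LIVEKIT_API_SECRET": "s3cr3t-value", "LIVEKIT_API_KEY": "APIabc123"}
safe_line = redact("connect key=APIabc123 secret=s3cr3t-value", fake_env)
```

This only catches exact matches of configured values, so it complements access controls and log review rather than replacing them.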
Prerequisites
- Python 3.10 or newer and an isolated project environment.
- LiveKit Cloud project or self-hosted LiveKit server with LIVEKIT_URL, LIVEKIT_API_KEY, and LIVEKIT_API_SECRET.
- Provider credentials for selected STT, LLM, TTS, realtime, avatar, or telephony integrations.
- Microphone, browser, mobile, SIP, or room client path for connecting end users to the agent.
- Consent, logging, retention, and escalation policy for realtime audio, video, transcripts, recordings, and phone calls.
Schema details
- Install type: cli
- Troubleshooting: No
- Scope: Source repo
- Estimated setup: 20 minutes
- Difficulty: intermediate
- Pricing: freemium
- Disclosure: editorial
- Application category: DeveloperApplication
- Operating system: Web
Full copyable content
pip install "livekit-agents[openai,deepgram,cartesia]"
About this resource
Overview
LiveKit Agents is an open-source framework for building realtime AI participants that run on servers and join LiveKit rooms. It is designed for voice, video, and multimodal agents that can listen, speak, see, call tools, exchange data with clients, and operate inside WebRTC or telephony sessions.
It is a strong fit when a Claude-adjacent team is building a realtime agent product rather than a chat-only workflow, where voice and phone support, client SDKs, media transport, job dispatch, testing, and deployment all matter at once.
Core Capabilities
| Area | LiveKit Agents Coverage |
|---|---|
| Realtime sessions | AgentSession, room connection, voice pipelines, interruption handling, and session lifecycle |
| Agent runtime | Agent, AgentServer, entrypoints, job scheduling, dispatch APIs, and production start modes |
| Models | STT, LLM, TTS, realtime model, VAD, and turn-detection integrations through LiveKit inference or provider plugins |
| Clients | WebRTC clients and LiveKit SDKs for browser, mobile, and other frontend surfaces |
| Telephony | SIP-based call flows for agents that can receive or place phone calls |
| Data exchange | RPC and data APIs for client-agent communication beyond voice |
| MCP | Native MCP support for connecting tools from MCP servers to agents |
| Testing | Native test framework, unit categories, behavioral evals, provider tests, and judge-based assertions |
Quick Start
Install the core package with the provider plugins you plan to use:
pip install "livekit-agents[openai,deepgram,cartesia]"
At minimum, a LiveKit-backed agent needs the LiveKit project/server connection:
export LIVEKIT_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="..."
export LIVEKIT_API_SECRET="..."
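Before starting an agent, it can help to fail fast when one of these variables is missing. The sketch below uses only the standard library; `missing_livekit_env` and `require_livekit_env` are hypothetical helper names, not part of the LiveKit SDK.

```python
import os
from typing import Mapping, Optional

# The three connection variables named in the prerequisites.
REQUIRED_VARS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET")


def missing_livekit_env(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def require_livekit_env(env: Optional[Mapping[str, str]] = None) -> None:
    """Raise a clear error listing every missing variable at once."""
    missing = missing_livekit_env(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_livekit_env()` at the top of an agent script reports all missing names in one error instead of failing later during connection.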
The repository documents local and production-oriented run modes:
python myagent.py console
python myagent.py dev
python myagent.py start
python myagent.py connect --room <room> --identity <id>
For AI coding agents, the upstream README recommends pairing:
- LiveKit Docs MCP server for current LiveKit docs and code examples.
- LiveKit Agent Skill for architecture guidance, handoffs, tasks, and testing patterns.
npx skills add livekit/agent-skills --skill livekit-agents
Example Shape
A minimal voice agent creates an AgentServer, starts an AgentSession, wires
STT, LLM, and TTS providers, and starts an agent in a LiveKit room:
from livekit.agents import Agent, AgentServer, AgentSession, JobContext, cli
from livekit.agents import function_tool, inference

@function_tool
async def lookup_weather(location: str):
    """Look up the weather for a location (stubbed with fixed data)."""
    return {"weather": "sunny", "temperature": 70}

server = AgentServer()

@server.rtc_session()
async def entrypoint(ctx: JobContext):
    # Wire VAD, STT, LLM, and TTS into one voice pipeline.
    session = AgentSession(
        vad=inference.VAD(),
        stt=inference.STT("deepgram/nova-3", language="multi"),
        llm=inference.LLM("openai/gpt-4.1-mini"),
        tts=inference.TTS("cartesia/sonic-3"),
    )
    agent = Agent(
        instructions="You are a helpful voice assistant.",
        tools=[lookup_weather],
    )
    # Join the room, then have the agent speak first.
    await session.start(agent=agent, room=ctx.room)
    await session.generate_reply(instructions="Greet the user.")

if __name__ == "__main__":
    cli.run_app(server)
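The tool in the example above returns a fixed payload. One possible way to keep such logic testable without a LiveKit room is to write it as a plain async function first and wrap it as a tool separately. This is a sketch under that assumption, using only the standard library; `fetch_weather` and its canned data are hypothetical stand-ins for a real weather lookup.

```python
import asyncio

# Hypothetical canned data standing in for a real weather API.
_FAKE_WEATHER = {"san francisco": {"weather": "foggy", "temperature": 58}}


async def fetch_weather(location: str) -> dict:
    """Return weather for a location, falling back to the example's sunny stub."""
    key = location.strip().lower()
    if not key:
        raise ValueError("location must not be empty")
    return _FAKE_WEATHER.get(key, {"weather": "sunny", "temperature": 70})


# Exercise the logic directly, with no room, session, or provider keys.
result = asyncio.run(fetch_weather("  San Francisco "))
```

Checks like this cover only the tool's own logic; how the model chooses and calls the tool still needs behavioral tests against a staging setup.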
Use Cases
- Customer support voice agents that can answer, look up account state, and escalate to a human.
- Internal helpdesk or operations agents that join a room and guide an employee through a task.
- AI phone systems that receive inbound calls or place approved outbound calls.
- Multimodal assistants that combine audio, video, screen state, and client data APIs.
- Voice-first product copilots that need browser, mobile, or embedded client frontends.
- Production agent experiments that need tests, evals, staging rooms, and provider-switching before launch.
Source Review
Verified on 2026-06-18:
- The upstream README describes LiveKit Agents as a framework for building realtime, programmable participants that run on servers, and for creating conversational, multimodal voice agents with them.
- The README lists STT, LLM, TTS, realtime API integrations, job scheduling, WebRTC clients, telephony, data APIs, semantic turn detection, native MCP support, testing, and open-source deployment paths.
- The README documents the core install command and required LiveKit environment variables for the example agent.
- The repository AGENTS.md documents Python 3.10+, uv, unit/provider/eval test categories, and runtime commands for console, dev, start, and connect modes.
- The docs entry page describes LiveKit Agents as a realtime framework for voice, video, and physical AI agents, with start, build, server lifecycle, telephony, model, and deployment guides.
- GitHub reports Apache-2.0 licensing for livekit/agents, current active development, and a Python workspace with many provider plugins.
Duplicate Check
Checked current content/tools/, content/mcp/, content/skills/,
content/agents/, guides, open and closed pull requests, and repository-wide
content for LiveKit Agents, livekit/agents, livekit agents, voice AI agents, realtime voice agent, AgentSession, AgentServer, and
livekit-agents. Existing entries cover adjacent voice, MCP, framework, and
agent tooling, but no dedicated LiveKit Agents tools entry, LiveKit Agents source
URL duplicate, or open duplicate PR was found.
Disclosure
Editorial listing. No paid placement or affiliate link is used. LiveKit Agents is Apache-2.0 open source; LiveKit also offers commercial cloud infrastructure and related services.
How it compares
LiveKit Agents side by side with 2 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.
| Field | LiveKit Agents | mcp-agent | Pydantic AI |
|---|---|---|---|
| Description | Open-source framework for building realtime voice, video, and multimodal AI agents with LiveKit rooms, STT, LLMs, TTS, job scheduling, telephony, MCP tools, testing, and production deployment paths. | Apache-2.0 Python framework for building MCP-native agents with composable workflow patterns, full MCP server lifecycle management, durable Temporal execution, agent-as-MCP-server support, and provider plugins for major LLMs. | Python agent framework from the Pydantic team for type-safe GenAI apps, tools, structured outputs, MCP, evals, and durable workflows. |
| Trust | |||
| Install risk | Review first | Review first | Review first |
| Notes | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ |
| Category | tools | tools | tools |
| Source | source-backed | source-backed | source-backed |
| Author | LiveKit | LastMile AI | Pydantic |
| Added | 2026-06-18 | 2026-06-18 | 2026-06-03 |
| Platforms | CLI | CLI | CLI |
| Source repo | — | — | — |
| Safety notes | ✓ LiveKit Agents can join realtime rooms, hear users, speak back, call tools, exchange data with clients, and connect to telephony flows; treat it as production user-facing infrastructure. Telephony integrations can place or receive phone calls through LiveKit's SIP stack. Confirm consent, caller identity, recording rules, transfer behavior, emergency limitations, and local telecom requirements before enabling calling workflows. MCP support can attach external tools to a voice agent with little code, so restrict MCP servers, credentials, and tool scopes before allowing account, database, filesystem, browser, or infrastructure actions. Semantic turn detection, interruption handling, and realtime models can improve conversation quality, but they do not guarantee that an agent understood intent or handled sensitive situations correctly. Use the built-in tests, judges, staging rooms, rate limits, human escalation paths, and rollback plans before routing real customers or employees to a deployed agent. | ✓ mcp-agent manages MCP server lifecycles and can connect agents to filesystem, fetch, browser, SaaS, database, infrastructure, or custom MCP tools depending on configuration. Workflow patterns can chain, route, parallelize, evaluate, optimize, pause, resume, and recover agent actions; use explicit approval gates for high-impact tools. Agent-as-MCP-server deployment can expose an agent to other MCP clients, so review tool descriptions, permissions, authentication, rate limits, and operator visibility before sharing it. Durable workflows can continue after process restarts when backed by Temporal; make cancellation, rollback, retry, and idempotency behavior explicit. Do not let example filesystem, fetch, or remote MCP servers become production defaults without narrowing directories, URLs, accounts, and tool scopes. | ✓ Pydantic AI type hints and output validation reduce classes of integration errors, but they do not prove an agent, model response, tool call, or generated workflow is correct or safe. Agents can call function tools, toolsets, provider-native tools, MCP servers, web search capabilities, external APIs, databases, and durable workflow backends; review tool side effects before enabling them. Tool names, docstrings, schemas, dynamic instructions, dependencies, previous messages, and MCP tool descriptions become model-facing context and should be treated as untrusted input surfaces. Human-in-the-loop approval, deferred tools, retries, and durable execution workflows need idempotency, timeout, rollback, and escalation policies before they are used for account, billing, data, or infrastructure actions. Evals, LLM judges, span-based evaluators, and Logfire dashboards are quality signals, not proof that an agent is safe, fair, compliant, or production-ready. Multi-agent, MCP, A2A, UI event stream, graph, and streaming-output workflows can create complex control flow; keep production permissions narrower than demo or notebook examples. |
| Privacy notes | ✓ Realtime audio, video, transcripts, chat messages, room metadata, participant identities, SIP call details, tool inputs, tool outputs, and generated replies may pass through LiveKit, model providers, plugin providers, MCP servers, and your own logs. Provider plugins for STT, LLM, TTS, realtime APIs, avatars, and telephony may send user data to separate third-party services with their own retention and privacy terms. Do not expose LIVEKIT_API_SECRET, provider keys, SIP credentials, room tokens, call recordings, or generated transcripts in prompts, public issues, committed configs, screenshots, or client bundles. If recording, transcribing, or storing conversations, define retention, deletion, access review, and user notification rules before launch. Use synthetic calls, demo rooms, and test identities when validating prompts, tools, provider plugins, and evals. | ✓ Prompts, instructions, tool arguments, MCP server outputs, workflow state, logs, traces, secrets YAML paths, provider responses, and durable execution history may be visible to model providers, MCP servers, observability systems, or Temporal. Keep provider keys, MCP credentials, filesystem paths, customer data, prompt logs, and traces out of committed configs, screenshots, public issues, and shared examples. If an agent uses external MCP servers, review each server's data retention, authentication, logging, and third-party data handling separately. Durable workflow state and logs can retain user requests, tool results, and intermediate reasoning context longer than a one-shot script. | ✓ Pydantic AI runs can send prompts, instructions, chat history, dependency-derived context, tool arguments, tool results, structured outputs, retry prompts, and validation errors to configured model providers. Function tools and dependency injection can expose customer records, database values, API responses, internal identifiers, secrets, or proprietary business rules if those objects are made available to an agent. Pydantic Logfire, OpenTelemetry traces, eval reports, spans, metrics, cost tracking, and behavior monitoring can retain prompts, outputs, tool calls, metadata, errors, and performance data outside the application runtime. Pydantic Evals datasets, case metadata, expected outputs, human feedback, LLM-judge inputs, and report artifacts should follow normal retention, access-control, and deletion policies. MCP clients, MCP servers, native tools, and external toolsets can return third-party or workspace data into the conversation transcript, logs, traces, and evaluation outputs. |
| Prerequisites | | | |
| Install | | | — |
| Config | — | — | — |
| Citations | |||
| Claim | Unclaimed | Unclaimed | Unclaimed |