If you run livekit-agents, you know the shape: an AgentServer, one @server.rtc_session per agent, an AgentSession you wire up inside it, and cli.run_app. It works, but every agent is its own worker process, prewarm reloads in each one, and shipping a code change means restarting the worker and dropping in-flight calls. OpenRTC is a thin layer on top of the same SDK. Your Agent subclasses do not change. You register them on one AgentPool and host them all in a single worker, and you get density, hot reload, per-tenant isolation, and zero-downtime deploys on top. Think of it as livekit-agents with the operational parts filled in.

The one change that matters

from livekit import agents
from livekit.agents import AgentServer, AgentSession, Agent, inference, TurnHandlingOptions


class SupportAgent(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="Help callers with support questions.")


server = AgentServer()


@server.rtc_session(agent_name="support")
async def support(ctx: agents.JobContext):
    session = AgentSession(
        stt=inference.STT(model="deepgram/nova-3"),
        llm=inference.LLM(model="openai/chat-latest"),
        tts=inference.TTS(model="cartesia/sonic-3"),
        turn_handling=TurnHandlingOptions(turn_detection=inference.TurnDetector()),
    )
    await session.start(room=ctx.room, agent=SupportAgent())
    await session.generate_reply(instructions="Greet the caller.")


# A second agent means a second rtc_session (and, to scale, a second worker).

if __name__ == "__main__":
    agents.cli.run_app(server)
With OpenRTC, the Agent subclass stays identical. You write the session wiring once as pool defaults, adding an agent is a single pool.add(...) call, every agent shares one prewarm, and routing is automatic.
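To make "pool defaults plus per-agent overrides" concrete, here is a minimal sketch of the merging idea. ToyPool is a hypothetical stand-in written for this illustration, not the OpenRTC AgentPool API; the keyword names and the override value are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyPool:
    """Hypothetical stand-in for a pool; NOT the real OpenRTC AgentPool."""

    defaults: dict[str, Any]
    agents: dict[str, dict[str, Any]] = field(default_factory=dict)

    def add(self, name: str, agent_cls: type, **overrides: Any) -> None:
        # Later keys win, so per-agent overrides replace pool-wide defaults.
        self.agents[name] = {"agent_cls": agent_cls, **self.defaults, **overrides}


class SupportAgent:  # placeholder for your Agent subclass
    pass


class BillingAgent:  # placeholder for a second Agent subclass
    pass


pool = ToyPool(
    defaults={
        "stt": "deepgram/nova-3",
        "llm": "openai/chat-latest",
        "tts": "cartesia/sonic-3",
    }
)
pool.add("support", SupportAgent)  # inherits every default
pool.add("billing", BillingAgent, tts="example/other-voice")  # assumed override value
```

The point of the shape: the wiring lives in one place, and a second agent costs one line instead of a second session function.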

What you keep

Everything about writing an agent. OpenRTC never introduces a base class and never sits between you and the SDK:
  • Your Agent subclasses, instructions, and state.
  • @function_tool, RunContext, on_enter / on_exit, and the *_node hooks.
  • inference.STT/LLM/TTS and plugin provider objects, passed through unchanged.
  • LIVEKIT_URL / LIVEKIT_API_KEY / LIVEKIT_API_SECRET and the same dispatch model.
  • The CLI verbs you know: openrtc dev / start / console / connect mirror livekit’s.

What you delete

The per-worker boilerplate that OpenRTC now owns for you:
  • The per-agent @server.rtc_session function.
  • The AgentSession(...) you rebuilt in every session function.
  • The manual session.start(...) and greeting call.
  • agents.cli.run_app(server), replaced by pool.run().
  • The one-process-per-agent deployment. All your agents live in one worker.

What you gain

The things livekit-agents alone leaves to you:

Density

50+ concurrent sessions per worker as asyncio tasks, sharing one prewarm, instead of a subprocess (~3 GB) per session.
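As a rough picture of the coroutine model, the sketch below runs 50 sessions as asyncio tasks that all share one preloaded object. It uses only the standard library; run_session and the shared_models dict are hypothetical placeholders, not OpenRTC internals.

```python
import asyncio


async def run_session(session_id: int, shared_models: dict) -> tuple[int, int]:
    # Each session is a coroutine; awaiting I/O yields control to the others.
    await asyncio.sleep(0)  # placeholder for audio and network I/O
    return session_id, id(shared_models)


async def main(n_sessions: int = 50) -> list[tuple[int, int]]:
    # Stand-in for prewarm: built once, then handed to every session.
    shared_models = {"vad": object(), "turn_detector": object()}
    tasks = [
        asyncio.create_task(run_session(i, shared_models))
        for i in range(n_sessions)
    ]
    return await asyncio.gather(*tasks)


results = asyncio.run(main())
```

Every task reports the same object identity for shared_models, which is the memory saving in miniature: one copy of the models, many sessions.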

Hot reload

Edit an agent and live calls pick up the new code on their next turn, with no dropped audio. A bad save rolls back.
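One way to picture "next turn" reload with rollback is a handler that is looked up on every turn and only replaced when the new source loads cleanly. This is a toy sketch under that assumption; ReloadableHandler and on_turn are hypothetical names, and OpenRTC's actual mechanism is not shown on this page.

```python
from typing import Callable

Handler = Callable[[str], str]


class ReloadableHandler:
    def __init__(self, source: str) -> None:
        self._handler = self._load(source)

    @staticmethod
    def _load(source: str) -> Handler:
        namespace: dict = {}
        # compile/exec raise on a bad save, before anything is swapped in.
        exec(compile(source, "<agent>", "exec"), namespace)
        return namespace["on_turn"]

    def reload(self, source: str) -> bool:
        try:
            new_handler = self._load(source)
        except Exception:
            return False  # roll back: keep serving the last good version
        self._handler = new_handler
        return True

    def on_turn(self, text: str) -> str:
        # Resolved per turn, so a live session sees the newest good version.
        return self._handler(text)


V1 = "def on_turn(text):\n    return 'v1: ' + text\n"
V2 = "def on_turn(text):\n    return 'v2: ' + text\n"
BROKEN = "def on_turn(text) return text\n"  # deliberate syntax error

session = ReloadableHandler(V1)
first = session.on_turn("hi")
reloaded = session.reload(V2)
second = session.on_turn("hi")
bad_save_accepted = session.reload(BROKEN)
third = session.on_turn("hi")
```

The session object never restarts; only the function it calls on the next turn changes, and a broken save leaves the previous version in place.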

Per-tenant isolation

Per-tenant provider keys, session caps, and a blast-radius circuit breaker, all in one pool.
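A session cap and a circuit breaker can both be modeled as a small per-tenant admission check. The sketch below is illustrative only: TenantGuard, its fields, and the thresholds are assumptions, not OpenRTC configuration.

```python
from dataclasses import dataclass


@dataclass
class TenantGuard:
    max_sessions: int        # session cap for this tenant
    failure_threshold: int   # failures before the breaker trips
    active: int = 0
    failures: int = 0

    @property
    def tripped(self) -> bool:
        return self.failures >= self.failure_threshold

    def try_admit(self) -> bool:
        # Refuse new sessions when the breaker is open or the cap is reached.
        if self.tripped or self.active >= self.max_sessions:
            return False
        self.active += 1
        return True

    def release(self, failed: bool = False) -> None:
        self.active -= 1
        if failed:
            self.failures += 1


guards = {
    "acme": TenantGuard(max_sessions=2, failure_threshold=1),
    "globex": TenantGuard(max_sessions=2, failure_threshold=1),
}

admitted = [guards["acme"].try_admit() for _ in range(3)]  # third hits the cap
guards["acme"].release(failed=True)                        # trips acme's breaker
acme_after_failure = guards["acme"].try_admit()
globex_unaffected = guards["globex"].try_admit()
```

Because each tenant has its own guard, one tenant's failures stop that tenant's new sessions without touching anyone else in the pool.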

Zero-downtime deploys

Blue-green drain: new calls hit the new version, in-flight calls finish on the old one. No dropped calls.
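The drain rule is simple enough to sketch: new calls go to the current version, and an old version is retired only once its last call ends. BlueGreenRouter below is a hypothetical model of that rule, not OpenRTC's deploy code.

```python
class BlueGreenRouter:
    def __init__(self, version: str) -> None:
        self.current = version
        self.in_flight: dict[str, set[str]] = {version: set()}

    def _prune(self) -> None:
        # Retire any non-current version that has no calls left.
        for version in list(self.in_flight):
            if version != self.current and not self.in_flight[version]:
                del self.in_flight[version]

    def deploy(self, version: str) -> None:
        self.current = version
        self.in_flight.setdefault(version, set())
        self._prune()

    def start_call(self, call_id: str) -> str:
        # New calls always land on the current version.
        self.in_flight[self.current].add(call_id)
        return self.current

    def end_call(self, call_id: str) -> None:
        for calls in self.in_flight.values():
            calls.discard(call_id)
        self._prune()


router = BlueGreenRouter("v1")
call_a_version = router.start_call("call-a")
router.deploy("v2")
call_b_version = router.start_call("call-b")
versions_during_drain = sorted(router.in_flight)
router.end_call("call-a")
versions_after_drain = sorted(router.in_flight)
```

Here call-a finishes on v1 while call-b starts on v2, and v1 disappears only after call-a ends.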

Concept map

livekit-agents | OpenRTC
--- | ---
AgentServer() + one @server.rtc_session(agent_name=...) per agent | one AgentPool(), then pool.add(name, Agent) per agent
AgentSession(stt=, llm=, tts=, turn_handling=) rebuilt in each session | built for you from pool defaults plus per-agent overrides
prewarm reloaded in every worker process | shared prewarm (VAD, turn detector) loaded once per worker
one worker process per agent to scale | many agents and 50+ sessions in one worker (coroutine mode)
agents.cli.run_app(server) | pool.run(), which wraps the same CLI
dispatch keyed on agent_name | routing chain: job metadata, room metadata, room-name prefix
restart the worker to ship a change | hot reload swaps live sessions on their next turn
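The routing chain in the table (job metadata, then room metadata, then room-name prefix) can be read as a first-match lookup. The sketch below shows that order only; the "agent" metadata key, the "-" separator, and resolve_agent itself are assumptions made for illustration, not OpenRTC's documented behavior.

```python
from typing import Optional


def resolve_agent(
    job_metadata: dict,
    room_metadata: dict,
    room_name: str,
    registered: set[str],
) -> Optional[str]:
    # Steps 1 and 2: explicit metadata, job first, then room.
    for candidate in (job_metadata.get("agent"), room_metadata.get("agent")):
        if candidate in registered:
            return candidate
    # Step 3: room-name prefix, longest registered name first.
    for name in sorted(registered, key=len, reverse=True):
        if room_name.startswith(name + "-"):
            return name
    return None  # no route matched


registered = {"support", "billing"}

by_job = resolve_agent({"agent": "billing"}, {"agent": "support"}, "support-42", registered)
by_room = resolve_agent({}, {"agent": "support"}, "billing-7", registered)
by_prefix = resolve_agent({}, {}, "billing-7", registered)
no_match = resolve_agent({}, {}, "lobby", registered)
```

Job metadata wins over room metadata, which wins over the room name, so the most specific signal decides which agent handles the call.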

Next steps

Quickstart

Install OpenRTC and run your first pool.

How it works

The universal entrypoint, coroutine density, and shared prewarm under the hood.
Need hard crash isolation instead of density? Pass isolation="process" for the one-subprocess-per-session model, the same as livekit-agents, with per-session memory caps. Everything else on this page still applies.