Architecture
OpenRTC keeps the public API intentionally narrow.
Core building blocks
AgentConfig
AgentConfig stores the registration-time settings for a LiveKit agent:
- a unique name
- the agent_cls subclass
- optional stt, llm, and tts values (ProviderValue | None: provider ID strings or plugin instances)
- an optional greeting generated after ctx.connect()
- optional session_kwargs forwarded to AgentSession
- an optional source_path when the module file is known (e.g. after discovery), used for tooling and footprint estimates, not for routing
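A minimal sketch of that registration record, assuming a plain dataclass shape (the field layout mirrors the list above; this is illustrative, not the real class body):

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any

# Illustrative stand-in: a provider ID string or a plugin instance.
ProviderValue = Any


@dataclass
class AgentConfig:
    """Sketch of the registration record described above, not the actual class."""

    name: str                          # unique agent name
    agent_cls: type                    # agent subclass instantiated per session
    stt: ProviderValue | None = None   # provider ID string or plugin instance
    llm: ProviderValue | None = None
    tts: ProviderValue | None = None
    greeting: str | None = None        # spoken after ctx.connect()
    session_kwargs: dict[str, Any] = field(default_factory=dict)  # forwarded to AgentSession
    source_path: str | None = None     # module file for tooling/footprint estimates; not routing
```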
AgentDiscoveryConfig
AgentDiscoveryConfig stores optional discovery metadata attached by @agent_config(...):
- an optional explicit name
- optional stt, llm, and tts overrides
- an optional greeting override
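For example, discovery metadata can be attached at class-definition time. The keyword names below mirror the fields above; the exact decorator signature, the openrtc import path, and the provider IDs are assumptions for illustration:

```python
from livekit.agents import Agent

from openrtc import agent_config  # import path assumed for illustration


@agent_config(
    name="support",                            # optional explicit name
    stt="deepgram",                            # optional provider overrides (placeholder IDs)
    llm="openai",
    tts="cartesia",
    greeting="Hi! How can I help you today?",  # optional greeting override
)
class SupportAgent(Agent):
    def __init__(self) -> None:
        super().__init__(instructions="You are a friendly support agent.")
```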
AgentPool
AgentPool owns a single LiveKit AgentServer, a registry of named agents, and one universal session handler.
At startup it configures shared prewarm behavior so worker-level runtime assets are loaded once and reused across sessions.
The pool picks the underlying server class from the isolation constructor argument:
isolation="coroutine"(the v0.1 default) constructs an internal_CoroutineAgentServersubclass that swapslivekit.agents.ipc.proc_pool.ProcPoolfor ourCoroutinePoolfor the duration ofrun().isolation="process"constructs the vanillaAgentServerfromlivekit-agents(one OS subprocess per session, the v0.0.x behavior).
The same agent classes, providers, and routing rules apply in both modes.
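Choosing the mode is a constructor-level decision. A hedged usage sketch, assuming the constructor arguments named in this document (isolation, max_concurrent_sessions, drain_timeout):

```python
from openrtc import AgentPool  # import path assumed for illustration

# Coroutine mode (the v0.1 default): concurrent sessions share one worker process.
pool = AgentPool(
    isolation="coroutine",
    max_concurrent_sessions=50,
    drain_timeout=30,          # seconds to wait for in-flight sessions on shutdown
)

# Process mode: one OS subprocess per session, the v0.0.x behavior.
# pool = AgentPool(isolation="process")
```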
Session lifecycle
When a room is assigned to the worker:
- OpenRTC resolves the target agent from job metadata, room metadata, room-name prefix matching, or the first registered agent.
- It creates an AgentSession using the selected agent configuration.
- Prewarmed VAD and turn detection models are injected from proc.userdata.
- The resolved agent instance is started for the room.
- OpenRTC connects the room context.
- If a greeting is configured, it generates the greeting after connect.
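A sketch of that resolution order with hypothetical helper names (the metadata key and function signatures are assumptions; the real routing lives inside OpenRTC's session handler):

```python
import json


def _name_from_metadata(metadata: str | None) -> str | None:
    """Pull an agent name out of a JSON metadata blob (the key name is an assumption)."""
    if not metadata:
        return None
    try:
        return json.loads(metadata).get("agent")
    except (ValueError, AttributeError):
        return None


def resolve_agent(job_metadata, room_name, room_metadata, registry):
    """Resolve the target agent in the documented priority order (illustrative only)."""
    # 1. agent name carried in the job metadata
    if name := _name_from_metadata(job_metadata):
        return registry[name]
    # 2. agent name carried in the room metadata
    if name := _name_from_metadata(room_metadata):
        return registry[name]
    # 3. room-name prefix match, e.g. "support-1234" routes to the agent named "support"
    for name, config in registry.items():
        if room_name.startswith(f"{name}-"):
            return config
    # 4. fall back to the first registered agent
    return next(iter(registry.values()))
```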
Coroutine-mode lifecycle
When isolation="coroutine" (the v0.1 default), the per-job lifecycle runs inside the worker process instead of in a forked subprocess. The high-level flow is:
AgentServer.run()
│
first time, builds CoroutinePool (one per worker)
│
CoroutinePool.start()
│
┌─── runs the user's setup_fnc ONCE ───┐
│ into a singleton JobProcess │
│ (loads VAD, turn detector, …) │
└──────────────────────────────────────┘
│
worker is registered
and accepts dispatch
│
▼
per session (N concurrent):
│
CoroutinePool.launch_job(info)
│
builds a CoroutineJobExecutor wired with
the same setup_fnc + entrypoint_fnc the pool was
constructed with, plus a context_factory closing
over the singleton JobProcess
│
executor.launch_job(info)
│
schedules `_run_entrypoint(ctx)` as
an asyncio.Task on the running loop
│
▼
user entrypoint runs (AgentSession etc.)
│
wrapper catches any exception, sets status
to FAILED, calls session_end_fnc, removes the
executor from pool.processes; supervisor counts
consecutive failures
│
▼
on shutdown: pool.drain() awaits every
in-flight executor's join(); pool.aclose()
cancels anything still pending
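A condensed sketch of the per-session task wrapper the diagram refers to. Names, statuses, and the record_failure hook are simplified stand-ins, and whether session_end_fnc runs in a finally block or only on failure is an assumption:

```python
async def _run_entrypoint(executor, ctx, entrypoint_fnc, session_end_fnc, pool):
    """Condensed sketch of the per-session wrapper; not the actual executor code."""
    try:
        await entrypoint_fnc(ctx)             # the user entrypoint (AgentSession etc.)
        executor.status = "SUCCESS"
    except Exception:
        executor.status = "FAILED"            # any exception marks the job as failed
        pool.record_failure()                 # hypothetical hook: supervisor counts consecutive failures
    finally:
        if session_end_fnc is not None:
            await session_end_fnc(ctx)        # user shutdown hook
        pool.processes.pop(executor.job_id, None)  # remove the executor from pool.processes

# executor.launch_job(info) schedules the wrapper on the running loop, roughly:
# task = asyncio.create_task(_run_entrypoint(...))
```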
Key invariants in coroutine mode:
- Setup runs once per worker. The user's prewarm callback (Silero, turn detector, etc.) is invoked exactly once into the singleton JobProcess, then every executor's JobContext references that same process and userdata dict. This is the density story: prewarm cost is amortized across N concurrent sessions instead of paid once per session as in process mode.
- One executor, one session. Every launch_job allocates a fresh CoroutineJobExecutor; concurrent sessions never share an executor. Errors stay isolated to their executor's task wrapper.
- No subprocess. Per-session work runs as asyncio.Tasks on the worker loop. There is no IPC, no process boundary, no per-session process startup cost.
- Cooperative backpressure. CoroutinePool.current_load() returns len(active) / max_concurrent_sessions. The _CoroutineAgentServer registers a load_fnc closure that reads this value, so LiveKit dispatch sees >= 1.0 at saturation and routes new jobs elsewhere.
- Cooperative shutdown. drain() flips a flag (rejecting new launches) and awaits every executor's join(); aclose() then cancels anything still pending and clears state. After both, the worker's asyncio loop has no residual tasks belonging to the pool. Upstream AgentServer installs the SIGTERM/SIGINT handler and invokes aclose() when one fires; the wait window is bounded by AgentPool(drain_timeout=N) (default 30 seconds). Sessions that exceed the budget are cancelled with a WARNING log and the per-executor kill() escalation runs so the worker can finish shutting down.
- Supervisor. After consecutive_failure_limit (default 5) consecutive non-SUCCESS terminations, the pool fires its registered callback. The default callback in _CoroutineAgentServer schedules aclose() so the worker exits and the deployment platform restarts it; the blast radius of a systemic bug stays bounded.
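The backpressure and shutdown invariants are small enough to show as a simplified model. The class and method bodies below are a sketch of the behavior just described, not the real CoroutinePool:

```python
import asyncio


class CoroutinePoolSketch:
    """Simplified model of the load and shutdown invariants above (not the real class)."""

    def __init__(self, max_concurrent_sessions: int) -> None:
        self.max_concurrent_sessions = max_concurrent_sessions
        self.processes = {}                      # job_id -> active executor
        self._draining = False

    def current_load(self) -> float:
        # len(active) / max_concurrent_sessions; dispatch treats >= 1.0 as saturated
        return len(self.processes) / self.max_concurrent_sessions

    async def drain(self) -> None:
        self._draining = True                    # new launches are rejected from here on
        await asyncio.gather(*(ex.join() for ex in self.processes.values()))

    async def aclose(self) -> None:
        for ex in list(self.processes.values()):
            ex.kill()                            # per-executor escalation for stragglers
        self.processes.clear()
```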
In process mode, the per-session lifecycle is unchanged from v0.0.x: each session runs in its own subprocess via the default ProcPool from livekit-agents, with its own JobProcess, its own setup_fnc invocation, and its own rtc.Room.
Configuration precedence
Worker-runtime settings (isolation, max_concurrent_sessions) can be supplied at three layers, in order of priority:
- CLI flag: --isolation, --max-concurrent-sessions. Passed on the openrtc start/dev/console command line, this wins over everything else.
- Environment variable: OPENRTC_ISOLATION, OPENRTC_MAX_CONCURRENT_SESSIONS. Read at startup when the matching flag is not provided.
- Library default: isolation="coroutine", max_concurrent_sessions=50. Baked into AgentPool.__init__ when neither the flag nor the env var is set.
The same precedence applies to the LiveKit connection settings the CLI exposes (--url / LIVEKIT_URL, --api-key / LIVEKIT_API_KEY, --api-secret / LIVEKIT_API_SECRET, --log-level / LIVEKIT_LOG_LEVEL); those follow the upstream livekit-agents naming convention.
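Expressed as a tiny resolution helper (illustrative, not OpenRTC's actual code; the flag and env var names are the ones listed above):

```python
import os


def resolve_isolation(cli_value: str | None) -> str:
    """--isolation > OPENRTC_ISOLATION > library default."""
    if cli_value is not None:                    # CLI flag wins over everything else
        return cli_value
    return os.getenv("OPENRTC_ISOLATION", "coroutine")


def resolve_max_concurrent_sessions(cli_value: int | None) -> int:
    """--max-concurrent-sessions > OPENRTC_MAX_CONCURRENT_SESSIONS > default of 50."""
    if cli_value is not None:
        return cli_value
    return int(os.getenv("OPENRTC_MAX_CONCURRENT_SESSIONS", "50"))
```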
Shared runtime dependencies
During prewarm, OpenRTC loads:
- livekit.plugins.silero
- livekit.plugins.turn_detector.multilingual.MultilingualModel
These plugins are expected to be available from the package installation. If they are missing at runtime, OpenRTC raises a RuntimeError with install instructions.
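A prewarm callback along these lines is what runs once per worker. The exact hook OpenRTC registers is internal; this follows the standard livekit-agents prewarm pattern, and the userdata keys are assumptions:

```python
from livekit.agents import JobProcess


def prewarm(proc: JobProcess) -> None:
    """Load worker-scoped models once; every session then reuses them via proc.userdata."""
    try:
        from livekit.plugins import silero
        from livekit.plugins.turn_detector.multilingual import MultilingualModel
    except ImportError as exc:
        # OpenRTC's own check raises a RuntimeError with install instructions at this point.
        raise RuntimeError(
            "Prewarm dependencies missing; install livekit-plugins-silero and "
            "livekit-plugins-turn-detector."
        ) from exc

    proc.userdata["vad"] = silero.VAD.load()
    proc.userdata["turn_detector"] = MultilingualModel()
```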
Why this shape?
This design keeps the package easy to reason about:
- routing logic is explicit
- worker-scoped dependencies are loaded once
- discovery metadata is opt-in and typed
- agent registration stays stable and readable
- the public API remains small enough for contributors to extend safely
