Debugging Density
Running many sessions in one worker is efficient until one session misbehaves and you cannot tell which. This runbook is the flow for “my pool feels slow” or “one session is hot”, using openrtc top and the slow-session detector.
It assumes coroutine mode with introspection on (the default).
1. Look at the pool
Open the live inspector, openrtc top, next to your worker and look for one of these patterns:
- One slow row. A session is blocking the shared event loop. Jump to section 3.
- One row with a high, sustained cpu%. A session is CPU-heavy (a tight loop, a large sync transform). See section 4.
- peak climbing across the board, mem(MB) trending up. Worker-level memory pressure. See section 5.
- Nothing stands out but latency is bad. The bottleneck is likely in the voice pipeline (STT/LLM/TTS), which OpenRTC does not measure. See section 6.
2. Confirm it is density, not a single wedged call
Press s to sort by duration_s. A single very long-lived session that should
have ended can look like load. If a call is stuck, that is a session bug (an
await that never resolves), not a density problem. Fix it in the agent.
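One way to keep a wedged await from holding a session open is to bound it with a timeout. The sketch below uses only the standard library; never_resolves and bounded are hypothetical names for illustration, not part of OpenRTC, and returning None on timeout is a simplification.

```python
import asyncio


async def never_resolves() -> None:
    # Stand-in for a wedged call: an await that nothing ever completes.
    await asyncio.Event().wait()


async def bounded(coro, timeout_s: float):
    """Await coro, but give up after timeout_s instead of hanging forever."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner coroutine at this point.
        return None


async def main() -> None:
    result = await bounded(never_resolves(), timeout_s=0.1)
    print("timed out" if result is None else result)


if __name__ == "__main__":
    asyncio.run(main())
```

In a real agent you would likely log the timeout and end the session rather than silently return None.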
3. A session is blocking the loop
A slow status means the detector measured the event loop stalling while that
session’s task was on-CPU. Check the worker logs for the [slow-session]
attribution line. The usual cause is a sync call: time.sleep, a blocking file or
DB call, or a heavy CPU section run inline. The fix is always the same shape
(get it off the loop):
- Use the async client (aiohttp/httpx.AsyncClient) instead of a sync one.
- Wrap unavoidable blocking work in await asyncio.to_thread(...).
- Break large CPU loops into chunks that await asyncio.sleep(0) between them.
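The last two fixes can be sketched with the standard library alone. blocking_lookup, lookup, and checksum are hypothetical names for illustration; the time.sleep call stands in for whatever sync call your agent actually makes.

```python
import asyncio
import time


def blocking_lookup(key: str) -> str:
    # Hypothetical stand-in for a sync client call or blocking file/DB read.
    time.sleep(0.05)
    return key.upper()


async def lookup(key: str) -> str:
    # Runs the blocking call in a worker thread; the event loop stays free.
    return await asyncio.to_thread(blocking_lookup, key)


async def checksum(data: bytes, chunk_size: int = 64 * 1024) -> int:
    # CPU loop split into chunks, yielding to the loop between chunks.
    total = 0
    for start in range(0, len(data), chunk_size):
        total = (total + sum(data[start:start + chunk_size])) % 65536
        await asyncio.sleep(0)  # let other sessions run before the next chunk
    return total
```

Pick a chunk size so that each chunk stays well under the slow-session threshold; the 64 KiB default here is only a placeholder.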
Lower slow_session_threshold_ms (default 50 ms) to catch smaller stalls while
hunting, e.g. AgentPool(slow_session_threshold_ms=20).
The detector reports the session and duration, not the exact source line (stack
sampling is deferred). Use the session id to find the call in that agent’s code;
a block over ~50 ms in an async agent is almost always one sync call.
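To narrow down which coroutine holds the sync call, you can reproduce the agent step locally under asyncio's own debug mode. This is a standard-library feature, not the OpenRTC detector. It logs any single step that holds the loop longer than loop.slow_callback_duration, along with the task and coroutine involved. In this sketch, suspect is a hypothetical agent step.

```python
import asyncio
import logging
import time


async def suspect() -> None:
    # Hypothetical agent step that hides one sync call.
    time.sleep(0.12)  # blocks the loop; debug mode should flag this step


async def main() -> None:
    loop = asyncio.get_running_loop()
    # Report any single step that holds the loop longer than 50 ms.
    loop.slow_callback_duration = 0.05
    await asyncio.create_task(suspect())


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # debug=True makes asyncio log "Executing <Task ...> took N seconds".
    asyncio.run(main(), debug=True)
```

Debug mode adds overhead, so it suits a local reproduction better than a loaded production worker.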
4. A session is CPU-hot
A high, sustained cpu% without a slow status is a session doing real work
that is not (yet) blocking the loop, but it still competes for the one core.
Remember cpu% is a sampled share, not exact seconds: use it to rank
sessions, not to bill them. If one agent is consistently hot, move its heavy work
to asyncio.to_thread, or run that agent under isolation="process" so it gets
its own core and cannot starve the others.
5. The worker is trending toward its memory limit
mem(MB) is an equal share of process RSS, so per-session numbers sum back to
the real RSS (watch the trend, not any single row). If the total is climbing
toward your memory_limit_mb, the worker will drain and restart when it
crosses the limit (coroutine caps are worker-level, not per-session). To find a
leak, restart with isolation="process" temporarily so the OS accounts memory
per session, or reduce max_concurrent_sessions to lower peak pressure.
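The equal-share accounting can be written out as simple arithmetic. The function names below are hypothetical and the figures are invented for illustration; only the equal-share rule and the worker-level limit come from the text above.

```python
def per_session_mem_mb(process_rss_mb: float, session_count: int) -> float:
    # Equal share: every session is charged the same slice of process RSS.
    if session_count <= 0:
        return 0.0
    return process_rss_mb / session_count


def headroom_mb(process_rss_mb: float, memory_limit_mb: float) -> float:
    # Remaining worker-level margin before the limit is crossed.
    return memory_limit_mb - process_rss_mb


# Illustrative numbers: 1200 MB RSS across 40 sessions, 1536 MB limit.
share = per_session_mem_mb(1200.0, 40)
print(share, share * 40, headroom_mb(1200.0, 1536.0))
```

Because every row shows the same share, a leaking session looks identical to a healthy one in coroutine mode, which is why the per-session view only helps under isolation="process".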
6. Nothing in OpenRTC explains it
If sessions look healthy in openrtc top but calls are still slow, the
bottleneck is in the voice pipeline (STT, LLM, or TTS latency), which
OpenRTC does not see (it sees coroutines, not providers). That is
voicegateway’s lane: cost, provider latency, and quality metrics live there,
keyed by the agent_name and metadata["tenant"] OpenRTC emits. Look there for
per-provider latency, not here.
Quick reference
| Symptom | First move | Likely fix |
|---|---|---|
| One slow row | Read the [slow-session] log line | Move the sync call off the loop |
| High cpu%, not slow | openrtc top --sort cpu_pct | to_thread or isolation="process" |
| mem(MB) climbing | Watch the RSS trend | Lower concurrency / find the leak in process mode |
| Healthy table, slow calls | (not an OpenRTC issue) | Pipeline latency: voicegateway |
