Skip to main content

Migration and drain

“Zero-downtime upgrade” can mean two very different things:
  • Migration: pick up a live call from the old worker, move its state to the new worker, and resume it there. The caller never notices.
  • Drain: stop sending new calls to the old worker, let its live calls finish where they are, and route new calls to the new worker.
OpenRTC does drain. This page explains why migration is not viable for a live voice call, so the choice is understood rather than assumed.

What a live session is made of

A live AgentSession carries three kinds of state. The full inventory is in the worker state inventory; the summary is:
KindExamplesCan it move?
SerializableConversation history, agent identity, tenant, job metadataYes, it is plain data.
DerivableAgent class + instructions, provider config, prewarmed VADYes, rebuilt from config on the new worker.
LiveThe WebRTC transport, in-flight STT/LLM/TTS streams, turn state, open provider socketsNo. Bound to this process and this moment.
The serializable and derivable state is enough to reconstruct a fresh session. It is not enough to resume a live one.

Why the live state cannot move

The blocker is the live row. A voice call holds an open WebRTC transport to the caller and, at any instant, an in-flight STT, LLM, or TTS stream. You cannot serialize a token half-generated by the LLM or an audio buffer mid-synthesis and resume it in another process without dropping that turn. And the caller’s SDK is not built to be silently re-pointed at a new server mid-call: that would require a renegotiation the SDK does not expose. Moving a live call means a gap the caller hears, which is not zero-downtime for the person on the phone. So OpenRTC does not migrate. It drains: the call finishes on the worker it started on, uninterrupted, and only new calls land on the new version. See zero-downtime deployments for the mechanics.

What this means in practice

  • A live call is never paused, moved, or resumed elsewhere. It runs to its natural end on its original worker.
  • A new version affects only calls that start after it is deployed. To have new code affect an in-progress call, either wait for that call to end, or use hot reload, which swaps agent code within a running worker (a different mechanism, not a cross-worker move).
  • Draining is reused from the graceful-shutdown path, not a new subsystem: a worker draining for a deploy and a worker draining for SIGTERM do the same thing.

Could migration ever return?

Possibly, but for a different feature than live upgrades. A “pause this call and resume it later” feature would only ever move the serializable and derivable state (never the live streams), so it would necessarily involve a real gap the caller consents to, not a silent handoff. If that is ever built, the recorded format choice is msgpack (compact, Python-native types, versioned header). The migration serialize/deserialize APIs and the migration.* audit events are reserved for that future and are deliberately not emitted today. Until then: drain, do not migrate.