Command queue
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
Why
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.
How it works
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; `main` defaults to 4, `subagent` to 8).
- Each run is first enqueued on a per-session lane keyed by session, which guarantees only one active run per session.
- Each session run is then queued into a global lane (`main` by default), so overall parallelism is capped by `agents.defaults.maxConcurrent`.
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
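The lane mechanics above can be sketched as a small promise-based queue. This is an illustrative shape, not OpenClaw's actual implementation: each lane drains FIFO under its own concurrency cap, and per-session serialization falls out of giving each session its own lane with a cap of 1.

```typescript
// Minimal lane-aware FIFO sketch (assumed shape, not the real OpenClaw code).
type Task<T> = () => Promise<T>;

class LaneQueue {
  private pending = new Map<string, Array<() => void>>();
  private active = new Map<string, number>();

  constructor(private caps: Record<string, number> = {}, private defaultCap = 1) {}

  private cap(lane: string): number {
    return this.caps[lane] ?? this.defaultCap;
  }

  run<T>(lane: string, task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = () => {
        this.active.set(lane, (this.active.get(lane) ?? 0) + 1);
        task()
          .then(resolve, reject)
          .finally(() => {
            this.active.set(lane, (this.active.get(lane) ?? 1) - 1);
            const next = this.pending.get(lane)?.shift();
            if (next) next(); // FIFO: wake the oldest waiter on this lane
          });
      };
      if ((this.active.get(lane) ?? 0) < this.cap(lane)) {
        start(); // a slot is free: run immediately
      } else {
        const q = this.pending.get(lane) ?? [];
        q.push(start); // lane is saturated: wait in FIFO order
        this.pending.set(lane, q);
      }
    });
  }
}
```

A session lane would use the default cap of 1, while a global lane like `main` could be constructed with a higher cap to allow cross-session parallelism.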
Defaults
When unset, all inbound channel surfaces use `steer`. It is the default because it keeps the active model turn responsive without starting a second session run: it drains all steering messages that arrived before the next model boundary. If the current run cannot accept steering, OpenClaw falls back to a followup queue entry.
Queue modes
Inbound messages can steer the current run, wait for a followup turn, or do both:
- `steer`: queue steering messages into the active runtime. Pi delivers all pending steering messages after the current assistant turn finishes executing its tool calls, before the next LLM call; Codex app-server receives one batched request. If the run is not actively streaming or steering is unavailable, OpenClaw falls back to a followup queue entry.
- Legacy steering: the old one-at-a-time behavior. Pi delivers one queued steering message at each model boundary; Codex app-server receives separate requests. Prefer `steer` unless you need the previous serialized behavior.
- `followup`: enqueue each message for a later agent turn after the current run ends.
- `collect`: coalesce queued messages into a single followup turn after the quiet window. If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog`: steer now and preserve the same message for a followup turn.
- Legacy interrupt mode: abort the active run for that session, then run the newest message.
Steer-backlog means you can get a followup response after the steered run, so streaming surfaces can look like duplicates. Prefer a non-backlog mode if you want one response per inbound message.
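The mode behaviors above can be summarized as a small dispatch function. This is a hypothetical sketch (the names `route`, `ActiveRun`, and the return labels are illustrative, not OpenClaw's real API); it shows steering when a run accepts it, falling back to the followup queue otherwise, and steer-backlog doing both.

```typescript
// Hypothetical routing of an inbound message by queue mode.
type QueueMode = "steer" | "followup" | "collect" | "steer-backlog";

interface ActiveRun {
  acceptsSteering: boolean;
  steer(text: string): void;
}

function route(
  mode: QueueMode,
  msg: string,
  run: ActiveRun | null,
  followups: string[],
): "steered" | "queued" | "both" {
  if ((mode === "steer" || mode === "steer-backlog") && run?.acceptsSteering) {
    run.steer(msg); // deliver into the active runtime
    if (mode === "steer-backlog") {
      followups.push(msg); // also preserve for a followup turn
      return "both";
    }
    return "steered";
  }
  // followup, collect, or steer falling back when steering is unavailable
  followups.push(msg);
  return "queued";
}
```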
For runtime-specific timing and dependency behavior, see Steering queue.
Configure globally or per channel via `messages.queue`:
```jsonc
{
  messages: {
    queue: {
      mode: "steer",
      debounceMs: 500,
      cap: 20,
      drop: "summarize",
      byChannel: { discord: "collect" },
    },
  },
}
```
Queue options
Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` or legacy steering when steering falls back to followup):
- `debounceMs`: quiet window before draining queued followups. Bare numbers are milliseconds; duration suffixes (for example `0.5s`) are also accepted.
- `cap`: max queued messages per session. Out-of-range values are ignored.
- `drop`: policy when the queue is full:
  - `summarize` (default): drop the oldest queued entries as needed, keep compact summaries, and inject them as a synthetic followup prompt.
  - a drop-oldest policy: drop the oldest queued entries as needed, without preserving summaries.
  - a reject policy: reject the newest message when the queue is already full.
Defaults: `drop` is `summarize`; `debounceMs` and `cap` fall back to built-in values when unset.
Precedence
For mode selection, OpenClaw resolves:
- Inline or stored per-session override.
- `messages.queue.byChannel.<channel>`.
- Global `messages.queue.mode`.
- Default `steer`.
For options, inline or stored per-session `/queue` options win over config. Then channel-specific debounce (`messages.queue.debounceMsByChannel`), plugin debounce defaults, global `messages.queue` options, and built-in defaults are applied. `cap` and `drop` are global/session options, not per-channel config keys.
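The mode-precedence chain can be sketched with nullish coalescing. Field names here mirror the config keys shown above; the function name and signature are illustrative, not OpenClaw's internals.

```typescript
// Sketch of mode resolution: session override > per-channel > global > default.
interface QueueConfig {
  mode?: string;                      // messages.queue.mode
  byChannel?: Record<string, string>; // messages.queue.byChannel
}

function resolveMode(
  sessionOverride: string | undefined,
  channel: string,
  cfg: QueueConfig,
): string {
  return (
    sessionOverride ??          // 1. inline or stored per-session override
    cfg.byChannel?.[channel] ?? // 2. messages.queue.byChannel.<channel>
    cfg.mode ??                 // 3. global messages.queue.mode
    "steer"                     // 4. built-in default
  );
}
```

With the earlier example config, a Discord session with no override resolves to `collect`, while other channels fall through to the global `steer`.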
Per-session overrides
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:0.5s cap:25 drop:summarize`
- Sending the command with its reset form clears the session override.
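Parsing the command form shown above is mostly token splitting. This is a hypothetical parser (the `QueueOverride` shape and `parseQueueCommand` name are illustrative; OpenClaw's real command handling may differ): a bare word sets the mode, and `key:value` pairs become options.

```typescript
// Hypothetical parser for "/queue <mode> key:value ..." commands.
interface QueueOverride {
  mode?: string;
  options: Record<string, string>;
}

function parseQueueCommand(input: string): QueueOverride | null {
  const parts = input.trim().split(/\s+/);
  if (parts[0] !== "/queue") return null; // not our command
  const out: QueueOverride = { options: {} };
  for (const part of parts.slice(1)) {
    const i = part.indexOf(":");
    if (i === -1) {
      out.mode = part; // e.g. "collect"
    } else {
      out.options[part.slice(0, i)] = part.slice(i + 1); // e.g. "cap:25"
    }
  }
  return out;
}
```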
Scope and guarantees
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
- The default lane (`main`) is process-wide for inbound messages and main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
- Additional lanes may exist (for example cron and subagent lanes) so background jobs can run in parallel without blocking inbound replies. Isolated cron agent turns hold a slot on their lane while their inner agent execution runs on a separate lane. Shared non-cron flows keep their own lane behavior. These detached runs are tracked as background tasks.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.
Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- If you need queue depth, enable verbose logs and watch for queue timing lines.
- Codex app-server runs that accept a turn and then stop emitting progress are interrupted by the Codex adapter so the active session lane can release instead of waiting for the outer run timeout.
- When diagnostics are enabled, sessions that remain busy past `diagnostics.stuckSessionWarnMs` log a stuck-session warning. Active embedded runs, active reply operations, and active lane tasks remain warning-only by default; stale startup bookkeeping with no active session work can release the affected session lane so queued work drains.
Related