: Kills any lingering gateway process holding the default control port, then runs the full Vitest suite with an isolated gateway port so server tests don’t collide with a running instance. Use this when a prior gateway run left port 18789 occupied.
`pnpm test:coverage`: Runs the unit suite with V8 coverage (via `vitest.unit.config.ts`). This is a loaded-file unit coverage gate, not whole-repo all-file coverage. Thresholds are 70% lines/functions/statements and 55% branches. Because `coverage.all` is false, the gate measures files loaded by the unit coverage suite instead of treating every split-lane source file as uncovered.
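To make the gate concrete, the threshold check can be sketched as a small comparison. This is only an illustration: the `CoverageSummary` shape and the `failingMetrics` helper are hypothetical names, not part of the repo or of Vitest; only the percentages come from the text above.

```typescript
// Hypothetical shapes for illustration; not the repo's actual config types.
type Metric = "lines" | "functions" | "statements" | "branches";
type CoverageSummary = Record<Metric, number>; // percentages, 0-100

// Thresholds quoted in the text above.
const thresholds: CoverageSummary = {
  lines: 70,
  functions: 70,
  statements: 70,
  branches: 55,
};

// Returns the metrics that fall below their threshold.
// An empty array means the gate would pass.
function failingMetrics(
  summary: CoverageSummary,
  limits: CoverageSummary = thresholds,
): Metric[] {
  return (Object.keys(limits) as Metric[]).filter(
    (metric) => summary[metric] < limits[metric],
  );
}
```

For example, a summary with 82% lines, 75% functions, 81% statements, and 50% branches would fail on `branches` only. Because the gate covers loaded files, the summary passed in would describe only files the unit suite actually imported.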
`pnpm test:coverage:changed`: Runs unit coverage only for files changed since `origin/main`.
`pnpm test:changed`: cheap smart changed test run. It runs precise targets from direct test edits, sibling `*.test.ts` files, explicit source mappings, and the local import graph. Broad/config/package changes are skipped unless they map to precise tests.
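As a rough illustration of the first two rules (direct test edits and sibling `*.test.ts` files), the selection could look like the sketch below. The function names are hypothetical and the explicit source mappings and import-graph steps are deliberately left out; this is not the repo's actual routing code.

```typescript
// Maps a source path to its sibling test path, e.g. src/a/foo.ts -> src/a/foo.test.ts.
// A path that is already a test file is returned unchanged; non-.ts paths yield undefined.
function siblingTestPath(sourcePath: string): string | undefined {
  if (sourcePath.endsWith(".test.ts")) return sourcePath;
  if (!sourcePath.endsWith(".ts")) return undefined;
  return sourcePath.slice(0, -".ts".length) + ".test.ts";
}

// Picks precise test targets for a set of changed files.
// `existingTests` stands in for a lookup of test files present on disk (an assumption).
function selectTargets(changed: string[], existingTests: Set<string>): string[] {
  const targets = new Set<string>();
  for (const file of changed) {
    const candidate = siblingTestPath(file);
    // Files with no precise test (for example package.json) are skipped, not broadened.
    if (candidate !== undefined && existingTests.has(candidate)) {
      targets.add(candidate);
    }
  }
  return [...targets].sort();
}
```

In this sketch a change to `package.json` selects nothing, which mirrors the "skipped unless they map to precise tests" behavior described above.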
`OPENCLAW_TEST_CHANGED_BROAD=1 pnpm test:changed`: explicit broad changed test run. Use it when a test harness/config/package edit should fall back to Vitest's broader changed-test behavior.
`pnpm changed:lanes`: shows the architectural lanes triggered by the diff against `origin/main`.
`pnpm check:changed`: runs the smart changed check gate for the diff against `origin/main`. It runs typecheck, lint, and guard commands for the affected architectural lanes, but does not run Vitest tests. Use `pnpm test:changed` or explicit `pnpm test <target>` for test proof.
`pnpm test`: routes explicit file/directory targets through scoped Vitest lanes. Untargeted runs use fixed shard groups and expand to leaf configs for local parallel execution; the extension group always expands to the per-extension shard configs instead of one giant root-project process.
Test wrapper runs end with a short `[test] passed|failed|skipped ... in ...` summary. Vitest's own duration line stays the per-shard detail.
Shared OpenClaw test state: use `src/test-utils/openclaw-test-state.ts` from Vitest when a test needs an isolated `HOME`, `OPENCLAW_STATE_DIR`, `OPENCLAW_CONFIG_PATH`, config fixture, workspace, agent dir, or auth-profile store.
Process E2E helpers: use `test/helpers/openclaw-test-instance.ts` when a Vitest process-level E2E test needs a running Gateway, CLI env, log capture, and cleanup in one place.
as a Node flag. Docker/Bash lanes that launch a Gateway can source `scripts/lib/openclaw-e2e-instance.sh` inside the container for entrypoint resolution, mock OpenAI startup, Gateway foreground/background launch, readiness probes, state env export, log dumps, and process cleanup.
Full, extension, and include-pattern shard runs update local timing data in `.artifacts/vitest-shard-timings.json`; later whole-config runs use those timings to balance slow and fast shards. Include-pattern CI shards append the shard name to the timing key, which keeps filtered shard timings visible without replacing whole-config timing data. Set `OPENCLAW_TEST_PROJECTS_TIMINGS=0` to ignore the local timing artifact.
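The text does not say how the timings are used to balance shards. One common approach is a longest-first greedy assignment, sketched below purely as an assumption; the `Timings` shape, the `balanceShards` name, and the fallback cost are hypothetical and do not describe the real format of the timing artifact.

```typescript
// Hypothetical timing shape: shard name -> last observed duration in milliseconds.
type Timings = Record<string, number>;

// Longest-first greedy balancing: sort shards by recorded duration (slowest first),
// then always hand the next shard to the currently least-loaded worker.
// Shards without timing data get `fallbackMs` as their assumed cost.
function balanceShards(
  shards: string[],
  timings: Timings,
  workers: number,
  fallbackMs = 1000,
): string[][] {
  if (workers < 1) throw new RangeError("workers must be at least 1");
  const cost = (shard: string): number => timings[shard] ?? fallbackMs;
  const buckets = Array.from({ length: workers }, () => ({
    total: 0,
    shards: [] as string[],
  }));
  const ordered = [...shards].sort((a, b) => cost(b) - cost(a));
  for (const shard of ordered) {
    let target = buckets[0];
    for (const bucket of buckets) {
      if (bucket.total < target.total) target = bucket;
    }
    target.shards.push(shard);
    target.total += cost(shard);
  }
  return buckets.map((bucket) => bucket.shards);
}
```

With timings of 9000, 5000, 4000, and 1000 ms for shards `a` through `d` and two workers, this groups `a` with `d` and `b` with `c`, so a slow shard is paired with a fast one rather than with another slow one.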
Selected `plugin-sdk` and `commands` test files now route through dedicated light lanes that keep only `test/setup.ts`, leaving runtime-heavy cases on their existing lanes.
Source files with sibling tests map to that sibling before falling back to wider directory globs. Helper edits under `src/channels/plugins/contracts/test-helpers`, `src/plugin-sdk/test-helpers`, and `src/plugins/contracts` use a local import graph to run importing tests instead of broad-running every shard when the dependency path is precise.
`auto-reply` now also splits into three dedicated configs (`core`, `top-level`, `reply`) so the reply harness does not dominate the lighter top-level status/token/helper tests.
Base Vitest config now defaults to `pool: "threads"` and `isolate: false`, with the shared non-isolated runner enabled across the repo configs.
`pnpm test:channels` runs `vitest.channels.config.ts`.
`pnpm test:extensions` and `pnpm test extensions` run all extension/plugin shards. Heavy channel plugins, the browser plugin, and OpenAI run as dedicated shards; other plugin groups stay batched. Use `pnpm test extensions/<id>` for one bundled plugin lane.
`pnpm test:perf:imports`: enables Vitest import-duration + import-breakdown reporting, while still using scoped lane routing for explicit file/directory targets.
`pnpm test:perf:imports:changed`: same import profiling, but only for files changed since `origin/main`.
`pnpm test:perf:changed:bench -- --ref <git-ref>` benchmarks the routed changed-mode path against the native root-project run for the same committed git diff.
`pnpm test:perf:changed:bench -- --worktree` benchmarks the current worktree change set without committing first.
`pnpm test:perf:profile:main`: writes a CPU profile for the Vitest main thread (`.artifacts/vitest-main-profile`).
`pnpm test:perf:profile:runner`: writes CPU + heap profiles for the unit runner (
: runs every full-suite Vitest leaf config serially and writes grouped duration data plus per-config JSON/log artifacts. The Test Performance Agent uses this as its baseline before attempting slow-test fixes.
: Runs provider live tests (minimax/zai). Requires API keys and `LIVE=1` (or provider-specific `*_LIVE_TEST=1`) to unskip.
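The unskip rule can be sketched as a small predicate. This is an assumption-laden illustration: the `liveEnabled` name is hypothetical, and the sketch assumes the provider-specific flag is the upper-cased provider name followed by `_LIVE_TEST`, which the text only hints at with `*_LIVE_TEST=1`.

```typescript
type Env = Record<string, string | undefined>;

// Returns true when live tests for `provider` should run.
// Assumption: the provider-specific flag is `<PROVIDER>_LIVE_TEST`
// (for example MINIMAX_LIVE_TEST); the real variable names may differ.
function liveEnabled(provider: string, env: Env): boolean {
  const providerFlag = `${provider.toUpperCase()}_LIVE_TEST`;
  return env.LIVE === "1" || env[providerFlag] === "1";
}
```

In a Vitest file, a predicate like this could feed `describe.skipIf(!liveEnabled("minimax", process.env))` so the suite stays skipped unless one of the flags is set. API keys are still required separately.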
`pnpm test:docker:all`: Builds the shared live-test image, packs OpenClaw once as an npm tarball, builds/reuses a bare Node/Git runner image plus a functional image that installs that tarball into `/app`, then runs Docker smoke lanes with `OPENCLAW_SKIP_DOCKER_BUILD=1` through a weighted scheduler. The bare image (`OPENCLAW_DOCKER_E2E_BARE_IMAGE`) is used for installer/update/plugin-dependency lanes; those lanes mount the prebuilt tarball instead of using copied repo sources. The functional image (`OPENCLAW_DOCKER_E2E_FUNCTIONAL_IMAGE`) is used for normal built-app functionality lanes.
`scripts/package-openclaw-for-docker.mjs` is the single local/CI package packer and validates the tarball plus `dist/postinstall-inventory.json` before Docker consumes it. Docker lane definitions live in `scripts/lib/docker-e2e-scenarios.mjs`; planner logic lives in `scripts/lib/docker-e2e-plan.mjs`; `scripts/test-docker-all.mjs` executes the selected plan. `node scripts/test-docker-all.mjs --plan-json` emits the scheduler-owned CI plan for selected lanes, image kinds, package/live-image needs, state scenarios, and credential checks without building or running Docker.
`OPENCLAW_DOCKER_ALL_PARALLELISM=<n>` controls process slots and defaults to 10; `OPENCLAW_DOCKER_ALL_TAIL_PARALLELISM=<n>` controls the provider-sensitive tail pool and defaults to 10. Heavy lane caps default to `OPENCLAW_DOCKER_ALL_LIVE_LIMIT=9`, `OPENCLAW_DOCKER_ALL_NPM_LIMIT=10`, and `OPENCLAW_DOCKER_ALL_SERVICE_LIMIT=7`; provider caps default to one heavy lane per provider via `OPENCLAW_DOCKER_ALL_LIVE_CLAUDE_LIMIT=4`, `OPENCLAW_DOCKER_ALL_LIVE_CODEX_LIMIT=4`, and `OPENCLAW_DOCKER_ALL_LIVE_GEMINI_LIMIT=4`. Use `OPENCLAW_DOCKER_ALL_WEIGHT_LIMIT` or `OPENCLAW_DOCKER_ALL_DOCKER_LIMIT` for larger hosts. If one lane exceeds the effective weight or resource cap on a low-parallelism host, it can still start from an empty pool and will run alone until it releases capacity. Lane starts are staggered by 2 seconds by default to avoid local Docker daemon create storms; override with `OPENCLAW_DOCKER_ALL_START_STAGGER_MS=<ms>`.
The runner preflights Docker by default, cleans stale OpenClaw E2E containers, emits active-lane status every 30 seconds, shares provider CLI tool caches between compatible lanes, retries transient live-provider failures once by default (`OPENCLAW_DOCKER_ALL_LIVE_RETRIES=<n>`), and stores lane timings in `.artifacts/docker-tests/lane-timings.json` for longest-first ordering on later runs. Use `OPENCLAW_DOCKER_ALL_DRY_RUN=1` to print the lane manifest without running Docker, `OPENCLAW_DOCKER_ALL_STATUS_INTERVAL_MS=<ms>` to tune status output, or `OPENCLAW_DOCKER_ALL_TIMINGS=0` to disable timing reuse.
Use `OPENCLAW_DOCKER_ALL_LIVE_MODE=skip` for deterministic/local lanes only or `OPENCLAW_DOCKER_ALL_LIVE_MODE=only` for live-provider lanes only; package aliases are `pnpm test:docker:local:all` and `pnpm test:docker:live:all`. Live-only mode merges main and tail live lanes into one longest-first pool so provider buckets can pack Claude, Codex, and Gemini work together. The runner stops scheduling new pooled lanes after the first failure unless `OPENCLAW_DOCKER_ALL_FAIL_FAST=0` is set, and each lane has a 120-minute fallback timeout overrideable with `OPENCLAW_DOCKER_ALL_LANE_TIMEOUT_MS`; selected live/tail lanes use tighter per-lane caps. CLI backend Docker setup commands have their own timeout via
: Builds a Chromium-backed source E2E container, starts raw CDP plus an isolated Gateway, runs `browser doctor --deep`, and verifies CDP role snapshots include link URLs, cursor-promoted clickables, iframe refs, and frame metadata.
CLI backend live Docker probes can be run as focused lanes, for example `pnpm test:docker:live-cli-backend:codex`, `pnpm test:docker:live-cli-backend:codex:resume`, or `pnpm test:docker:live-cli-backend:codex:mcp`. Claude and Gemini have matching `:resume` and `:mcp` aliases.
`pnpm test:docker:openwebui`: Starts Dockerized OpenClaw + Open WebUI, signs in through Open WebUI, checks `/api/models`, then runs a real proxied chat through `/api/chat/completions`. Requires a usable live model key (for example OpenAI in `~/.profile`), pulls an external Open WebUI image, and is not expected to be CI-stable like the normal unit/e2e suites.
`pnpm test:docker:mcp-channels`: Starts a seeded Gateway container and a second client container that spawns `openclaw mcp serve`, then verifies routed conversation discovery, transcript reads, attachment metadata, live event queue behavior, outbound send routing, and Claude-style channel + permission notifications over the real stdio bridge. The Claude notification assertion reads the raw stdio MCP frames directly so the smoke reflects what the bridge actually emits.
`pnpm test:docker:upgrade-survivor`: Installs the packed OpenClaw tarball over a dirty old-user fixture, runs package update plus non-interactive doctor without live provider or channel keys, then starts a loopback Gateway and checks that agents, channel config, plugin allowlists, workspace/session files, stale plugin runtime-deps state, startup, and RPC status survive.
`pnpm test:docker:published-upgrade-survivor`: Installs `openclaw@latest` by default, seeds realistic existing-user files without live provider or channel keys, configures that baseline with a baked `openclaw config set` command recipe, updates that published install to the packed OpenClaw tarball, runs non-interactive doctor, writes `.artifacts/upgrade-survivor/summary.json`, then starts a loopback Gateway and checks that configured intents, workspace/session files, stale plugin config/runtime-deps state, startup, and RPC status survive or repair cleanly. Override the baseline with `OPENCLAW_UPGRADE_SURVIVOR_BASELINE_SPEC`; Package Acceptance exposes the same value as `published_upgrade_survivor_baseline`.
Local PR gate
For local PR land/gate checks, run:
`pnpm check:changed`
`pnpm check`
`pnpm check:test-types`
`pnpm build`
`pnpm test`
`pnpm check:docs`
If `pnpm test` flakes on a loaded host, rerun once before treating it as a regression, then isolate with