Browser control API

For setup, configuration, and troubleshooting, see Browser. This page is the reference for the local control HTTP API, the `openclaw browser` CLI, and scripting patterns (snapshots, refs, waits, debug flows).

Control API (optional)

For local integrations only, the Gateway exposes a small loopback HTTP API:

- Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
- Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
- Actions: `POST /navigate`, `POST /act`
- Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
- Downloads: `POST /download`, `POST /wait/download`
- Permissions: `POST /permissions/grant`
- Debugging: `GET /console`, `POST /pdf`, `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight`
- Network: `POST /response/body`
- State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear`, `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
- Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`

All endpoints accept `?profile=<name>`. `POST /start?headless=true` requests a one-shot headless launch for local managed profiles without changing persisted browser config; attach-only, remote CDP, and existing-session profiles reject that override because OpenClaw does not launch those browser processes.
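As a rough sketch of how a local script might address these routes, the helper below appends the `profile` query parameter (and any extra parameters such as `headless=true`) to a route path. The helper name `control_url` and the base address `http://127.0.0.1:9999` are placeholders assumed for illustration; substitute the loopback address your Gateway actually uses.

```python
from urllib.parse import urlencode


def control_url(base_url, path, profile=None, **params):
    """Build a control API URL with an optional profile and extra query params."""
    query = {}
    if profile is not None:
        query["profile"] = profile
    query.update(params)
    # Join with exactly one slash between the base address and the route path.
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return f"{url}?{urlencode(query)}" if query else url


# Hypothetical loopback address; not taken from the OpenClaw docs.
BASE = "http://127.0.0.1:9999"
print(control_url(BASE, "/tabs", profile="work"))
print(control_url(BASE, "/start", profile="work", headless="true"))
```

Keeping URL construction in one place makes it harder to forget the profile when a script talks to more than one browser profile.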

If shared-secret gateway auth is configured, browser HTTP routes require auth too:

- `Authorization: Bearer <gateway token>`
- `x-openclaw-password: <gateway password>` or HTTP Basic auth with that password

Notes:

- This standalone loopback browser API does not consume trusted-proxy or Tailscale Serve identity headers.
- If `gateway.auth.mode` is `none` or `trusted-proxy`, these loopback browser routes do not inherit those identity-bearing modes; keep them loopback-only.
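For scripts that call the HTTP routes directly, the shared-secret headers described above could be assembled as in this minimal sketch. The function name `auth_headers` and the rule that a token takes precedence over a password are assumptions made for the example, not documented OpenClaw behavior.

```python
def auth_headers(token=None, password=None):
    """Return request headers for shared-secret gateway auth, or {} when none is set."""
    if token is not None:
        # Bearer form: Authorization: Bearer <gateway token>
        return {"Authorization": f"Bearer {token}"}
    if password is not None:
        # Password form: x-openclaw-password: <gateway password>
        return {"x-openclaw-password": password}
    return {}


print(auth_headers(token="example-token"))
print(auth_headers(password="example-password"))
```

The returned dictionary can be passed as the `headers` argument of whichever HTTP client the script already uses.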

`/act` error contract

`POST /act` uses a structured error response for route-level validation and policy failures:

```json
{ "error": "<message>", "code": "ACT_*" }
```

Current `code` values:

- `ACT_KIND_REQUIRED` (HTTP 400): `kind` is missing or unrecognized.
- `ACT_INVALID_REQUEST` (HTTP 400): action payload failed normalization or validation.
- `ACT_SELECTOR_UNSUPPORTED` (HTTP 400): `selector` was used with an unsupported action kind.
- `ACT_EVALUATE_DISABLED` (HTTP 403): `evaluate` (or `wait --fn`) is disabled by config.
- `ACT_TARGET_ID_MISMATCH` (HTTP 403): top-level or batched `targetId` conflicts with request target.
- `ACT_EXISTING_SESSION_UNSUPPORTED` (HTTP 501): action is not supported for existing-session profiles.

Other runtime failures may still return `{ "error": "<message>" }` without a `code` field.
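A client can treat the two response shapes uniformly. The sketch below parses an `/act` error body and reports whether the code and HTTP status match the documented pairs; the helper names `parse_act_error` and `is_documented` and the sample error message are illustrative assumptions.

```python
import json

# HTTP statuses documented for each structured /act error code.
DOCUMENTED_ACT_CODES = {
    "ACT_KIND_REQUIRED": 400,
    "ACT_INVALID_REQUEST": 400,
    "ACT_SELECTOR_UNSUPPORTED": 400,
    "ACT_EVALUATE_DISABLED": 403,
    "ACT_TARGET_ID_MISMATCH": 403,
    "ACT_EXISTING_SESSION_UNSUPPORTED": 501,
}


def parse_act_error(body_text):
    """Return (code, message); code is None when the body carries no `code` field."""
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return None, body_text
    if not isinstance(payload, dict):
        return None, body_text
    return payload.get("code"), str(payload.get("error", ""))


def is_documented(code, status):
    """True when the code/status pair matches one of the documented values."""
    return DOCUMENTED_ACT_CODES.get(code) == status


# Sample body; the message text is made up for the example.
code, message = parse_act_error('{"error": "evaluate is disabled", "code": "ACT_EVALUATE_DISABLED"}')
print(code, is_documented(code, 403))
```

Falling back to `(None, message)` keeps uncoded runtime failures on the same handling path as structured ones.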

Playwright requirement

Some features (navigate/act/AI snapshot/role snapshot, element screenshots, PDF) require Playwright. If Playwright isn’t installed, those endpoints return a clear 501 error.

What still works without Playwright:

- ARIA snapshots
- Role-style accessibility snapshots (`--interactive`, `--compact`, `--depth`, `--efficient`) when a per-tab CDP WebSocket is available. This is a fallback for inspection and ref discovery; Playwright remains the primary action engine.
- Page screenshots for the managed `openclaw` browser when a per-tab CDP WebSocket is available
- Page screenshots for `existing-session` / Chrome MCP profiles
- `existing-session` ref-based screenshots (`--ref`) from snapshot output

What still needs Playwright:

- `navigate`
- `act`
- AI snapshots that depend on Playwright's native AI snapshot format
- CSS-selector element screenshots (`--element`)
- full browser PDF export

Element screenshots also reject `--full-page`; the route returns `fullPage is not supported for element screenshots`.

If you see `Playwright is not available in this gateway build`, repair the bundled browser plugin runtime dependencies so `playwright-core` is installed, then restart the gateway. For packaged installs, run `openclaw doctor --fix`. For Docker, also install the Chromium browser binaries as shown below.

Docker Playwright install

If your Gateway runs in Docker, avoid `npx playwright` (npm override conflicts). Use the bundled CLI instead:

```bash
docker compose run --rm openclaw-cli \
  node /app/node_modules/playwright-core/cli.js install chromium
```

To persist browser downloads, set `PLAYWRIGHT_BROWSERS_PATH` (for example, `/home/node/.cache/ms-playwright`) and make sure `/home/node` is persisted via `OPENCLAW_HOME_VOLUME` or a bind mount. See Docker.

How it works (internal)

A small loopback control server accepts HTTP requests and connects to Chromium-based browsers via CDP. Advanced actions (click/type/snapshot/PDF) go through Playwright on top of CDP; when Playwright is missing, only non-Playwright operations are available. The agent sees one stable interface while local/remote browsers and profiles swap freely underneath.

CLI quick reference

All commands accept `--browser-profile <name>` to target a specific profile, and `--json` for machine-readable output.

Notes:

- `upload` and `dialog` are arming calls; run them before the click/press that triggers the chooser/dialog.
- `click`/`type`/etc. require a `ref` from `snapshot` (numeric `12`, role ref `e12`, or actionable ARIA ref `ax12`). CSS selectors are intentionally not supported for actions. Use `click-coords` when the visible viewport position is the only reliable target.
- Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
- `upload` can also set file inputs directly via `--input-ref` or `--element`.

Stable tab ids and labels survive Chromium raw-target replacement when OpenClaw can prove the replacement tab, such as same URL or a single old tab becoming a single new tab after form submission. Raw target ids are still volatile; prefer `suggestedTargetId` from `tabs` in scripts.

Snapshot flags at a glance:

- `--format ai` (default with Playwright): AI snapshot with numeric refs (`aria-ref="<n>"`).
- `--format aria`: accessibility tree with `axN` refs. When Playwright is available, OpenClaw binds refs with backend DOM ids to the live page so follow-up actions can use them; otherwise treat the output as inspection-only.
- `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see Gateway configuration).
- `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
- `--labels` adds a viewport-only screenshot with overlaid ref labels (prints `MEDIA:<path>`).
- `--urls` appends discovered link destinations to AI snapshots.

Snapshots and refs

OpenClaw supports three “snapshot” styles:

- AI snapshot (numeric refs): `openclaw browser snapshot` (default; `--format ai`)
  - Output: a text snapshot that includes numeric refs.
  - Actions: `openclaw browser click 12`, `openclaw browser type 23 "hello"`.
  - Internally, the ref is resolved via Playwright’s `aria-ref`.
- Role snapshot (role refs like `e12`): `openclaw browser snapshot --interactive` (or `--compact`, `--depth`, `--selector`, `--frame`)
  - Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
  - Actions: `openclaw browser click e12`, `openclaw browser highlight e12`.
  - Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
  - Add `--labels` to include a viewport screenshot with overlaid `e12` labels.
  - Add `--urls` when link text is ambiguous and the agent needs concrete navigation targets.
- ARIA snapshot (ARIA refs like `ax12`): `openclaw browser snapshot --format aria`
  - Output: the accessibility tree as structured nodes.
  - Actions: `openclaw browser click ax12` works when the snapshot path can bind the ref through Playwright and Chrome backend DOM ids.

If Playwright is unavailable, ARIA snapshots can still be useful for inspection, but refs may not be actionable. Re-snapshot with `--format ai` or `--interactive` when you need action refs.

Docker proof for the raw-CDP fallback path: `pnpm test:docker:browser-cdp-snapshot` starts Chromium with CDP, runs `browser doctor --deep`, and verifies role snapshots include link URLs, cursor-promoted clickables, and iframe metadata.

Ref behavior:

- Refs are not stable across navigations; if something fails, re-run `snapshot` and use a fresh ref.
- `/act` returns the current raw `targetId` after action-triggered replacement when it can prove the replacement tab. Keep using stable tab ids/labels for follow-up commands.
- If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
- Unknown or stale `axN` refs fail fast instead of falling through to Playwright's `aria-ref` selector. Run a fresh snapshot on the same tab when that happens.
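When a script handles refs from more than one snapshot style, it can help to tell them apart before issuing an action. The sketch below classifies a ref string by shape; the patterns are inferred from the examples `12`, `e12`, and `ax12` on this page and are an assumption, not a documented grammar. The function name `classify_ref` is hypothetical.

```python
import re

# Patterns inferred from the documented examples; checked in this order.
_REF_PATTERNS = (
    ("aria", re.compile(r"ax\d+")),
    ("role", re.compile(r"e\d+")),
    ("numeric", re.compile(r"\d+")),
)


def classify_ref(ref):
    """Return 'aria', 'role', or 'numeric' for a ref string, or None if it matches none."""
    for kind, pattern in _REF_PATTERNS:
        if pattern.fullmatch(ref):
            return kind
    return None


for sample in ("12", "e12", "ax12", "#main"):
    print(sample, classify_ref(sample))
```

A `None` result is a cue to take a fresh snapshot instead of passing something like a CSS selector to an action.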

Wait power-ups

You can wait on more than just time/text:

- Wait for URL (globs supported by Playwright):
  - `openclaw browser wait --url "**/dash"`
- Wait for load state:
  - `openclaw browser wait --load networkidle`
- Wait for a JS predicate:
  - `openclaw browser wait --fn "window.ready===true"`
- Wait for a selector to become visible:
  - `openclaw browser wait "#main"`

These can be combined:


```bash
openclaw browser wait "#main" \
  --url "**/dash" \
  --load networkidle \
  --fn "window.ready===true" \
  --timeout-ms 15000
```
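From a script, the same combined wait can be assembled as an argument list, which sidesteps shell quoting for selectors and JS predicates. This is a minimal sketch: `build_wait_command` is a hypothetical helper, and it uses only the flags shown above.

```python
import shlex


def build_wait_command(selector=None, url=None, load=None, fn=None, timeout_ms=None):
    """Assemble argv for `openclaw browser wait` from the optional conditions."""
    argv = ["openclaw", "browser", "wait"]
    if selector is not None:
        argv.append(selector)
    for flag, value in (("--url", url), ("--load", load), ("--fn", fn)):
        if value is not None:
            argv += [flag, value]
    if timeout_ms is not None:
        argv += ["--timeout-ms", str(timeout_ms)]
    return argv


argv = build_wait_command(
    "#main",
    url="**/dash",
    load="networkidle",
    fn="window.ready===true",
    timeout_ms=15000,
)
# Print a shell-safe rendering for logs; pass argv itself to subprocess.run.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` runs the command without a shell, so characters such as `#` and `*` need no escaping.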

Debug workflows

When an action fails (e.g. “not visible”, “strict mode violation”, “covered”):

1. `openclaw browser snapshot --interactive`
2. Use `click <ref>` / `type <ref>` (prefer role refs in interactive mode)
3. If it still fails: `openclaw browser highlight <ref>` to see what Playwright is targeting
4. If the page behaves oddly:
   - `openclaw browser errors --clear`
   - `openclaw browser requests --filter api --clear`
5. For deep debugging: record a trace:
   - `openclaw browser trace start`
   - reproduce the issue
   - `openclaw browser trace stop` (prints `TRACE:<path>`)

JSON output

`--json` is for scripting and structured tooling.

Examples:


```bash
openclaw browser status --json
openclaw browser snapshot --interactive --json
openclaw browser requests --filter api --json
openclaw browser cookies --json
```

Role snapshots in JSON include `refs` plus a small `stats` block (lines/chars/refs/interactive) so tools can reason about payload size and density.
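One way a tool might use the `stats` block is to decide whether a snapshot is small enough to pass along. The sketch below assumes the JSON has a top-level `stats` object with numeric `chars`, `refs`, and `interactive` keys; that exact shape, the helper name `summarize_snapshot`, and the `char_budget` threshold are assumptions for illustration.

```python
import json


def summarize_snapshot(raw_json, char_budget=20000):
    """Summarize payload size and interactive density from a role snapshot's stats."""
    payload = json.loads(raw_json)
    stats = payload.get("stats", {})
    chars = stats.get("chars", 0)
    refs = stats.get("refs", 0)
    interactive = stats.get("interactive", 0)
    return {
        "chars": chars,
        "refs": refs,
        # Share of refs that are interactive; 0.0 when there are no refs.
        "interactive_ratio": interactive / refs if refs else 0.0,
        "over_budget": chars > char_budget,
    }


# Made-up sample payload in the assumed shape.
sample = '{"refs": {}, "stats": {"lines": 40, "chars": 1200, "refs": 10, "interactive": 5}}'
print(summarize_snapshot(sample))
```

When `over_budget` is true, a script could re-run the snapshot with `--efficient` or a smaller `--depth`.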

State and environment knobs

These are useful for “make the site behave like X” workflows:

- Cookies: `cookies`, `cookies set`, `cookies clear`
- Storage: `storage local|session get|set|clear`
- Offline: `set offline on|off`
- Headers: `set headers --headers-json '{"X-Debug":"1"}'` (legacy `set headers --json '{"X-Debug":"1"}'` remains supported)
- HTTP basic auth: `set credentials user pass` (or `--clear`)
- Geolocation: `set geo <lat> <lon> --origin "https://example.com"` (or `--clear`)
- Media: `set media dark|light|no-preference|none`
- Timezone / locale: `set timezone ...`, `set locale ...`
- Device / viewport:
  - `set device "iPhone 14"` (Playwright device presets)
  - `set viewport 1280 720`
Security and privacy

- The openclaw browser profile may contain logged-in sessions; treat it as sensitive.
- `browser act kind=evaluate` / `openclaw browser evaluate` and `wait --fn` execute arbitrary JavaScript in the page context. Prompt injection can steer this. Disable it with `browser.evaluateEnabled=false` if you do not need it.
- For logins and anti-bot notes (X/Twitter, etc.), see Browser login + X/Twitter posting.
- Keep the Gateway/node host private (loopback or tailnet-only).
- Remote CDP endpoints are powerful; tunnel and protect them.

Strict-mode example (block private/internal destinations by default):


```json5
{
  browser: {
    ssrfPolicy: {
      dangerouslyAllowPrivateNetwork: false,
      hostnameAllowlist: ["*.example.com", "example.com"],
      allowedHostnames: ["localhost"], // optional exact allow
    },
  },
}
```

Related

- Browser: overview, configuration, profiles, security
- Browser login: signing in to sites
- Browser Linux troubleshooting
- Browser WSL2 troubleshooting
