Prompt caching means the model provider can reuse unchanged prompt prefixes (usually system/developer instructions and other stable context) across turns instead of re-processing them every time. OpenClaw normalizes provider usage into `cacheRead` and `cacheWrite` counters. Status surfaces can also recover cache counters from the most recent transcript usage log when the live session snapshot is missing them, so `/status` can still report cache activity.

Why this matters: lower token cost, faster responses, and more predictable performance for long-running sessions. Without caching, repeated prompts pay the full prompt cost on every turn even when most input did not change.
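As a rough illustration of that normalization (field names other than `cacheRead` and `cacheWrite` are placeholders, not OpenClaw's actual schema), a single turn's usage might be summarized like this:

```yaml
# Illustrative only: cacheRead/cacheWrite are the normalized counters described above;
# the surrounding field names and numbers are invented for this sketch.
usage:
  input: 1200        # non-cached prompt tokens billed this turn
  output: 350        # completion tokens
  cacheRead: 4608    # prompt tokens served from the provider's cache
  cacheWrite: 0      # prompt tokens newly written to the cache this turn
```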
The sections below cover every cache-related knob that affects prompt reuse and token cost.
`cacheRetention` sets cache retention as a global default for all models:
```yaml
agents:
  defaults:
    params:
      cacheRetention: "long" # none | short | long
```
Override per-model:
```yaml
agents:
  defaults:
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "short" # none | short | long
```
Per-agent override:
```yaml
agents:
  list:
    - id: "alerts"
      params:
        cacheRetention: "none"
```
Config merge order:
1. `agents.defaults.params`
2. `agents.defaults.models["provider/model"].params`
3. `agents.list[].params`

Later entries override earlier ones, so the most specific setting wins.
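A sketch combining all three levels (values reuse the examples above); under this merge order agent `alerts` ends up with `cacheRetention: "none"`:

```yaml
agents:
  defaults:
    params:
      cacheRetention: "long"        # 1. global default
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "short"   # 2. per-model override
  list:
    - id: "alerts"
      params:
        cacheRetention: "none"      # 3. per-agent override (wins for this agent)
```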
`contextPruning.mode: "cache-ttl"` prunes old tool-result context after cache TTL windows so post-idle requests do not re-cache oversized history:

```yaml
agents:
  defaults:
    contextPruning:
      mode: "cache-ttl"
      ttl: "1h"
```
See Session Pruning for full behavior.
Heartbeat can keep cache windows warm and reduce repeated cache writes after idle gaps.
```yaml
agents:
  defaults:
    heartbeat:
      every: "55m"
```
Per-agent heartbeat is supported at `agents.list[].heartbeat`.
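For example, to give only one agent a heartbeat (this mirrors the combined example further down the page):

```yaml
agents:
  list:
    - id: "research"
      heartbeat:
        every: "55m"
```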
Provider behavior differs in how `cacheRetention` is applied and how cache usage is reported:

- Anthropic (`api.anthropic.com`): `cacheRetention: "short"` and `cacheRetention: "long"` both enable Anthropic prompt caching. Usage comes back as `cache_read_input_tokens` and `cache_creation_input_tokens`, which OpenClaw maps to `cacheRead` and `cacheWrite`.
- OpenAI: with `cacheRetention: "long"`, OpenClaw sends `prompt_cache_key` and `prompt_cache_retention: "24h"`; `prompt_cache_key` is only sent when the model advertises `compat.supportsPromptCacheKey: true`, and `cacheRetention: "none"` disables them (a config sketch follows this list). Cached input is read from `usage.prompt_tokens_details.cached_tokens` (or `input_tokens_details.cached_tokens`) into `cacheRead`; `cacheWrite` stays `0` because OpenAI does not report cache writes separately. Response headers such as `x-request-id`, `openai-processing-ms`, and the `x-ratelimit-*` family are also captured.
- `anthropic-vertex/*`: `cacheRetention` applies the same way as for direct Anthropic, including `cacheRetention: "long"`, when routed through the `anthropic-vertex` provider.
- `amazon-bedrock/*` (`anthropic.claude*` models): `cacheRetention` applies; `cacheRetention: "none"` disables caching for these models.

Warm-turn `cacheRead` values in the combined live gate typically land around `4608` to `4864` tokens (see the live cache regression section below).
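A sketch of opting one OpenAI-family model into long retention; the model id and the placement of the `compat` block are assumptions for illustration:

```yaml
agents:
  defaults:
    models:
      "openai/gpt-5.4-mini":            # assumed provider/model id for illustration
        params:
          cacheRetention: "long"        # requests 24h prompt cache retention
        compat:
          supportsPromptCacheKey: true  # assumed location; allows prompt_cache_key to be sent
```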
For `openrouter/anthropic/*` models, OpenClaw injects Anthropic `cache_control` markers when requests go to the stock `openrouter` endpoint (`openrouter.ai`).
For `openrouter/deepseek/*`, `openrouter/moonshot*/*`, and `openrouter/zai/*` models, caching is handled on the provider side rather than through `cache_control` markers, and `contextPruning.mode: "cache-ttl"` still applies. DeepSeek cache construction is best-effort and can take a few seconds. An immediate follow-up may still show `cached_tokens: 0` under `usage.prompt_tokens_details.cached_tokens`.
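To illustrate that delay with invented numbers, an immediate retry and a retry a few seconds later might report:

```yaml
# Invented usage numbers, shown only to illustrate the delayed cache build.
immediate_follow_up:
  usage:
    prompt_tokens_details:
      cached_tokens: 0        # cache not built yet
retry_after_a_few_seconds:
  usage:
    prompt_tokens_details:
      cached_tokens: 4096     # same prefix now served from cache
```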
If you repoint the model at an arbitrary OpenAI-compatible proxy URL, OpenClaw stops injecting those OpenRouter-specific Anthropic cache markers.
If the provider does not support this cache mode, the `cacheRetention` setting has no effect.

With `api: "google-generative-ai"` (Gemini), cached usage is reported as `cachedContentTokenCount` and mapped to `cacheRead`. `cacheRetention` does not drive Gemini's explicit `cachedContents` API; an existing cache can instead be attached via `params.cachedContent` (or `params.cached_content`), which points at a `cachedContents` resource. In usage stats, `stats.cached` feeds `cacheRead`, and `stats.input` reports the non-cached remainder, `stats.input_tokens - stats.cached`.
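A minimal sketch of attaching an existing Gemini explicit cache; the resource name is a placeholder, and whether you set it globally or per model depends on your config layout:

```yaml
agents:
  defaults:
    params:
      cachedContent: "cachedContents/your-cache-id"   # placeholder handle, not a real cache
```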
OpenClaw splits the system prompt into a stable prefix and a volatile suffix separated by an internal cache-prefix boundary. Content above the boundary (tool definitions, skills metadata, workspace files, and other relatively static context) is ordered so it stays byte-identical across turns. Content below the boundary (for example `HEARTBEAT.md` and other per-turn status) can change without invalidating the cached prefix.

Key design choices: volatile content such as `HEARTBEAT.md` always lives below the boundary, and everything above the boundary is serialized deterministically.
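A schematic (not literal config) of how the boundary divides the system prompt; items beyond those named above are illustrative:

```yaml
# Schematic only: describes the layout, not an OpenClaw config key.
systemPrompt:
  stablePrefix:      # byte-identical across turns, so it can be cached
    - tool definitions
    - skills metadata
    - workspace files
  volatileSuffix:    # may change every turn, never invalidates the prefix
    - HEARTBEAT.md
    - other per-turn status   # assumption: example of volatile content
```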
If you see unexpected `cacheWrite` activity, check whether anything above the boundary changed between turns (the cache trace described below shows the exact payload). OpenClaw also keeps several cache-sensitive payload shapes deterministic before the request reaches the provider; for example, tool definitions returned by `listTools()` are kept in a stable order.

Keep a long-lived baseline on your main agent, disable caching on bursty notifier agents:
```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long"
  list:
    - id: "research"
      default: true
      heartbeat:
        every: "55m"
    - id: "alerts"
      params:
        cacheRetention: "none"
```
cacheRetention: "short"contextPruning.mode: "cache-ttl"OpenClaw exposes dedicated cache-trace diagnostics for embedded agent runs.
For normal user-facing diagnostics, `/status` reports the `cacheRead` and `cacheWrite` counters.

OpenClaw keeps one combined live cache regression gate for repeated prefixes, tool turns, image turns, MCP-style tool transcripts, and an Anthropic no-cache control.
The gate lives in `src/agents/live-cache-regression.live.test.ts`, with its baseline in `src/agents/live-cache-regression-baseline.ts`. Run the narrow live gate with:
```sh
OPENCLAW_LIVE_TEST=1 OPENCLAW_LIVE_CACHE_TEST=1 pnpm test:live:cache
```
The baseline file stores the most recent observed live numbers plus the provider-specific regression floors used by the test. The runner also uses fresh per-run session IDs and prompt namespaces so previous cache state does not pollute the current regression sample.
These tests intentionally do not use identical success criteria across providers.
Cached scenarios must show `cacheWrite` on the first turn and `cacheRead` on warm turns, while the no-cache control must keep `cacheWrite` at `0`. The per-scenario floors include `cacheRead >= 4608` with a cached-ratio floor of `>= 0.90` for the `gpt-5.4-mini` scenario, plus `cacheRead >= 4096` (`>= 0.85`), `cacheRead >= 3840` (`>= 0.82`), and `cacheRead >= 4096` (`>= 0.85`) for the remaining scenarios. Fresh combined live verification on 2026-04-04 landed at:

- `cacheRead=4864`, ratio `0.966`
- `cacheRead=4608`, ratio `0.896`
- `cacheRead=4864`, ratio `0.954`
- `cacheRead=4608`, ratio `0.891`

Recent local wall-clock time for the combined gate was about `88s`. Why the assertions differ: cache reporting and minimum cacheable prefix sizes vary by provider, so each scenario carries its own floor.
Enable the trace via `diagnostics.cacheTrace`:

```yaml
diagnostics:
  cacheTrace:
    enabled: true
    filePath: "~/.openclaw/logs/cache-trace.jsonl" # optional
    includeMessages: false # default true
    includePrompt: false # default true
    includeSystem: false # default true
```
Defaults:
- `filePath`: `$OPENCLAW_STATE_DIR/logs/cache-trace.jsonl`
- `includeMessages`: `true`
- `includePrompt`: `true`
- `includeSystem`: `true`

Environment variable overrides:

- `OPENCLAW_CACHE_TRACE=1`
- `OPENCLAW_CACHE_TRACE_FILE=/path/to/cache-trace.jsonl`
- `OPENCLAW_CACHE_TRACE_MESSAGES=0|1`
- `OPENCLAW_CACHE_TRACE_PROMPT=0|1`
- `OPENCLAW_CACHE_TRACE_SYSTEM=0|1`

Trace records are emitted at the `session:loaded`, `prompt:before`, `stream:context`, and `session:after` stages and include the `cacheRead` and `cacheWrite` counters. `/usage full` also reports `cacheRead` and `cacheWrite`. If both counters stay at `0`, the provider is not reporting cache usage for that model. Repeated `cacheWrite` without a later `cacheRead` usually means the cached prefix is changing between turns, and `prompt_cache_key` is only sent where the model supports it. To disable caching for a single model, set `cacheRetention` under `agents.defaults.models["provider/model"]` to `none`.
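For orientation, one trace record might look roughly like this (shown as YAML for readability; the file itself is JSON Lines, and every field name except the event stages and the cache counters is an assumption):

```yaml
# Illustrative record only. Real field names may differ; only the event stage names
# and cacheRead/cacheWrite come from this page.
event: "prompt:before"
sessionId: "example-session"            # assumption
model: "anthropic/claude-opus-4-6"
cacheRead: 4608
cacheWrite: 0
```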