

    Inference CLI

`openclaw infer` is the canonical headless surface for provider-backed inference workflows. It intentionally exposes capability families, not raw gateway RPC names and not raw agent tool ids.

    Turn infer into a skill

    Copy and paste this to an agent:

```text
Read https://docs.openclaw.ai/cli/infer, then create a skill that routes my common workflows to `openclaw infer`. Focus on model runs, image generation, video generation, audio transcription, TTS, web search, and embeddings.
```

    A good infer-based skill should:

    • map common user intents to the correct infer subcommand
    • include a few canonical infer examples for the workflows it covers
• prefer `openclaw infer ...` in examples and suggestions
    • avoid re-documenting the entire infer surface inside the skill body

    Typical infer-focused skill coverage:

• `openclaw infer model run`
• `openclaw infer image generate`
• `openclaw infer audio transcribe`
• `openclaw infer tts convert`
• `openclaw infer web search`
• `openclaw infer embedding create`

    Why use infer

`openclaw infer` provides one consistent CLI for provider-backed inference tasks inside OpenClaw.

    Benefits:

    • Use the providers and models already configured in OpenClaw instead of wiring up one-off wrappers for each backend.
    • Keep model, image, audio transcription, TTS, video, web, and embedding workflows under one command tree.
• Use a stable `--json` output shape for scripts, automation, and agent-driven workflows.
    • Prefer a first-party OpenClaw surface when the task is fundamentally "run inference."
    • Use the normal local path without requiring the gateway for most infer commands.

For end-to-end provider checks, prefer `openclaw infer ...` once lower-level provider tests are green. It exercises the shipped CLI, config loading, default-agent resolution, bundled plugin activation, runtime-dependency repair, and the shared capability runtime before the provider request is made.
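As a minimal sketch of such a check (assuming `jq` is installed; the `.ok` field follows the JSON envelope documented later on this page):

```bash
# Hedged sketch: end-to-end provider check. The command and flags are from
# this page; jq and the .ok field are assumptions based on the JSON envelope
# section below. Local `model run` also exits non-zero on empty output, so
# the exit code alone is often enough.
openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json \
  | jq -e '.ok' >/dev/null && echo "provider path healthy"
```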

    Command tree

```text
openclaw infer
  list
  inspect
  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status
  image
    generate
    edit
    describe
    describe-many
    providers
  audio
    transcribe
    providers
  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider
  video
    generate
    describe
    providers
  web
    search
    fetch
    providers
  embedding
    create
    providers
```

    Common tasks

    This table maps common inference tasks to the corresponding infer command.

| Task | Command | Notes |
| --- | --- | --- |
| Run a text/model prompt | `openclaw infer model run --prompt "..." --json` | Uses the normal local path by default |
| Run a model prompt on images | `openclaw infer model run --prompt "Describe this" --file ./image.png --model provider/model` | Repeat `--file` for multiple image inputs |
| Generate an image | `openclaw infer image generate --prompt "..." --json` | Use `image edit` when starting from an existing file |
| Describe an image file | `openclaw infer image describe --file ./image.png --prompt "..." --json` | `--model` must be an image-capable `<provider/model>` |
| Transcribe audio | `openclaw infer audio transcribe --file ./memo.m4a --json` | `--model` must be `<provider/model>` |
| Synthesize speech | `openclaw infer tts convert --text "..." --output ./speech.mp3 --json` | `tts status` is gateway-oriented |
| Generate a video | `openclaw infer video generate --prompt "..." --json` | Supports provider hints such as `--resolution` |
| Describe a video file | `openclaw infer video describe --file ./clip.mp4 --json` | `--model` must be `<provider/model>` |
| Search the web | `openclaw infer web search --query "..." --json` | |
| Fetch a web page | `openclaw infer web fetch --url https://example.com --json` | |
| Create embeddings | `openclaw infer embedding create --text "..." --json` | |

    Behavior

• `openclaw infer ...` is the primary CLI surface for these workflows.
• Use `--json` when the output will be consumed by another command or script.
• Use `--provider` or `--model provider/model` when a specific backend is required.
• For `image describe`, `audio transcribe`, and `video describe`, `--model` must use the form `<provider/model>`.
• For `image describe`, an explicit `--model` runs that provider/model directly. The model must be image-capable in the model catalog or provider config. `codex/<model>` runs a bounded Codex app-server image-understanding turn; `openai-codex/<model>` uses the OpenAI Codex OAuth provider path.
• Stateless execution commands default to local.
• Gateway-managed state commands default to gateway.
• The normal local path does not require the gateway to be running.
• Local `model run` is a lean one-shot provider completion. It resolves the configured agent model and auth, but does not start a chat-agent turn, load tools, or open bundled MCP servers.
• `model run --file` accepts image files, detects their MIME type, and sends them with the supplied prompt to the selected model. Repeat `--file` for multiple images.
• `model run --file` rejects non-image inputs. Use `infer audio transcribe` for audio files and `infer video describe` for video files.
• `model run --gateway` exercises Gateway routing, saved auth, provider selection, and the embedded runtime, but still runs as a raw model probe: it sends the supplied prompt and any image attachments without prior session transcript, bootstrap/AGENTS context, context-engine assembly, tools, or bundled MCP servers.
• `model run --gateway --model <provider/model>` requires a trusted operator gateway credential because the request asks the Gateway to run a one-off provider/model override. Both probe paths are sketched after this list.
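A minimal side-by-side sketch of the local and gateway probes, using only flags documented on this page:

```bash
# Local one-shot probe: lean provider completion, no gateway required.
openclaw infer model run --prompt "Reply with exactly: pong" --json

# Same raw probe routed through the Gateway (requires a running gateway).
openclaw infer model run --gateway --prompt "Reply with exactly: pong" --json

# Per the note above, a one-off provider/model override on the gateway path
# requires a trusted operator gateway credential. Model id is illustrative.
openclaw infer model run --gateway --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
```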

    Model

Use `model` for provider-backed text inference and model/provider inspection.

```bash
openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json
openclaw infer model run --prompt "Summarize this changelog entry" --model openai/gpt-5.4 --json
openclaw infer model run --prompt "Describe this image in one sentence" --file ./photo.jpg --model google/gemini-2.5-flash --json
openclaw infer model providers --json
openclaw infer model inspect --name gpt-5.5 --json
```

Use full `<provider/model>` refs to smoke-test a specific provider without starting the Gateway or loading the full agent tool surface:

```bash
openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model cerebras/zai-glm-4.7 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model google/gemini-2.5-flash --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model groq/llama-3.1-8b-instant --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model mistral/mistral-small-latest --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model ollama/qwen2.5vl:7b --prompt "Describe this image." --file ./photo.jpg --json
```

    Notes:

• Local `model run` is the narrowest CLI smoke for provider/model/auth health because it sends only the supplied prompt to the selected model.
• Local `model run --file` keeps that lean path and attaches image content directly to the single user message. Common image files such as PNG, JPEG, and WebP work when their MIME type is detected as `image/*`; unsupported or unrecognized files fail before the provider is called.
• `model run --file` is best when you want to test the selected multimodal text model directly. Use `infer image describe` when you want OpenClaw's image-understanding provider selection and default image-model routing.
• The selected model must support image input; text-only models may reject the request at the provider layer.
• `model run --prompt` must contain non-whitespace text; empty prompts are rejected before local providers or the Gateway are called.
• Local `model run` exits non-zero when the provider returns no text output, so unreachable local providers and empty completions do not look like successful probes (a batch probe sketch follows these notes).
• Use `model run --gateway` when you need to test Gateway routing, agent-runtime setup, or Gateway-managed provider state while keeping the model input raw. Use `openclaw agent` or chat surfaces when you want the full agent context, tools, memory, and session transcript.
• `model auth login`, `model auth logout`, and `model auth status` manage saved provider auth state.

    Image

    Use

    text
    image
    for generation, edit, and description.

    bash
    openclaw infer image generate --prompt "friendly lobster illustration" --json openclaw infer image generate --prompt "cinematic product photo of headphones" --json openclaw infer image generate --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "simple red circle sticker on a transparent background" --json openclaw infer image generate --prompt "slow image backend" --timeout-ms 180000 --json openclaw infer image edit --file ./logo.png --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "keep the logo, remove the background" --json openclaw infer image edit --file ./poster.png --prompt "make this a vertical story ad" --size 2160x3840 --aspect-ratio 9:16 --resolution 4K --json openclaw infer image describe --file ./photo.jpg --json openclaw infer image describe --file ./receipt.jpg --prompt "Extract the merchant, date, and total" --json openclaw infer image describe-many --file ./before.png --file ./after.png --prompt "Compare the screenshots and list visible UI changes" --json openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --prompt "Describe the image in one sentence" --timeout-ms 300000 --json

    Notes:

• Use `image edit` when starting from existing input files.

• Use `--size`, `--aspect-ratio`, or `--resolution` with `image edit` for providers/models that support geometry hints on reference-image edits.

• Use `--output-format png --background transparent` with `--model openai/gpt-image-1.5` for transparent-background OpenAI PNG output; `--openai-background` remains available as an OpenAI-specific alias. Providers that do not declare background support report the hint as an ignored override.

• Use `image providers --json` to verify which bundled image providers are discoverable, configured, and selected, and which generation/edit capabilities each provider exposes.

• Use `image generate --model <provider/model> --json` as the narrowest live CLI smoke for image generation changes. Example:

  ```bash
  openclaw infer image providers --json
  openclaw infer image generate \
    --model google/gemini-3.1-flash-image-preview \
    --prompt "Minimal flat test image: one blue square on a white background, no text." \
    --output ./openclaw-infer-image-smoke.png \
    --json
  ```

  The JSON response reports `ok`, `provider`, `model`, `attempts`, and written output paths. When `--output` is set, the final extension may follow the provider's returned MIME type.

• For `image describe` and `image describe-many`, use `--prompt` to give the vision model a task-specific instruction such as OCR, comparison, UI inspection, or concise captioning (a chained generate-then-describe sketch follows these notes).

• Use `--timeout-ms` with slow local vision models or cold Ollama starts.

• For `image describe`, `--model` must be an image-capable `<provider/model>`.

• For local Ollama vision models, pull the model first and set `OLLAMA_API_KEY` to any placeholder value, for example `ollama-local`. See Ollama.

    Audio

Use `audio` for file transcription.

```bash
openclaw infer audio transcribe --file ./memo.m4a --json
openclaw infer audio transcribe --file ./team-sync.m4a --language en --prompt "Focus on names and action items" --json
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

    Notes:

• `audio transcribe` is for file transcription, not realtime session management (a batch sketch follows these notes).
• `--model` must be `<provider/model>`.

    TTS

Use `tts` for speech synthesis and TTS provider state.

```bash
openclaw infer tts convert --text "hello from openclaw" --output ./hello.mp3 --json
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
openclaw infer tts providers --json
openclaw infer tts status --json
```

    Notes:

• `tts status` defaults to gateway because it reflects gateway-managed TTS state.
• Use `tts providers`, `tts voices`, and `tts set-provider` to inspect and configure TTS behavior (a playback sketch follows these notes).
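A quick local round-trip sketch (the player is an assumption: `afplay` is macOS-specific; substitute `mpv`, `ffplay`, or another player elsewhere):

```bash
# Sketch: synthesize a notification and play it back locally.
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
afplay ./build-complete.mp3   # macOS; use mpv/ffplay on Linux
```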

    Video

Use `video` for generation and description.

```bash
openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --resolution 768P --duration 6 --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer video describe --file ./clip.mp4 --model openai/gpt-4.1-mini --json
```

    Notes:

• `video generate` accepts `--size`, `--aspect-ratio`, `--resolution`, `--duration`, `--audio`, `--watermark`, and `--timeout-ms` and forwards them to the video-generation runtime.
• `--model` must be `<provider/model>` for `video describe`.

    Web

Use `web` for search and fetch workflows.

```bash
openclaw infer web search --query "OpenClaw docs" --json
openclaw infer web search --query "OpenClaw infer web providers" --json
openclaw infer web fetch --url https://docs.openclaw.ai/cli/infer --json
openclaw infer web providers --json
```

    Notes:

• Use `web providers` to inspect available, configured, and selected providers.

    Embedding

Use `embedding` for vector creation and embedding provider inspection.

```bash
openclaw infer embedding create --text "friendly lobster" --json
openclaw infer embedding create --text "customer support ticket: delayed shipment" --model openai/text-embedding-3-large --json
openclaw infer embedding providers --json
```

    JSON output

    Infer commands normalize JSON output under a shared envelope:

```json
{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-2",
  "attempts": [],
  "outputs": []
}
```

    Top-level fields are stable:

• `ok`
• `capability`
• `transport`
• `provider`
• `model`
• `attempts`
• `outputs`
• `error`

For generated media commands, `outputs` contains files written by OpenClaw. Use the `path`, `mimeType`, `size`, and any media-specific dimensions in that array for automation instead of parsing human-readable stdout.
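For example, a hedged automation sketch against the stable fields (assuming `jq`; the `.error` shape beyond its presence is not specified here):

```bash
# Sketch: branch on the envelope instead of parsing human-readable stdout.
result=$(openclaw infer image generate --prompt "minimal test card" \
           --output ./test-card.png --json)
if echo "$result" | jq -e '.ok' >/dev/null; then
  echo "$result" | jq -r '.outputs[].path'   # files written by OpenClaw
else
  echo "$result" | jq '.error'               # shape beyond presence unspecified
fi
```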

    Common pitfalls

```bash
# Bad
openclaw infer media image generate --prompt "friendly lobster"
# Good
openclaw infer image generate --prompt "friendly lobster"
```

```bash
# Bad
openclaw infer audio transcribe --file ./memo.m4a --model whisper-1 --json
# Good
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json
```

    Notes

• `openclaw capability ...` is an alias for `openclaw infer ...`.

    Related

    • CLI reference
    • Models
