Technical reference for the OpenClaw framework.
OpenClaw agents can generate videos from text prompts, reference images, or existing videos. Sixteen provider backends are supported, each with different model options, input modes, and feature sets. The agent picks the right provider automatically based on your configuration and available API keys.
OpenClaw treats video generation as three runtime modes:
- `generate`
- `imageToVideo`
- `videoToVideo`

Providers can support any subset of those modes. The tool validates the active mode before submission and reports supported modes in `action=list`.

Set an API key for at least one provider, for example:

```bash
export GEMINI_API_KEY="your-key"
```
The agent calls `video_generate` automatically. No tool allowlisting is needed.
Video generation is asynchronous. When the agent calls `video_generate`, the tool creates a background task and the agent yields until the provider finishes. While a job is in flight, duplicate `video_generate` calls surface the existing task instead of starting a new one; check progress with `openclaw tasks list` and `openclaw tasks show <taskId>`. Outside of session-backed agent runs (for example, direct tool invocations), the tool falls back to inline generation and returns the final media path in the same turn.
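The lifecycle above can be modeled as a small in-memory registry. This is an illustrative sketch only; names such as `TaskRegistry` and `submit` are hypothetical and not OpenClaw internals:

```typescript
// Hypothetical sketch of the per-session async task lifecycle.
type TaskState = "queued" | "running" | "succeeded" | "failed";

interface VideoTask {
  id: string;
  state: TaskState;
  mediaPath?: string;
}

class TaskRegistry {
  // One in-flight video task per session.
  private inFlight = new Map<string, VideoTask>();

  // A duplicate call while a task is queued or running surfaces the
  // existing task instead of queuing another generation.
  submit(sessionId: string, id: string): VideoTask {
    const existing = this.inFlight.get(sessionId);
    if (existing && (existing.state === "queued" || existing.state === "running")) {
      return existing;
    }
    const task: VideoTask = { id, state: "queued" };
    this.inFlight.set(sessionId, task);
    return task;
  }

  // Provider callbacks move the task through its states.
  advance(sessionId: string, state: TaskState, mediaPath?: string): void {
    const task = this.inFlight.get(sessionId);
    if (task) {
      task.state = state;
      if (mediaPath !== undefined) task.mediaPath = mediaPath;
    }
  }
}
```

Once a task reaches `succeeded` or `failed`, a new submission for the same session is accepted again.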
Generated video files are saved under OpenClaw-managed media storage when the provider returns bytes. The default generated-video save cap follows the video media limit, and `agents.defaults.mediaMaxMb` overrides it when set.

| State | Meaning |
|---|---|
| `queued` | Task created, waiting for the provider to accept it. |
| `running` | Provider is processing (typically 30 seconds to 5 minutes depending on provider and resolution). |
| `succeeded` | Video ready; the agent wakes and posts it to the conversation. |
| `failed` | Provider error or timeout; the agent wakes with error details. |
Check status from the CLI:
```bash
openclaw tasks list
openclaw tasks show <taskId>
openclaw tasks cancel <taskId>
```
If a video task is already `queued` or `running`, a new `video_generate` call with `action: "status"` reports its state without starting another generation.

| Provider | Default model | Text | Image ref | Video ref | Auth |
|---|---|---|---|---|---|
| Alibaba | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `MODELSTUDIO_API_KEY` |
| BytePlus (1.0) | `seedance-1-0-pro-250528` | ✓ | Up to 2 images (I2V models only; first + last frame) | — | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 1.5 | `seedance-1-5-pro-251215` | ✓ | Up to 2 images (first + last frame via role) | — | `BYTEPLUS_API_KEY` |
| BytePlus Seedance 2.0 | `dreamina-seedance-2-0-260128` | ✓ | Up to 9 reference images | Up to 3 videos | `BYTEPLUS_API_KEY` |
| ComfyUI | `workflow` | ✓ | 1 image | — | `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` |
| DeepInfra | `Pixverse/Pixverse-T2V` | ✓ | — | — | `DEEPINFRA_API_KEY` |
| fal | `fal-ai/minimax/video-01-live` | ✓ | 1 image; up to 9 with Seedance reference-to-video | Up to 3 videos with Seedance reference-to-video | `FAL_KEY` |
| Google | `veo-3.1-fast-generate-preview` | ✓ | 1 image | 1 video | `GEMINI_API_KEY` |
| MiniMax | `MiniMax-Hailuo-2.3` | ✓ | 1 image | — | `MINIMAX_API_KEY` |
| OpenAI | `sora-2` | ✓ | 1 image | 1 video | `OPENAI_API_KEY` |
| OpenRouter | `google/veo-3.1-fast` | ✓ | Up to 4 images (first/last frame or references) | — | `OPENROUTER_API_KEY` |
| Qwen | `wan2.6-t2v` | ✓ | Yes (remote URL) | Yes (remote URL) | `QWEN_API_KEY` |
| Runway | `gen4.5` | ✓ | 1 image | 1 video | `RUNWAYML_API_SECRET` |
| Together | `Wan-AI/Wan2.2-T2V-A14B` | ✓ | 1 image | — | `TOGETHER_API_KEY` |
| Vydra | `veo3` | ✓ | 1 image (`kling`) | — | `VYDRA_API_KEY` |
| xAI | `grok-imagine-video` | ✓ | 1 first-frame image or up to 7 `reference_image` inputs | 1 video | `XAI_API_KEY` |
Some providers accept additional or alternate API key env vars. See individual provider pages for details.
Run `video_generate action=list` to see available providers, models, and their capabilities. The explicit mode contract used by `video_generate`:

| Provider | `generate` | `imageToVideo` | `videoToVideo` | Shared live lanes today |
|---|---|---|---|---|
| Alibaba | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo`, `http(s)` |
| BytePlus | ✓ | ✓ | — | `generate`, `imageToVideo` |
| ComfyUI | ✓ | ✓ | — | Not in the shared sweep; workflow-specific coverage lives with Comfy tests |
| DeepInfra | ✓ | — | — | `generate` |
| fal | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo` |
| Google | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo` |
| MiniMax | ✓ | ✓ | — | `generate`, `imageToVideo` |
| OpenAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo` |
| OpenRouter | ✓ | ✓ | — | `generate`, `imageToVideo` |
| Qwen | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo`, `http(s)` |
| Runway | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo`, `runway/gen4_aleph` |
| Together | ✓ | ✓ | — | `generate`, `imageToVideo` |
| Vydra | ✓ | ✓ | — | `generate`, `imageToVideo`, `veo3`, `kling` |
| xAI | ✓ | ✓ | ✓ | `generate`, `imageToVideo`, `videoToVideo` |
Supported resolution values are `480P`, `720P`, `768P`, `1080P`, and `adaptive`. When a model does not support the requested resolution, the request falls back to `adaptive` and the skipped value is reported in `details.ignoredOverrides` (for example, on `runway/gen4.5`).
Reference inputs select the runtime mode:
- No reference inputs: `generate`
- Image references: `imageToVideo`
- Video references: `videoToVideo`
- Audio references: accepted only where `maxInputAudios` allows them

Mixed image and video references are not a stable shared capability surface. Prefer one reference type per request.
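The selection rule can be written as a small pure function. This is a minimal sketch with illustrative names (`pickMode`, `ReferenceInputs`), not the actual tool surface:

```typescript
// Hypothetical sketch: reference inputs select the runtime mode.
type VideoMode = "generate" | "imageToVideo" | "videoToVideo";

interface ReferenceInputs {
  images?: string[];
  videos?: string[];
}

function pickMode(refs: ReferenceInputs): VideoMode {
  const hasImages = (refs.images?.length ?? 0) > 0;
  const hasVideos = (refs.videos?.length ?? 0) > 0;
  // Mixing image and video references is not a stable shared capability,
  // so this sketch rejects it outright.
  if (hasImages && hasVideos) {
    throw new Error("Prefer one reference type per request");
  }
  if (hasVideos) return "videoToVideo";
  if (hasImages) return "imageToVideo";
  return "generate";
}
```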
Some capability checks are applied at the fallback layer rather than the tool boundary, so a request that exceeds the primary provider's limits can still run on a capable fallback:
- A `maxInputAudios` of `0` rejects audio inputs.
- `maxDurationSeconds` and `supportedDurationSeconds` constrain the requested `durationSeconds`.
- `providerOptions` are checked against the declared `capabilities.providerOptions`; a provider with `capabilities.providerOptions: {}` accepts none.

The first skip reason in a request logs at `warn`; subsequent ones log at `debug`.

| Action | What it does |
|---|---|
| `generate` | Default. Create a video from the given prompt and optional reference inputs. |
| `status` | Check the state of the in-flight video task for the current session without starting another generation. |
| `list` | Show available providers, models, and their capabilities. |
OpenClaw resolves the model in this order:
1. The explicit `model` parameter on the request
2. `videoGenerationModel.primary`
3. `videoGenerationModel.fallbacks`, in order

If a provider fails, the next candidate is tried automatically. If all candidates fail, the error includes details from each attempt.
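The resolution order amounts to building an ordered, de-duplicated candidate list. `candidateModels` below is an illustrative helper under that assumption, not OpenClaw's real internals:

```typescript
// Hypothetical sketch of model-candidate resolution order:
// explicit request model, then configured primary, then fallbacks.
interface VideoModelConfig {
  primary?: string;
  fallbacks?: string[];
}

function candidateModels(
  requested: string | undefined,
  cfg: VideoModelConfig,
): string[] {
  const ordered = [requested, cfg.primary, ...(cfg.fallbacks ?? [])];
  // Drop empty entries and duplicates while preserving order,
  // so each provider/model pair is attempted at most once.
  const seen = new Set<string>();
  const result: string[] = [];
  for (const m of ordered) {
    if (m && !seen.has(m)) {
      seen.add(m);
      result.push(m);
    }
  }
  return result;
}
```

Each candidate is tried in turn until one succeeds or the list is exhausted.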
Set `agents.defaults.mediaGenerationAutoProviderFallback: false` to disable automatic provider fallback; only the explicit `model` or the configured `primary` is then used, and `fallbacks` are ignored.

```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "google/veo-3.1-fast-generate-preview",
        fallbacks: ["runway/gen4.5", "qwen/wan2.6-t2v"],
      },
    },
  },
}
```
The shared video-generation contract supports mode-specific capabilities instead of only flat aggregate limits. New provider implementations should prefer explicit mode blocks:
```typescript
capabilities: {
  generate: {
    maxVideos: 1,
    maxDurationSeconds: 10,
    supportsResolution: true,
  },
  imageToVideo: {
    enabled: true,
    maxVideos: 1,
    maxInputImages: 1,
    maxInputImagesByModel: { "provider/reference-to-video": 9 },
    maxDurationSeconds: 5,
  },
  videoToVideo: {
    enabled: true,
    maxVideos: 1,
    maxInputVideos: 1,
    maxDurationSeconds: 5,
  },
}
```
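Under a shape like that, validating a request against the active mode block reduces to a few bounds checks. The following is a sketch with hypothetical names (`ModeCaps`, `checkRequest`), not the shipped validator:

```typescript
// Hypothetical sketch: validate a request against one mode's capability block.
interface ModeCaps {
  enabled?: boolean;
  maxInputImages?: number;
  maxInputVideos?: number;
  maxDurationSeconds?: number;
}

interface VideoRequest {
  images?: string[];
  videos?: string[];
  durationSeconds?: number;
}

// Returns the first violation, or null when the mode block accepts the request.
function checkRequest(caps: ModeCaps, req: VideoRequest): string | null {
  if (caps.enabled === false) return "mode disabled";
  if (caps.maxInputImages !== undefined && (req.images?.length ?? 0) > caps.maxInputImages) {
    return "too many input images";
  }
  if (caps.maxInputVideos !== undefined && (req.videos?.length ?? 0) > caps.maxInputVideos) {
    return "too many input videos";
  }
  if (caps.maxDurationSeconds !== undefined && (req.durationSeconds ?? 0) > caps.maxDurationSeconds) {
    return "duration exceeds cap";
  }
  return null;
}
```

A failed check on the primary provider does not necessarily fail the request: a capable fallback provider may still accept it.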
Flat aggregate fields such as `maxInputImages` and `maxInputVideos` are still honored when a `generate`, `imageToVideo`, or `videoToVideo` block does not declare its own limit; `video_generate` prefers the mode-specific value when both are present. When one model in a provider has wider reference-input support than the rest, use `maxInputImagesByModel`, `maxInputVideosByModel`, or `maxInputAudiosByModel`.

Opt-in live coverage for the shared bundled providers:
```bash
OPENCLAW_LIVE_TEST=1 pnpm test:live -- extensions/video-generation-providers.live.test.ts
```
Repo wrapper:
```bash
pnpm test:live:media video
```
This live file loads missing provider env vars from `~/.profile` and runs the `generate` lane by default. The per-provider timeout is controlled by `OPENCLAW_LIVE_VIDEO_GENERATION_TIMEOUT_MS` (default `180000`). FAL is opt-in because provider-side queue latency can dominate release time:
```bash
pnpm test:live:media video --video-providers fal
```
Set `OPENCLAW_LIVE_VIDEO_GENERATION_FULL_MODES=1` to also exercise `imageToVideo` (where `capabilities.imageToVideo.enabled` is set) and `videoToVideo` (where `capabilities.videoToVideo.enabled` is set). Today the shared `videoToVideo` lane for `runway` runs on `runway/gen4_aleph`.

Set the default video-generation model in your OpenClaw config:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "qwen/wan2.6-t2v",
        fallbacks: ["qwen/wan2.6-r2v-flash"],
      },
    },
  },
}
```
Or via the CLI:
```bash
openclaw config set agents.defaults.videoGenerationModel.primary "qwen/wan2.6-t2v"
```