Talk mode
Talk mode is a continuous voice conversation loop:
- Listen for speech
- Send transcript to the model (main session, chat.send)
- Wait for the response
- Speak it via the configured Talk provider
Behavior (macOS)
- Always-on overlay while Talk mode is enabled.
- Listening → Thinking → Speaking phase transitions.
- On a short pause (silence window), the current transcript is sent.
- Replies are written to WebChat (same as typing).
- Interrupt on speech (default on): if the user starts talking while the assistant is speaking, we stop playback and note the interruption timestamp for the next prompt.
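The phase transitions and the interrupt rule above can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions, not the app's implementation: the `TalkLoop` class, its method names, and the choice to ignore speech while Thinking are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


@dataclass
class TalkLoop:
    """Hypothetical model of the Talk mode phases (illustration only)."""

    interrupt_on_speech: bool = True
    phase: Phase = Phase.LISTENING
    transcript: str = ""
    sent: List[str] = field(default_factory=list)
    interrupted_at: Optional[float] = None

    def on_speech(self, text: str, now: float) -> None:
        if self.phase is Phase.SPEAKING:
            if not self.interrupt_on_speech:
                return  # interruption disabled: keep speaking
            # Stop playback and remember when the user cut in.
            self.interrupted_at = now
            self.phase = Phase.LISTENING
        if self.phase is Phase.LISTENING:
            self.transcript = (self.transcript + " " + text).strip()
        # Assumption: speech heard while Thinking is ignored in this sketch.

    def on_silence(self) -> None:
        # A short pause ends the utterance and sends the transcript.
        if self.phase is Phase.LISTENING and self.transcript:
            self.sent.append(self.transcript)
            self.transcript = ""
            self.phase = Phase.THINKING

    def on_reply(self) -> None:
        if self.phase is Phase.THINKING:
            self.phase = Phase.SPEAKING

    def on_playback_finished(self) -> None:
        if self.phase is Phase.SPEAKING:
            self.phase = Phase.LISTENING
```

In this sketch, `interrupted_at` stands in for the interruption timestamp that is carried into the next prompt; how the real client attaches it is not specified here.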
Voice directives in replies
The assistant may prefix its reply with a single JSON line to control voice:
{ "voice": "<voice-id>", "once": true }
Rules:
- First non-empty line only.
- Unknown keys are ignored.
- "once": true applies to the current reply only.
- Without "once", the voice becomes the new default for Talk mode.
- The JSON line is stripped before TTS playback.
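The rules above can be illustrated with a short parser. This is a hedged sketch rather than the actual client code: `split_voice_directive` and `VoiceState` are hypothetical names, and treating any JSON object on the first non-empty line as a directive is an assumption of this example.

```python
import json
from typing import Optional, Tuple


def split_voice_directive(reply: str) -> Tuple[Optional[dict], str]:
    """Return (directive, text_for_tts); directive is None when absent."""
    lines = reply.splitlines()
    for index, line in enumerate(lines):
        if not line.strip():
            continue  # skip leading blank lines
        # Only the first non-empty line may carry a directive.
        try:
            parsed = json.loads(line)
        except json.JSONDecodeError:
            return None, reply
        if not isinstance(parsed, dict):
            return None, reply
        # Strip the JSON line so it is never spoken.
        remaining = "\n".join(lines[:index] + lines[index + 1:]).strip()
        return parsed, remaining
    return None, reply


class VoiceState:
    """Tracks the default Talk voice across replies (hypothetical helper)."""

    def __init__(self, default_voice: str) -> None:
        self.default_voice = default_voice

    def voice_for_reply(self, directive: Optional[dict]) -> str:
        # Unknown keys are simply never read, so they are ignored.
        if not directive or "voice" not in directive:
            return self.default_voice
        voice = directive["voice"]
        if directive.get("once") is not True:
            self.default_voice = voice  # no "once": becomes the new default
        return voice
```

A reply such as `{"voice": "v2", "once": true}` followed by text would be spoken with `v2` while the stored default stays unchanged; the same directive without `"once"` would also update the default.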
Supported keys:
- / /
- / /
- , (WPM), , , ,
- , , , ,
Config (~/.openclaw/openclaw.json)
{
  talk: {
    provider: "elevenlabs",
    providers: {
      elevenlabs: {
        voiceId: "elevenlabs_voice_id",
        modelId: "eleven_v3",
        outputFormat: "mp3_44100_128",
        apiKey: "elevenlabs_api_key",
      },
      mlx: {
        modelId: "mlx-community/Soprano-80M-bf16",
      },
      system: {},
    },
    speechLocale: "ru-RU",
    silenceTimeoutMs: 1500,
    interruptOnSpeech: true,
  },
}
Defaults:
- interruptOnSpeech: true
- silenceTimeoutMs: when unset, Talk keeps the platform default pause window before sending the transcript (700 ms on macOS and Android, 900 ms on iOS)
- provider: selects the active Talk provider. Use elevenlabs, mlx, or system for the macOS-local playback paths.
- providers.<provider>.voiceId: falls back to / for ElevenLabs (or first ElevenLabs voice when API key is available).
- providers.elevenlabs.modelId: defaults to when unset.
- providers.mlx.modelId: defaults to mlx-community/Soprano-80M-bf16 when unset.
- providers.elevenlabs.apiKey: falls back to (or gateway shell profile if available).
- speechLocale: optional BCP 47 locale id for on-device Talk speech recognition on iOS/macOS. Leave unset to use the device default.
- : defaults to on macOS/iOS and on Android (set to force MP3 streaming)
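As an illustration of how the pause-window and interrupt defaults above might be resolved, here is a minimal sketch. The function names and the platform strings are hypothetical; only the 700 ms / 900 ms values and the `true` default come from this page.

```python
from typing import Any, Dict

# Platform default pause windows, taken from the Defaults list above.
PLATFORM_DEFAULT_SILENCE_MS = {"macos": 700, "android": 700, "ios": 900}


def resolve_silence_timeout_ms(talk_config: Dict[str, Any], platform: str) -> int:
    """Use talk.silenceTimeoutMs when set, else the platform default."""
    configured = talk_config.get("silenceTimeoutMs")
    if configured is not None:
        return int(configured)
    return PLATFORM_DEFAULT_SILENCE_MS[platform]


def resolve_interrupt_on_speech(talk_config: Dict[str, Any]) -> bool:
    """talk.interruptOnSpeech defaults to true when unset."""
    return bool(talk_config.get("interruptOnSpeech", True))
```

With the example config shown earlier (`silenceTimeoutMs: 1500`), every platform would wait 1500 ms; removing the key would restore the per-platform window.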
macOS UI
- Menu bar toggle: Talk
- Config tab: Talk Mode group (voice id + interrupt toggle)
- Overlay:
  - Listening: cloud pulses with mic level
  - Thinking: sinking animation
  - Speaking: radiating rings
  - Click cloud: stop speaking
  - Click X: exit Talk mode
Android UI
- Voice tab toggle: Talk
- Manual Mic and Talk are mutually exclusive runtime capture modes.
- Manual Mic stops when the app leaves the foreground or the user leaves the Voice tab.
- Talk Mode keeps running until toggled off or the Android node disconnects, and uses Android's microphone foreground-service type while active.
Notes
- Requires Speech + Microphone permissions.
- Uses chat.send against session key main.
- The gateway resolves Talk playback through using the active Talk provider. Android falls back to local system TTS only when that RPC is unavailable.
- macOS local MLX playback uses the bundled helper when present, or an executable on . Set to point at a custom helper binary during development.
- for is validated to , , or ; other models accept .
- is validated to when set.
- Android supports , , , and output formats for low-latency AudioTrack streaming.