
PDF tool

`pdf` analyzes one or more PDF documents and returns text.

Quick behavior:

Native provider mode for Anthropic and Google model providers.
Extraction fallback mode for other providers (extract text first, then page images when needed).
Supports single (`pdf`) or multi (`pdfs`) input, max 10 PDFs per call.

Availability

The tool is only registered when OpenClaw can resolve a PDF-capable model config for the agent:

`agents.defaults.pdfModel`
fallback to `agents.defaults.imageModel`
fallback to the agent's resolved session/default model
if native-PDF providers are auth-backed, prefer them ahead of generic image fallback candidates

If no usable model can be resolved, the `pdf` tool is not exposed.

Availability notes:

The fallback chain is auth-aware. A configured `provider/model` only counts if OpenClaw can actually authenticate that provider for the agent.
Native PDF providers are currently Anthropic and Google.
If the resolved session/default provider already has a configured vision/PDF model, the PDF tool reuses that before falling back to other auth-backed providers.
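
The resolution order described above can be pictured with a minimal sketch. Everything here is hypothetical and not OpenClaw's actual API: the `PdfModelDefaults` shape (in particular, that `imageModel` has the same `primary`/`fallbacks` layout as `pdfModel`), the `resolvePdfModel` name, and the `canAuthenticate` callback are all assumptions. The sketch also leaves out the preference for native-PDF providers and the session-provider reuse rule, and only shows the auth-aware "first usable candidate wins" idea.

```typescript
// Hypothetical types: none of these names come from OpenClaw itself.
type ModelRef = string; // "provider/model"

interface ModelChain {
  primary: ModelRef;
  fallbacks?: ModelRef[];
}

interface PdfModelDefaults {
  pdfModel?: ModelChain;
  imageModel?: ModelChain; // assumed to share the pdfModel shape
}

// Everything before the first "/" is treated as the provider id.
function providerOf(ref: ModelRef): string {
  const slash = ref.indexOf("/");
  return slash === -1 ? ref : ref.slice(0, slash);
}

// Returns the first candidate whose provider can be authenticated,
// or undefined, in which case the pdf tool would not be exposed.
function resolvePdfModel(
  defaults: PdfModelDefaults,
  sessionModel: ModelRef | undefined,
  canAuthenticate: (provider: string) => boolean,
): ModelRef | undefined {
  const candidates: ModelRef[] = [];
  for (const chain of [defaults.pdfModel, defaults.imageModel]) {
    if (chain) candidates.push(chain.primary, ...(chain.fallbacks ?? []));
  }
  if (sessionModel) candidates.push(sessionModel);
  return candidates.find((ref) => canAuthenticate(providerOf(ref)));
}
```

With only an OpenAI credential available, a `pdfModel` whose primary is an Anthropic model would be skipped in this sketch and the first OpenAI fallback returned instead.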

Input reference

`pdf`: One PDF path or URL.
`pdfs`: Multiple PDF paths or URLs, up to 10 total.
`prompt`: Analysis prompt.
`pages`: Page filter like `1-5` or `1,3,7-9`.
`model`: Optional model override in `provider/model` form.
`maxBytesMb`: Per-PDF size cap in MB. Defaults to `agents.defaults.pdfMaxBytesMb` or `10`.

Input notes:

`pdf` and `pdfs` are merged and deduplicated before loading.
If no PDF input is provided, the tool errors.
`pages` is parsed as 1-based page numbers, deduped, sorted, and clamped to the configured max pages.
`maxBytesMb` defaults to `agents.defaults.pdfMaxBytesMb` or `10`.
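
As an illustration of the input notes above, here is a minimal sketch of merging `pdf` and `pdfs` and of parsing a `pages` filter. The helper names `mergePdfInputs` and `parsePages` are hypothetical, and the clamping rule is an assumption: this sketch reads "clamped to the configured max pages" as keeping at most `maxPages` entries after sorting, which may differ from what OpenClaw actually does.

```typescript
// Hypothetical helper: merge the single and multi inputs, dropping duplicates.
function mergePdfInputs(pdf?: string, pdfs: string[] = []): string[] {
  const merged = Array.from(new Set([...(pdf ? [pdf] : []), ...pdfs]));
  if (merged.length === 0) {
    throw new Error("pdf required: provide a path or URL to a PDF document");
  }
  return merged;
}

// Hypothetical helper: parse a filter such as "1-3,7" into 1-based page numbers.
function parsePages(spec: string, maxPages: number): number[] {
  const pages = new Set<number>(); // the Set removes duplicates
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue;
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) throw new Error(`invalid page range: ${token}`);
    const start = Number(match[1]);
    const end = match[2] === undefined ? start : Number(match[2]);
    if (start < 1 || end < start) {
      throw new Error(`invalid page range: ${token}`);
    }
    for (let page = start; page <= end; page++) pages.add(page);
  }
  // Sort ascending, then keep at most maxPages entries (assumed clamping rule).
  return Array.from(pages)
    .sort((a, b) => a - b)
    .slice(0, maxPages);
}
```

Under these assumptions, `parsePages("7,1-3,2", 20)` yields `[1, 2, 3, 7]` and `parsePages("1-5", 3)` yields `[1, 2, 3]`.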

Supported PDF references

local file path (including `~` expansion)
`file://` URL
`http://` and `https://` URL
OpenClaw-managed inbound refs such as `media://inbound/<id>`

Reference notes:

Other URI schemes (for example `ftp://`) are rejected with `unsupported_pdf_reference`.
In sandbox mode, remote `http(s)` URLs are rejected.
With workspace-only file policy enabled, local file paths outside allowed roots are rejected.
Managed inbound refs and replayed paths under OpenClaw's inbound media store are allowed with workspace-only file policy.
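
The scheme rules above could be approximated as follows. This is a rough sketch only: `classifyPdfReference` and the `PdfRefKind` labels are hypothetical names, the treatment of `media://` refs outside `media://inbound/` as unsupported is an assumption, and the sandbox and workspace-only policies are not modeled.

```typescript
// Hypothetical labels for the supported reference kinds.
type PdfRefKind = "local" | "file-url" | "remote" | "inbound";

type PdfRefResult = PdfRefKind | { error: "unsupported_pdf_reference" };

function classifyPdfReference(ref: string): PdfRefResult {
  // A reference counts as a URL only if it starts with "<scheme>://".
  const scheme = /^([a-zA-Z][a-zA-Z0-9+.-]*):\/\//.exec(ref)?.[1]?.toLowerCase();
  if (scheme === undefined) return "local"; // plain path, possibly starting with "~"
  if (scheme === "file") return "file-url";
  if (scheme === "http" || scheme === "https") return "remote";
  // Assumption: only the inbound namespace of media:// is accepted.
  if (scheme === "media" && ref.startsWith("media://inbound/")) return "inbound";
  return { error: "unsupported_pdf_reference" };
}
```

A caller would then apply the policy checks (sandbox mode, allowed roots) on top of the returned kind before loading any bytes.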

Execution modes

Native provider mode

Native mode is used for providers `anthropic` and `google`. The tool sends raw PDF bytes directly to provider APIs.

Native mode limits:

`pages` is not supported. If set, the tool returns an error.
Multi-PDF input is supported; each PDF is sent as a native document block / inline PDF part before the prompt.

Extraction fallback mode

Fallback mode is used for non-native providers.

Flow:

Extract text from selected pages (up to `agents.defaults.pdfMaxPages`, default `20`).
If extracted text length is below `200` chars, render selected pages to PNG images and include them.
Send extracted content plus prompt to the selected model.

Fallback details:

Page image extraction uses a pixel budget of `4,000,000`.
If the target model does not support image input and there is no extractable text, the tool errors.
If text extraction succeeds but image extraction would require vision on a text-only model, OpenClaw drops the rendered images and continues with the extracted text.
Extraction fallback uses the bundled `document-extract` plugin. The plugin owns `pdfjs-dist`; `@napi-rs/canvas` is used only when image rendering fallback is available.
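
The decision between text and page images in fallback mode might look roughly like this. It is a sketch under stated assumptions, not the plugin's code: `planFallback`, `FallbackPlan`, and the error message are hypothetical, and the sketch assumes the 200-character threshold is measured on the raw extracted string.

```typescript
const MIN_TEXT_CHARS = 200; // threshold from the flow above

// Hypothetical description of what gets sent to the model.
interface FallbackPlan {
  includeText: boolean;
  includeImages: boolean;
}

function planFallback(
  extractedText: string,
  modelSupportsImages: boolean,
): FallbackPlan {
  const hasText = extractedText.trim().length > 0;
  const wantsImages = extractedText.length < MIN_TEXT_CHARS;

  if (wantsImages && !modelSupportsImages) {
    // Text-only model: fail only when there is nothing to send at all.
    if (!hasText) {
      throw new Error("no extractable text and the model does not accept images");
    }
    // Otherwise drop the rendered images and continue with the text.
    return { includeText: true, includeImages: false };
  }
  return { includeText: hasText, includeImages: wantsImages };
}
```

For a scanned PDF with no text layer, this plan would send images to a vision-capable model and raise an error for a text-only one.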

Config


```json5
{
  agents: {
    defaults: {
      pdfModel: {
        primary: "anthropic/claude-opus-4-6",
        fallbacks: ["openai/gpt-5.4-mini"],
      },
      pdfMaxBytesMb: 10,
      pdfMaxPages: 20,
    },
  },
}
```

See Configuration Reference for full field details.

Output details

The tool returns text in `content[0].text` and structured metadata in `details`.

Common `details` fields:

`model`: resolved model ref (`provider/model`)
`native`: `true` for native provider mode, `false` for fallback
`attempts`: fallback attempts that failed before success

Path fields:

single PDF input: `details.pdf`
multiple PDF inputs: `details.pdfs[]` with `pdf` entries
sandbox path rewrite metadata (when applicable): `rewrittenFrom`

Error behavior

Missing PDF input: throws `pdf required: provide a path or URL to a PDF document`
Too many PDFs: returns structured error in `details.error = "too_many_pdfs"`
Unsupported reference scheme: returns `details.error = "unsupported_pdf_reference"`
Native mode with `pages`: throws clear `pages is not supported with native PDF providers` error
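
To show how a caller might consume the output and the structured errors above, here is a hedged sketch. The `PdfToolResult` and `PdfToolDetails` interfaces are a guess pieced together from the fields named on this page; the `type: "text"` tag, the element type of `attempts`, and the placement of `rewrittenFrom` on `pdfs[]` entries are assumptions, and `summarizeResult` is a hypothetical helper.

```typescript
// Assumed shapes, reconstructed from the field names listed on this page.
interface PdfToolDetails {
  model?: string; // "provider/model"
  native?: boolean;
  attempts?: unknown[];
  pdf?: string;
  pdfs?: Array<{ pdf: string; rewrittenFrom?: string }>;
  error?: string;
}

interface PdfToolResult {
  content: Array<{ type: "text"; text: string }>;
  details: PdfToolDetails;
}

// Hypothetical helper: turn a result into a one-line summary for logs.
function summarizeResult(result: PdfToolResult): string {
  const { details } = result;
  if (details.error === "too_many_pdfs") {
    return "Pass at most 10 PDFs per call.";
  }
  if (details.error === "unsupported_pdf_reference") {
    return "Use a local path, file://, http(s)://, or media://inbound/<id> reference.";
  }
  if (details.error) return `PDF tool error: ${details.error}`;
  const mode = details.native ? "native" : "extraction fallback";
  const text = result.content[0]?.text ?? "";
  return `[${details.model ?? "unknown model"}, ${mode}] ${text}`;
}
```

The thrown errors (missing input, `pages` with a native provider) would surface as exceptions rather than through `details.error`, so a caller would wrap the tool invocation in `try`/`catch` as well.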

Examples

Single PDF:


```json
{
  "pdf": "/tmp/report.pdf",
  "prompt": "Summarize this report in 5 bullets"
}
```

Multiple PDFs:


```json
{
  "pdfs": ["/tmp/q1.pdf", "/tmp/q2.pdf"],
  "prompt": "Compare risks and timeline changes across both documents"
}
```

Page-filtered fallback model:


```json
{
  "pdf": "https://example.com/report.pdf",
  "pages": "1-3,7",
  "model": "openai/gpt-5.4-mini",
  "prompt": "Extract only customer-impacting incidents"
}
```

Related

Tools Overview: all available agent tools
Configuration Reference: `pdfMaxBytesMb` and `pdfMaxPages` config
