Commit Graph

7 Commits

Author SHA1 Message Date
Peter Steinberger
ab8b8dae70 refactor(agents): dedupe model and tool test helpers 2026-03-02 21:31:36 +00:00
Peter Steinberger
4b50018406 fix: restore helper imports and plugin hook test exports 2026-03-02 19:57:33 +00:00
Peter Steinberger
7003615972 fix: resolve rebase conflict markers 2026-03-02 19:57:33 +00:00
Peter Steinberger
9617ac9dd5 refactor: dedupe agent and reply runtimes 2026-03-02 19:57:33 +00:00
justinhuangcode
14baadda2c fix(tools): honor fsPolicy.workspaceOnly in image/pdf tool localRoots
PR #28822 fixed the Write/Edit tools to respect `tools.fs.workspaceOnly`,
but the image and PDF tools still unconditionally include default local
roots (`~/.openclaw/media`, `~/.openclaw/agents`, etc.) when computing
the `localRoots` allowlist for non-sandbox mode.

When `fsPolicy.workspaceOnly` is true, restrict `localRoots` to only the
workspace directory so that files outside the workspace are rejected by
`assertLocalMediaAllowed()`.

Relates to #31716

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 19:24:33 +00:00
Peter Steinberger
c3948800f4 refactor(agents): extract shared tool model helpers 2026-03-02 07:13:11 +00:00
Tyler Yust
d0ac1b0195 feat: add PDF analysis tool with native provider support (#31319)
* feat: add PDF analysis tool with native provider support

New `pdf` tool for analyzing PDF documents with model-powered analysis.

Architecture:
- Native PDF path: sends raw PDF bytes directly to providers that support
  inline document input (Anthropic via DocumentBlockParam, Google Gemini
  via inlineData with application/pdf MIME type)
- Extraction fallback: for providers without native PDF support, extracts
  text via pdfjs-dist and rasterizes pages to images via @napi-rs/canvas,
  then sends through the standard vision/text completion path

Key features:
- Single PDF (`pdf` param) or multiple PDFs (`pdfs` array, up to 10)
- Page range selection (`pages` param, e.g. "1-5", "1,3,7-9")
- Model override (`model` param) and file size limits (`maxBytesMb`)
- Auto-detects provider capability and falls back gracefully
- Same security patterns as image tool (SSRF guards, sandbox support,
  local path roots, workspace-only policy)

Config (agents.defaults):
- pdfModel: primary/fallbacks (defaults to imageModel, then session model)
- pdfMaxBytesMb: max PDF file size (default: 10)
- pdfMaxPages: max pages to process (default: 20)

Model catalog:
- Extended ModelInputType to include "document" alongside "text"/"image"
- Added modelSupportsDocument() capability check

Files:
- src/agents/tools/pdf-tool.ts - main tool factory
- src/agents/tools/pdf-tool.helpers.ts - helpers (page range, config, etc.)
- src/agents/tools/pdf-native-providers.ts - direct API calls for Anthropic/Google
- src/agents/tools/pdf-tool.test.ts - 43 tests covering all paths
- Modified: model-catalog.ts, openclaw-tools.ts, config schema/types/labels/help

* fix: prepare pdf tool for merge (#31319) (thanks @tyler6204)
2026-03-01 22:39:12 -08:00