Commit Graph

65 Commits

Author SHA1 Message Date
Frank Yang
d68d4362ee fix(context-pruning): cover image-only tool-result pruning 2026-03-11 18:07:37 +08:00
MoerAI
a78674f115 fix(context-pruning): prune image-containing tool results instead of skipping them (#41789) 2026-03-11 18:07:37 +08:00
Vincent Koc
e4d80ed556 CI: restore main detect-secrets scan (#38438)
* Tests: stabilize detect-secrets fixtures

* Tests: fix rebased detect-secrets false positives

* Docs: keep snippets valid under detect-secrets

* Tests: finalize detect-secrets false-positive fixes

* Tests: reduce detect-secrets false positives

* Tests: keep detect-secrets pragmas inline

* Tests: remediate next detect-secrets batch

* Tests: tighten detect-secrets allowlists

* Tests: stabilize detect-secrets formatter drift
2026-03-07 10:06:35 -08:00
Rodrigo Uroz
4c0b873a4d Config/Compaction: expose safeguard preserve and quality settings (#25557)
Merged via squash.

Prepared head SHA: ea9904039a35d8d9ced55cd6d1c459a46666954d
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-07 07:13:13 -08:00
Rodrigo Uroz
036c329716 Compaction/Safeguard: add summary quality audit retries (#25556)
Merged via squash.

Prepared head SHA: be473efd1635616ebbae6e649d542ed50b4a827f
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-05 13:39:25 -08:00
Sid
463fd4735e fix(agents): guard context pruning against malformed thinking blocks (#35146)
Merged via squash.

Prepared head SHA: a196a565b1b8e806ffbf85172bcf1128796b45a2
Co-authored-by: Sid-Qin <201593046+Sid-Qin@users.noreply.github.com>
Co-authored-by: shakkernerd <165377636+shakkernerd@users.noreply.github.com>
Reviewed-by: @shakkernerd
2026-03-05 05:52:24 +00:00
青雲
96021a2b17 fix: align AGENTS.md template section names with post-compaction extraction (#25029) (#25098)
Merged via squash.

Prepared head SHA: 8cd6cc8049aab5a94d8a9d5fb08f2e792c4ac5fd
Co-authored-by: echoVic <16428813+echoVic@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-04 12:16:00 -08:00
Rodrigo Uroz
df0f2e349f Compaction/Safeguard: require structured summary headings (#25555)
Merged via squash.

Prepared head SHA: 0b1df34806a7b788261290be55760fd89220de53
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-04 10:54:42 -08:00
Rodrigo Uroz
c8b45a4c5c Compaction/Safeguard: preserve recent turns verbatim (#25554)
Merged via squash.

Prepared head SHA: 7fb33c411c4aaea2795e490fcd0e647cf7ea6fb8
Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
2026-03-03 07:00:49 -08:00
Peter Steinberger
70db52de71 test(agents): centralize AgentMessage fixtures and remove unsafe casts 2026-03-03 02:14:15 +00:00
Peter Steinberger
61adcea68e fix(test): tighten tool result typing in context pruning tests 2026-03-03 01:18:29 +00:00
Peter Steinberger
1b4062defd refactor(tests): dedupe pi embedded test harness 2026-03-03 01:15:09 +00:00
Peter Steinberger
fd3ca8a34c refactor: dedupe agent and browser cli helpers 2026-03-03 00:15:00 +00:00
Peter Steinberger
ab8b8dae70 refactor(agents): dedupe model and tool test helpers 2026-03-02 21:31:36 +00:00
Peter Steinberger
07b16d5ad0 fix(security): harden workspace bootstrap boundary reads 2026-03-02 17:07:36 +00:00
Rodrigo Uroz
0fe6cf06b2 Compaction: preserve opaque identifiers in summaries (openclaw#25553) thanks @rodrigouroz
Verified:
- pnpm install --frozen-lockfile
- pnpm build
- pnpm check
- pnpm test:macmini

Co-authored-by: rodrigouroz <384037+rodrigouroz@users.noreply.github.com>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
2026-02-27 08:14:05 -06:00
Peter Steinberger
0ec7711bc2 fix(agents): harden compaction and reset safety
Co-authored-by: jaden-clovervnd <91520439+jaden-clovervnd@users.noreply.github.com>
Co-authored-by: Sid <201593046+Sid-Qin@users.noreply.github.com>
Co-authored-by: Marcus Widing <245375637+widingmarcus-cyber@users.noreply.github.com>
2026-02-26 17:41:24 +01:00
Peter Steinberger
75423a00d6 refactor: deduplicate shared helpers and test setup 2026-02-23 20:40:44 +00:00
DukeDeSouth
ea47ab29bd fix: cancel compaction instead of truncating history when summarization fails (#10711)
* fix: cancel compaction instead of truncating history when summarization fails

When the compaction safeguard cannot generate a summary (no model, no API
key, or LLM error), it previously returned a "Summary unavailable" fallback
string and still truncated history. This caused irreversible data loss -
older messages were discarded even though no meaningful summary was produced.

Now returns `{ cancel: true }` in all three failure paths so the framework
aborts compaction entirely and preserves the full conversation history.

Fixes #10332

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix: use deterministic timestamps in compaction safeguard tests

Replace Date.now() with fixed timestamp (0) in test data to prevent
nondeterministic behavior in snapshot-based or order-dependent tests.

Co-authored-by: Cursor <cursoragent@cursor.com>

* Changelog: note compaction cancellation safeguard fix

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:23:13 -05:00
Owen
01380f49f5 fix(compaction): pass model through runtime for safeguard summaries (#17864)
* fix(compaction): pass model through runtime to fix ctx.model undefined

Fixes #3479

Root cause: extensionRunner.initialize() is never called in compact.ts workflow,
leaving ctx.model undefined. Compaction safeguard checks ctx.model and returns
fallback summary immediately without attempting LLM summarization.

Changes:
1. Pass model through compaction safeguard runtime registry (same pattern as maxHistoryShare)
2. Fall back to runtime.model when ctx.model is undefined
3. Add once-per-session warning when both models are missing (prevents log spam)
4. Add regression test for runtime.model fallback

This follows the established runtime registry pattern rather than attempting to call
extensionRunner.initialize() (which is SDK-internal and not meant for direct access).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* test: add comprehensive tests for compaction-safeguard model fallback

Add integration tests to verify the model fallback behavior:
- Test runtime.model fallback when ctx.model is undefined (compact.ts workflow)
- Test fallback summary when both ctx.model and runtime.model are undefined
- Test contextWindowTokens runtime storage/retrieval
- Test combined runtime values (maxHistoryShare + contextWindowTokens + model)

These tests verify the fix for issue #3479 where compaction fails due to
ctx.model being undefined in the compact.ts workflow. The runtime registry
pattern allows model to be passed when extensionRunner.initialize() is not
called, ensuring summarization works in all code paths.

Related: PR #17864

* fix(test): adapt compaction-safeguard tests to upstream type changes

- Add baseUrl to Model mock objects (now required by Model<Api>)
- Add explicit Model<Api> annotation to prevent provider string widening
- Cast modelRegistry mock through unknown (ModelRegistry expanded)
- Use non-null assertion for compactionHandler (TypeScript strict)
- Type compaction result explicitly

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Compaction: add changelog credit for model fallback fix

* Update CHANGELOG.md

* Update CHANGELOG.md

---------

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
2026-02-23 10:14:21 -05:00
Peter Steinberger
d116bcfb14 refactor(runtime): consolidate followup, gateway, and provider dedupe paths 2026-02-22 14:08:51 +00:00
Peter Steinberger
adace58505 test: reclassify local helper suites out of agents e2e 2026-02-22 10:53:40 +00:00
Harry Cui Kepler
ffa63173e0 refactor(agents): migrate console.warn/error/info to subsystem logger (#22906)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: a806c4cb2700564096ce8980a8d7f839f8a0d388
Co-authored-by: Kepler2024 <166882517+Kepler2024@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-21 17:11:47 -05:00
Val Alexander
b703ea3675 fix: prevent compaction "prompt too long" errors (#22921)
* includes: prompt overhead in compaction safeguard calculation.

Subtracts SUMMARIZATION_OVERHEAD_TOKENS from maxChunkTokens in both the main summarization path and the dropped-messages summarization path.

This ensures the chunk budget leaves room for the prompt overhead that generateSummary wraps around each chunk.

* adds: budget for overhead tokens to use an effectiveMax instead of maxTokens naïvely.

- Added `SUMMARIZATION_OVERHEAD_TOKENS = 4096` — a budget for the tokens that `generateSummary` adds on top of the serialized conversation (system prompt, `<conversation>` tags, summarization instructions, `<previous-summary>` block, and reasoning: "high" thinking budget).
- `chunkMessagesByMaxTokens` now divides `maxTokens` by `SAFETY_MARGIN` (1.2) before comparing against estimated token counts. Previously, the safety margin was only used in `computeAdaptiveChunkRatio` and `isOversizedForSummary` but not in the actual chunking loop — so chunks could be built that fit the estimated budget but exceeded the real budget once the API tokenized them properly.
2026-02-21 14:42:18 -06:00
Peter Steinberger
85ebdf88b0 refactor(agents): share text block extraction helper 2026-02-18 18:25:25 +00:00
Peter Steinberger
31f83c86b2 refactor(test): dedupe agent harnesses and routing fixtures 2026-02-18 04:49:22 +00:00
Peter Steinberger
b8b43175c5 style: align formatting with oxfmt 0.33 2026-02-18 01:34:35 +00:00
Peter Steinberger
31f9be126c style: run oxfmt and fix gate failures 2026-02-18 01:29:02 +00:00
cpojer
688f86bf28 chore: Fix types in tests 43/N. 2026-02-17 15:50:07 +09:00
cpojer
d0cb8c19b2 chore: wtf. 2026-02-17 13:36:48 +09:00
Sebastian
ed11e93cf2 chore(format) 2026-02-16 23:20:16 -05:00
cpojer
90ef2d6bdf chore: Update formatting. 2026-02-17 09:18:40 +09:00
康熙
3296a25cc6 fix: format compaction-safeguard.ts with oxfmt 2026-02-17 00:00:20 +01:00
康熙
c4f829411f feat: append workspace critical rules to compaction summary
- Add readWorkspaceContextForSummary() to extract Session Startup + Red Lines from AGENTS.md
- Inject workspace context into compaction summary (limited to 2000 chars)
- Export extractSections() from post-compaction-context.ts for reuse
- Ensures compaction summary includes core rules needed for recovery

Part 1 of post-compaction context injection feature.
2026-02-17 00:00:20 +01:00
Peter Steinberger
f717a13039 refactor(agent): dedupe harness and command workflows 2026-02-16 14:59:30 +00:00
Peter Steinberger
fddf8a6f4a perf(test): fold pi extensions runtime registry tests into agents suite 2026-02-16 00:22:36 +00:00
Peter Steinberger
9084c4e345 refactor(pi): share session manager runtime registry 2026-02-15 17:39:21 +00:00
Peter Steinberger
60a7625f2a refactor(agents): share glob matcher 2026-02-14 15:39:44 +00:00
Peter Steinberger
9131b22a28 test: migrate suites to e2e coverage layout 2026-02-13 14:28:22 +00:00
Ayaan Zaidi
0992c5a809 fix: cap context window resolution (#6187) (thanks @iamEvanYT) 2026-02-01 19:52:56 +05:30
Evan
5d3c898a94 fix: update compaction safeguard to respect context window tokens 2026-02-01 19:52:56 +05:30
cpojer
f06dd8df06 chore: Enable "experimentalSortImports" in Oxfmt and reformat all imorts. 2026-02-01 10:03:47 +09:00
cpojer
5ceff756e1 chore: Enable "curly" rule to avoid single-statement if confusion/errors. 2026-01-31 16:19:20 +09:00
cpojer
15792b153f chore: Enable more lint rules, disable some that trigger a lot. Will clean up later. 2026-01-31 16:04:04 +09:00
jigar
dde9605874 Agents: summarize dropped messages during compaction safeguard pruning (#2418) 2026-01-26 20:35:08 -06:00
Peter Steinberger
ed05152cb1 fix: align compaction summary message types 2026-01-23 23:03:04 +00:00
Peter Steinberger
022aa10063 feat(compaction): apply staged pruning 2026-01-23 22:23:23 +00:00
Dave Lauer
d03c404cb4 feat(compaction): add adaptive chunk sizing, progressive fallback, and UI indicator (#1466)
* fix(ui): allow relative URLs in avatar validation

The isAvatarUrl check only accepted http://, https://, or data: URLs,
but the /avatar/{agentId} endpoint returns relative paths like /avatar/main.
This caused local file avatars to display as text instead of images.

Fixes avatar display for locally configured avatar files.

* fix(gateway): resolve local avatars to URL in HTML injection and RPC

The frontend fix alone wasn't enough because:
1. serveIndexHtml() was injecting the raw avatar filename into HTML
2. agent.identity.get RPC was returning raw filename, overwriting the
   HTML-injected value

Now both paths resolve local file avatars (*.png, *.jpg, etc.) to the
/avatar/{agentId} endpoint URL.

* feat(compaction): add adaptive chunk sizing and progressive fallback

- Add computeAdaptiveChunkRatio() to reduce chunk size for large messages
- Add isOversizedForSummary() to detect messages too large to summarize
- Add summarizeWithFallback() with progressive fallback:
  - Tries full summarization first
  - Falls back to partial summarization excluding oversized messages
  - Notes oversized messages in the summary output
- Add SAFETY_MARGIN (1.2x) buffer for token estimation inaccuracy
- Reduce MIN_CHUNK_RATIO to 0.15 for very large messages

This prevents compaction failures when conversations contain
unusually large tool outputs or responses that exceed the
summarization model's context window.

* feat(ui): add compaction indicator and improve event error handling

Compaction indicator:
- Add CompactionStatus type and handleCompactionEvent() in app-tool-stream.ts
- Show '🧹 Compacting context...' toast while active (with pulse animation)
- Show '🧹 Context compacted' briefly after completion
- Auto-clear toast after 5 seconds
- Add CSS styles for .callout.info, .callout.success, .compaction-indicator

Error handling improvements:
- Wrap onEvent callback in try/catch in gateway.ts to prevent errors
  from breaking the WebSocket message handler
- Wrap handleGatewayEvent in try/catch with console.error logging
  to isolate errors and make them visible in devtools

These changes address UI freezes during heavy agent activity by:
1. Showing users when compaction is happening
2. Preventing uncaught errors from silently breaking the event loop

* fix(control-ui): add agentId to DEFAULT_ASSISTANT_IDENTITY

TypeScript inferred the union type without agentId when falling back to
DEFAULT_ASSISTANT_IDENTITY, causing build errors at control-ui.ts:222-223.
2026-01-23 06:32:30 +00:00
Peter Steinberger
5689d7fb98 refactor: remove transcript sanitize extension 2026-01-23 01:34:33 +00:00
Peter Steinberger
db0235a26a fix: gate transcript sanitization by provider 2026-01-23 00:42:45 +00:00