* fix: cancel compaction instead of truncating history when summarization fails
When the compaction safeguard cannot generate a summary (no model, no API
key, or LLM error), it previously returned a "Summary unavailable" fallback
string and still truncated history. This caused irreversible data loss -
older messages were discarded even though no meaningful summary was produced.
Now returns `{ cancel: true }` in all three failure paths so the framework
aborts compaction entirely and preserves the full conversation history.
Fixes#10332
Co-authored-by: Cursor <cursoragent@cursor.com>
* fix: use deterministic timestamps in compaction safeguard tests
Replace Date.now() with fixed timestamp (0) in test data to prevent
nondeterministic behavior in snapshot-based or order-dependent tests.
Co-authored-by: Cursor <cursoragent@cursor.com>
* Changelog: note compaction cancellation safeguard fix
---------
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* fix(compaction): pass model through runtime to fix ctx.model undefined
Fixes#3479
Root cause: extensionRunner.initialize() is never called in compact.ts workflow,
leaving ctx.model undefined. Compaction safeguard checks ctx.model and returns
fallback summary immediately without attempting LLM summarization.
Changes:
1. Pass model through compaction safeguard runtime registry (same pattern as maxHistoryShare)
2. Fall back to runtime.model when ctx.model is undefined
3. Add once-per-session warning when both models are missing (prevents log spam)
4. Add regression test for runtime.model fallback
This follows the established runtime registry pattern rather than attempting to call
extensionRunner.initialize() (which is SDK-internal and not meant for direct access).
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* test: add comprehensive tests for compaction-safeguard model fallback
Add integration tests to verify the model fallback behavior:
- Test runtime.model fallback when ctx.model is undefined (compact.ts workflow)
- Test fallback summary when both ctx.model and runtime.model are undefined
- Test contextWindowTokens runtime storage/retrieval
- Test combined runtime values (maxHistoryShare + contextWindowTokens + model)
These tests verify the fix for issue #3479 where compaction fails due to
ctx.model being undefined in the compact.ts workflow. The runtime registry
pattern allows model to be passed when extensionRunner.initialize() is not
called, ensuring summarization works in all code paths.
Related: PR #17864
* fix(test): adapt compaction-safeguard tests to upstream type changes
- Add baseUrl to Model mock objects (now required by Model<Api>)
- Add explicit Model<Api> annotation to prevent provider string widening
- Cast modelRegistry mock through unknown (ModelRegistry expanded)
- Use non-null assertion for compactionHandler (TypeScript strict)
- Type compaction result explicitly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Compaction: add changelog credit for model fallback fix
* Update CHANGELOG.md
* Update CHANGELOG.md
---------
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-authored-by: Vincent Koc <vincentkoc@ieee.org>
* includes: prompt overhead in compaction safeguard calculation.
Subtracts SUMMARIZATION_OVERHEAD_TOKENS from maxChunkTokens in both the main summarization path and the dropped-messages summarization path.
This ensures the chunk budget leaves room for the prompt overhead that generateSummary wraps around each chunk.
* adds: budget for overhead tokens to use an effectiveMax instead of maxTokens naïvely.
- Added `SUMMARIZATION_OVERHEAD_TOKENS = 4096` — a budget for the tokens that `generateSummary` adds on top of the serialized conversation (system prompt, `<conversation>` tags, summarization instructions, `<previous-summary>` block, and reasoning: "high" thinking budget).
- `chunkMessagesByMaxTokens` now divides `maxTokens` by `SAFETY_MARGIN` (1.2) before comparing against estimated token counts. Previously, the safety margin was only used in `computeAdaptiveChunkRatio` and `isOversizedForSummary` but not in the actual chunking loop — so chunks could be built that fit the estimated budget but exceeded the real budget once the API tokenized them properly.
- Add readWorkspaceContextForSummary() to extract Session Startup + Red Lines from AGENTS.md
- Inject workspace context into compaction summary (limited to 2000 chars)
- Export extractSections() from post-compaction-context.ts for reuse
- Ensures compaction summary includes core rules needed for recovery
Part 1 of post-compaction context injection feature.
* fix(ui): allow relative URLs in avatar validation
The isAvatarUrl check only accepted http://, https://, or data: URLs,
but the /avatar/{agentId} endpoint returns relative paths like /avatar/main.
This caused local file avatars to display as text instead of images.
Fixes avatar display for locally configured avatar files.
* fix(gateway): resolve local avatars to URL in HTML injection and RPC
The frontend fix alone wasn't enough because:
1. serveIndexHtml() was injecting the raw avatar filename into HTML
2. agent.identity.get RPC was returning raw filename, overwriting the
HTML-injected value
Now both paths resolve local file avatars (*.png, *.jpg, etc.) to the
/avatar/{agentId} endpoint URL.
* feat(compaction): add adaptive chunk sizing and progressive fallback
- Add computeAdaptiveChunkRatio() to reduce chunk size for large messages
- Add isOversizedForSummary() to detect messages too large to summarize
- Add summarizeWithFallback() with progressive fallback:
- Tries full summarization first
- Falls back to partial summarization excluding oversized messages
- Notes oversized messages in the summary output
- Add SAFETY_MARGIN (1.2x) buffer for token estimation inaccuracy
- Reduce MIN_CHUNK_RATIO to 0.15 for very large messages
This prevents compaction failures when conversations contain
unusually large tool outputs or responses that exceed the
summarization model's context window.
* feat(ui): add compaction indicator and improve event error handling
Compaction indicator:
- Add CompactionStatus type and handleCompactionEvent() in app-tool-stream.ts
- Show '🧹 Compacting context...' toast while active (with pulse animation)
- Show '🧹 Context compacted' briefly after completion
- Auto-clear toast after 5 seconds
- Add CSS styles for .callout.info, .callout.success, .compaction-indicator
Error handling improvements:
- Wrap onEvent callback in try/catch in gateway.ts to prevent errors
from breaking the WebSocket message handler
- Wrap handleGatewayEvent in try/catch with console.error logging
to isolate errors and make them visible in devtools
These changes address UI freezes during heavy agent activity by:
1. Showing users when compaction is happening
2. Preventing uncaught errors from silently breaking the event loop
* fix(control-ui): add agentId to DEFAULT_ASSISTANT_IDENTITY
TypeScript inferred the union type without agentId when falling back to
DEFAULT_ASSISTANT_IDENTITY, causing build errors at control-ui.ts:222-223.