Commit Graph

7898 Commits

Author SHA1 Message Date
Peter Steinberger
dac8f5ba3f perf(test): trim fixture and import overhead in hot suites 2026-02-13 23:16:41 +00:00
Brandon Wise
b0728e605d fix(cron): skip relay only for explicit delivery config, not legacy payload
Fixes #15692

The previous fix was too broad — it removed the relay for ALL isolated jobs.
This broke backwards compatibility for jobs without explicit delivery config.

The correct behavior is:
- If job.delivery exists → isolated runner handles it via runSubagentAnnounceFlow
- If only legacy payload.deliver fields → relay to main if requested (original behavior)

This addresses Greptile's review feedback about runIsolatedAgentJob being an
injected dependency that might not call runSubagentAnnounceFlow.

Uses resolveCronDeliveryPlan().source to distinguish between explicit delivery
config and legacy payload-only jobs.
2026-02-14 00:08:56 +01:00
Peter Steinberger
45a2cd55cc fix: harden isolated cron announce delivery fallback (#15739) (thanks @widingmarcus-cyber) 2026-02-13 23:49:10 +01:00
Marcus Widing
ea95e88dd6 fix(cron): prevent duplicate delivery for isolated jobs with announce mode
When an isolated cron job delivers its output via deliverOutboundPayloads
or the subagent announce flow, the finish handler in executeJobCore
unconditionally posts a summary to the main agent session and wakes it
via requestHeartbeatNow. The main agent then generates a second response
that is also delivered to the target channel, resulting in duplicate
messages with different content.

Add a `delivered` flag to RunCronAgentTurnResult that is set to true
when the isolated run successfully delivers its output. In executeJobCore,
skip the enqueueSystemEvent + requestHeartbeatNow call when the flag is
set, preventing the main agent from waking up and double-posting.

Fixes #15692
2026-02-13 23:49:10 +01:00
nabbilkhan
207e2c5aff fix: add outbound delivery crash recovery (#15636) (thanks @nabbilkhan) (#15636)
Co-authored-by: Shadow <hi@shadowing.dev>
2026-02-13 15:54:07 -06:00
Peter Steinberger
caebe70e9a perf(test): cut setup/import overhead in hot suites 2026-02-13 21:23:50 +00:00
Joseph Krug
4e9f933e88 fix: reset stale execution state after SIGUSR1 in-process restart (#15195)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 676f9ec45135be0d3471bb0444bc2ac7ce7d5224
Co-authored-by: joeykrug <5925937+joeykrug@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-13 15:30:09 -05:00
Peter Steinberger
2086cdfb9b perf(test): reduce hot-suite import and setup overhead 2026-02-13 20:26:39 +00:00
Peter Steinberger
1655df7ac0 fix(config): log config overwrite audits 2026-02-13 20:12:41 +00:00
Peter Steinberger
6442512954 perf: reduce hotspot test startup and timeout costs 2026-02-13 20:03:01 +00:00
Marcus Castro
31537c669a fix: archive old transcript files on /new and /reset (#14949)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 4724df7dea247970b909ef8d293ba4a612b7b1b4
Co-authored-by: mcaxtr <7562095+mcaxtr@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-13 14:55:16 -05:00
Peter Steinberger
c8b198ab51 perf: speed up gateway missing-tick e2e watchdog 2026-02-13 19:52:45 +00:00
Peter Steinberger
e746a67cc3 perf: speed up telegram media e2e flush timing 2026-02-13 19:52:45 +00:00
大猫子
f24d70ec8e fix(providers): switch MiniMax API-key provider to anthropic-messages (#15297)
Merged via /review-pr -> /prepare-pr -> /merge-pr.

Prepared head SHA: 0e7f84a2a103135221b73e2c3f300790206fc6f4
Co-authored-by: lailoo <20536249+lailoo@users.noreply.github.com>
Co-authored-by: gumadeiras <5599352+gumadeiras@users.noreply.github.com>
Reviewed-by: @gumadeiras
2026-02-13 14:44:36 -05:00
Marcus Castro
4225206f0c fix(gateway): normalize session key casing to prevent ghost sessions (#12846)
* fix(gateway): normalize session key casing to prevent ghost sessions on Linux

On case-sensitive filesystems (Linux), mixed-case session keys like
agent:ops:MySession and agent:ops:mysession resolve to different store
entries, creating ghost duplicates that never converge.

Core changes in session-utils.ts:
- resolveSessionStoreKey: lowercase all session key components
- canonicalizeSpawnedByForAgent: accept cfg, resolve main-alias references
  via canonicalizeMainSessionAlias after lowercasing
- loadSessionEntry: return legacyKey only when it differs from canonicalKey
- resolveGatewaySessionStoreTarget: scan store for case-insensitive matches;
  add optional scanLegacyKeys param to skip disk reads for read-only callers
- Export findStoreKeysIgnoreCase for use by write-path consumers
- Compare global/unknown sentinels case-insensitively in all canonicalization
  functions

sessions-resolve.ts:
- Make resolveSessionKeyFromResolveParams async for inline migration
- Check canonical key first (fast path), then fall back to legacy scan
- Delete ALL legacy case-variant keys in a single updateSessionStore pass

Fixes #12603

* fix(gateway): propagate canonical keys and clean up all case variants on write paths

- agent.ts: use canonicalizeSpawnedByForAgent (with cfg) instead of raw
  toLowerCase; use findStoreKeysIgnoreCase to delete all legacy variants
  on store write; pass canonicalKey to addChatRun, registerAgentRunContext,
  resolveSendPolicy, and agentCommand
- sessions.ts: replace single-key migration with full case-variant cleanup
  via findStoreKeysIgnoreCase in patch/reset/delete/compact handlers; add
  case-insensitive fallback in preview (store already loaded); make
  sessions.resolve handler async; pass scanLegacyKeys: false in preview
- server-node-events.ts: use findStoreKeysIgnoreCase to clean all legacy
  variants on voice.transcript and agent.request write paths; pass
  canonicalKey to addChatRun and agentCommand

* test(gateway): add session key case-normalization tests

Cover the case-insensitive session key canonicalization logic:
- resolveSessionStoreKey normalizes mixed-case bare and prefixed keys
- resolveSessionStoreKey resolves mixed-case main aliases (MAIN, Main)
- resolveGatewaySessionStoreTarget includes legacy mixed-case store keys
- resolveGatewaySessionStoreTarget collects all case-variant duplicates
- resolveGatewaySessionStoreTarget finds legacy main alias keys with
  customized mainKey configuration

All 5 tests fail before the production changes, pass after.

* fix: clean legacy session alias cleanup gaps (openclaw#12846) thanks @mcaxtr

---------

Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 20:42:24 +01:00
Peter Steinberger
f02247b6c5 fix(ci): fix discord proxy websocket binding and bluebubbles timeout status 2026-02-13 19:35:55 +00:00
rodbland2021
d3b2135f86 fix(agents): wait for agent idle before flushing pending tool results (#13746)
* fix(agents): wait for agent idle before flushing pending tool results

When pi-agent-core's auto-retry mechanism handles overloaded/rate-limit
errors, it resolves waitForRetry() on assistant message receipt — before
tool execution completes in the retried agent loop. This causes the
attempt's finally block to call flushPendingToolResults() while tools
are still executing, inserting synthetic 'missing tool result' errors
and causing silent agent failures.

The fix adds a waitForIdle() call before the flush to ensure the agent's
retry loop (including tool execution) has fully completed.

Evidence from real session: tool call and synthetic error were only 53ms
apart — the tool never had a chance to execute before being flushed.

Root cause is in pi-agent-core's _resolveRetry() firing on message_end
instead of agent_end, but this workaround in OpenClaw prevents the
symptom without requiring an upstream fix.

Fixes #8643
Fixes #13351
Refs #6682, #12595

* test: add tests for tool result flush race condition

Validates that:
- Real tool results are not replaced by synthetic errors when they arrive in time
- Flush correctly inserts synthetic errors for genuinely orphaned tool calls
- Flush is a no-op after real tool results have already been received

Refs #8643, #13748

* fix(agents): add waitForIdle to all flushPendingToolResults call sites

The original fix only covered the main run finally block, but there are
two additional call sites that can trigger flushPendingToolResults while
tools are still executing:

1. The catch block in attempt.ts (session setup error handler)
2. The finally block in compact.ts (compaction teardown)

Both now await agent.waitForIdle() with a 30s timeout before flushing,
matching the pattern already applied to the main finally block.

Production testing on VPS with debug logging confirmed these additional
paths can fire during sub-agent runs, producing spurious synthetic
'missing tool result' errors.

* fix(agents): centralize idle-wait flush and clear timeout handle

---------

Co-authored-by: Renue Development <dev@renuebyscience.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 20:35:43 +01:00
Shadow
4b3c87b82d fix: finalize discord presence config (#10855) (thanks @h0tp-ftw) 2026-02-13 13:34:19 -06:00
Shadow
6acea69b20 Discord: refine presence config defaults (#10855) (thanks @h0tp-ftw) 2026-02-13 13:34:19 -06:00
h0tp
770e904c21 fix(discord): restrict activity types and statuses to valid enum values
- Removed 'offline' from valid config statuses (use 'invisible').
- Restricted activityType to 0, 1, 2, 3, 5 (excluding custom/4).
- Added logic to only send 'url' when activityType is 1 (Streaming).
- Updated Typescript definitions and Zod schemas to match.
2026-02-13 13:34:19 -06:00
h0tp
5d8c6ef91c feat(discord): add configurable presence (activity/status/type)
- Adds `activity`, `status`, `activityType`, and `activityUrl` to Discord provider config schema.
- Implements a `ReadyListener` in `DiscordProvider` to apply these settings on connection.
- Solves the issue where `@buape/carbon` ignores initial presence options in constructor.
- Validated manually and via existing test suite.
2026-02-13 13:34:19 -06:00
Peter Steinberger
c801ffdf99 perf: add zero-delay gateway client connect for tests 2026-02-13 19:32:16 +00:00
Shadow
5645f227f6 Discord: add gateway proxy docs and tests (#10400) (thanks @winter-loo) 2026-02-13 13:26:51 -06:00
ludd50155
e55431bf84 fix(discord): restore gateway reconnect maxAttempts to 50 2026-02-13 13:26:51 -06:00
ludd50155
5f0debdfb2 Fix: check cleanups 2026-02-13 13:26:51 -06:00
ludd50155
0cb69b0f28 Discord: add gateway proxy support
Conflicts:
	package.json
	pnpm-lock.yaml
	src/config/schema.ts
	src/discord/monitor/provider.ts
2026-02-13 13:26:51 -06:00
Mariano
7f0489e473 Security/Browser: constrain trace and download output paths to OpenClaw temp roots (#15652)
* Browser/Security: constrain trace and download output paths to temp roots

* Changelog: remove advisory ID from pre-public security note

* Browser/Security: constrain trace and download output paths to temp roots

* Changelog: remove advisory ID from pre-public security note

* test(bluebubbles): align timeout status expectation to 408

* test(discord): remove unused race-condition counter in threading test

* test(bluebubbles): align timeout status expectation to 408
2026-02-13 19:24:33 +00:00
Peter Steinberger
08725270e2 perf: honor low timeout budgets in health telegram probes 2026-02-13 19:22:25 +00:00
Peter Steinberger
7d1be585de test: fix exec approval and pty fallback e2e flows 2026-02-13 19:19:15 +00:00
Peter Steinberger
34eb14d24f perf: trim web auto-reply test cleanup backoff 2026-02-13 19:19:11 +00:00
Peter Steinberger
1c7a099b6d test: move reasoning replay regression to unit suite 2026-02-13 19:09:41 +00:00
Peter Steinberger
4c401d336d refactor(memory): extract manager sync and embedding ops 2026-02-13 19:08:37 +00:00
Peter Steinberger
b47fa9e715 refactor(exec): extract bash tool runtime internals 2026-02-13 19:08:37 +00:00
Peter Steinberger
3f5e72835e refactor(tts): extract directives and provider core 2026-02-13 19:08:37 +00:00
Peter Steinberger
83bc73f4ea refactor(exec-approvals): split allowlist evaluation module 2026-02-13 19:08:37 +00:00
Peter Steinberger
81fbfa06ee refactor(exec-approvals): extract command analysis module 2026-02-13 19:08:37 +00:00
Peter Steinberger
2a1f8b2615 refactor(media): extract runner entry execution helpers 2026-02-13 19:08:37 +00:00
Peter Steinberger
1d46d3ae4e refactor(node-host): extract invoke handlers 2026-02-13 19:08:37 +00:00
Peter Steinberger
02684b913b refactor(cli): split update command modules 2026-02-13 19:08:37 +00:00
Peter Steinberger
39af215c31 refactor(outbound): extract message action param helpers 2026-02-13 19:08:37 +00:00
Peter Steinberger
23555de5d9 refactor(security): extract channel audit checks 2026-02-13 19:08:37 +00:00
Peter Steinberger
ca3a42009c refactor(memory): extract qmd scope helpers 2026-02-13 19:08:37 +00:00
Peter Steinberger
c256503ea1 refactor(infra): extract session cost usage types 2026-02-13 19:08:37 +00:00
Peter Steinberger
5a431f57fc refactor(infra): split heartbeat event filters 2026-02-13 19:08:37 +00:00
Peter Steinberger
a79c2de956 refactor(gateway): extract ws auth message helpers 2026-02-13 19:08:37 +00:00
Peter Steinberger
5429f2e635 refactor(line): split flex template builders 2026-02-13 19:08:37 +00:00
Shadow
71939523a0 fix: normalize Discord autoThread reply target (#8302) (thanks @gavinbmoore) 2026-02-13 13:04:55 -06:00
Claw
e65b649993 fix(discord): ensure autoThread replies route to existing threads
Fixes #8278

When autoThread is enabled and a thread already exists (user continues
conversation in thread), replies were sometimes routing to the root
channel instead of the thread. This happened because the reply delivery
plan only explicitly set the thread target when a NEW thread was created
(createdThreadId), but not when the message was in an existing thread.

The fix adds a fallback case: when threadChannel is set (we're in an
existing thread) but no new thread was created, explicitly route to
the thread's channel ID. This ensures all thread replies go to the
correct destination.
2026-02-13 13:04:55 -06:00
Ramin Shirali Hossein Zade
1af0edf7ff fix: ensure exec approval is registered before returning (#2402) (#3357)
* feat(gateway): add register and awaitDecision methods to ExecApprovalManager

Separates registration (synchronous) from waiting (async) to allow callers
to confirm registration before the decision is made. Adds grace period for
resolved entries to prevent race conditions.

* feat(gateway): add two-phase response and waitDecision handler for exec approvals

Send immediate 'accepted' response after registration so callers can confirm
the approval ID is valid. Add exec.approval.waitDecision endpoint to wait for
decision on already-registered approvals.

* fix(exec): await approval registration before returning approval-pending

Ensures the approval ID is registered in the gateway before the tool returns.
Uses exec.approval.request with expectFinal:false for registration, then
fire-and-forget exec.approval.waitDecision for the decision phase.

Fixes #2402

* test(gateway): update exec-approval test for two-phase response

Add assertion for immediate 'accepted' response before final decision.

* test(exec): update approval-id test mocks for new two-phase flow

Mock both exec.approval.request (registration) and exec.approval.waitDecision
(decision) calls to match the new internal implementation.

* fix(lint): add cause to errors, use generics instead of type assertions

* fix(exec-approval): guard register() against duplicate IDs

* fix: remove unused timeoutMs param, guard register() against duplicates

* fix(exec-approval): throw on duplicate ID, capture entry in closure

* fix: return error on timeout, remove stale test mock branch

* fix: wrap register() in try/catch, make timeout handling consistent

* fix: update snapshot on timeout, make two-phase response opt-in

* fix: extend grace period to 15s, return 'expired' status

* fix: prevent double-resolve after timeout

* fix: make register() idempotent, capture snapshot before await

* fix(gateway): complete two-phase exec approval wiring

* fix: finalize exec approval race fix (openclaw#3357) thanks @ramin-shirali

* fix(protocol): regenerate exec approval request models (openclaw#3357) thanks @ramin-shirali

* fix(test): remove unused callCount in discord threading test

---------

Co-authored-by: rshirali <rshirali@rshirali-haga.local>
Co-authored-by: rshirali <rshirali@rshirali-haga-1.home>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-02-13 19:57:02 +01:00
Shadow
c87e481ec9 Discord: fix voice duration error handling 2026-02-13 12:44:14 -06:00