* fix(mattermost): add WebSocket reconnection with exponential backoff
Fixes#13980
The Mattermost WebSocket monitor had no error handling around the
reconnection loop. When connectOnce() threw (e.g. 'fetch failed' from
network issues), the error propagated through the while loop, causing
the gateway to log 'channel exited' and never restart.
Extract runWithReconnect() utility that:
- Catches thrown errors from connectFn and retries
- Uses exponential backoff (2s→4s→8s→...→60s cap)
- Resets backoff after successful connections
- Stops cleanly on abort signal
- Reports errors and reconnect delays via callbacks
* fix(mattermost): make backoff sleep abort-aware and reject on WS connect failure
* fix(mattermost): clean up abort listener on normal timeout to prevent leak
* fix(mattermost): skip error reporting when abort causes connection rejection
* fix(mattermost): use try/finally for abort listener cleanup in connectOnce
* fix: force-close WebSocket on error to prevent reconnect hang
* fix: use ws.terminate() on abort for reliable teardown during CONNECTING state
* fix(mattermost): use initial retry delay for reconnect backoff
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix(media): recognize MP3 and M4A as voice-compatible audio
Telegram sendVoice supports OGG/Opus, MP3, and M4A, but
isVoiceCompatibleAudio only recognized OGG/Opus formats.
- Add MP3 and M4A extensions and MIME types
- Use explicit MIME set instead of substring matching
- Handle MIME parameters (e.g. 'audio/ogg; codecs=opus')
- Add test coverage for all supported and unsupported formats
* fix: narrow MIME allowlist per review feedback
Remove audio/mp4 and audio/aac from voice MIME types — too broad.
Keep only M4A-specific types (audio/x-m4a, audio/m4a).
Add audio/mp4 and audio/aac as negative test cases.
* fix: align voice compatibility and channel coverage (#15438) (thanks @azade-c)
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix: preserve ${VAR} env var references when writing config back to disk
Fixes#11466
When config is loaded, ${VAR} references are resolved to their plaintext
values. Previously, writeConfigFile would serialize the resolved values,
silently replacing "${ANTHROPIC_API_KEY}" with "sk-ant-api03-..." in the
config file.
Now writeConfigFile reads the current file pre-substitution, and for each
value that matches what a ${VAR} reference would resolve to, restores the
original reference. Values the caller intentionally changed are kept as-is.
This fixes all 50+ writeConfigFile call sites (doctor, configure wizard,
gateway config.set/apply/patch, plugins, hooks, etc.) without requiring
any caller changes.
New files:
- src/config/env-preserve.ts — restoreEnvVarRefs() utility
- src/config/env-preserve.test.ts — 11 unit tests
* fix: remove global config env snapshot race
* docs(changelog): note config env snapshot race fix
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix: enforce Telegram 100-command limit with warning (#5787)
Telegram's setMyCommands API rejects requests with more than 100 commands.
When skills + custom + plugin commands exceed the limit, truncate to 100
and warn the user instead of silently failing on every startup.
* fix: enforce Telegram menu cap + keep hidden commands callable (#15844) (thanks @battman21)
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
* fix: ensure CLI exits after command completion
The CLI process would hang indefinitely after commands like
`openclaw gateway restart` completed successfully. Two root causes:
1. `runCli()` returned without calling `process.exit()` after
`program.parseAsync()` resolved, and Commander.js does not
force-exit the process.
2. `daemon-cli/register.ts` eagerly called `createDefaultDeps()`
which imported all messaging-provider modules, creating persistent
event-loop handles that prevented natural Node exit.
Changes:
- Add `flushAndExit()` helper that drains stdout/stderr before calling
`process.exit()`, preventing truncated piped output in CI/scripts.
- Call `flushAndExit()` after both `tryRouteCli()` and
`program.parseAsync()` resolve.
- Remove unnecessary `void createDefaultDeps()` from daemon-cli
registration — daemon lifecycle commands never use messaging deps.
- Make `serveAcpGateway()` return a promise that resolves on
intentional shutdown (SIGINT/SIGTERM), so `openclaw acp` blocks
`parseAsync` for the bridge lifetime and exits cleanly on signal.
- Handle the returned promise in the standalone main-module entry
point to avoid unhandled rejections.
Fixes#12904
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix: refactor CLI lifecycle and lazy outbound deps (#12906) (thanks @DrCrinkle)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
The Matrix channel previously hardcoded `listMatrixAccountIds` to always
return only `DEFAULT_ACCOUNT_ID`, ignoring any accounts configured in
`channels.matrix.accounts`. This prevented running multiple Matrix bot
accounts simultaneously.
Changes:
- Update `listMatrixAccountIds` to read from `channels.matrix.accounts`
config, falling back to `DEFAULT_ACCOUNT_ID` for legacy single-account
configurations
- Add `resolveMatrixConfigForAccount` to resolve config for a specific
account ID, merging account-specific values with top-level defaults
- Update `resolveMatrixAccount` to use account-specific config when
available
- The multi-account config structure (channels.matrix.accounts) was not
defined in the MatrixConfig type, causing TypeScript to not recognize
the field. Added the accounts field to properly type the multi-account
configuration.
- Add stopSharedClientForAccount() to stop only the specific account's
client instead of all clients when an account shuts down
- Wrap dynamic import in try/finally to prevent startup mutex deadlock
if the import fails
- Pass accountId to resolveSharedMatrixClient(), resolveMatrixAuth(),
and createMatrixClient() to ensure the correct account's credentials
are used for outbound messages
- Add accountId parameter to resolveMediaMaxBytes to check account-specific
config before falling back to top-level config
- Maintain backward compatibility with existing single-account setups
This follows the same pattern already used by the WhatsApp channel for
multi-account support.
Fixes#3165Fixes#3085
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Browser/Security: constrain trace and download output paths to temp roots
* Changelog: remove advisory ID from pre-public security note
* Browser/Security: constrain trace and download output paths to temp roots
* Changelog: remove advisory ID from pre-public security note
* test(bluebubbles): align timeout status expectation to 408
* test(discord): remove unused race-condition counter in threading test
* test(bluebubbles): align timeout status expectation to 408
* feat(gateway): add register and awaitDecision methods to ExecApprovalManager
Separates registration (synchronous) from waiting (async) to allow callers
to confirm registration before the decision is made. Adds grace period for
resolved entries to prevent race conditions.
* feat(gateway): add two-phase response and waitDecision handler for exec approvals
Send immediate 'accepted' response after registration so callers can confirm
the approval ID is valid. Add exec.approval.waitDecision endpoint to wait for
decision on already-registered approvals.
* fix(exec): await approval registration before returning approval-pending
Ensures the approval ID is registered in the gateway before the tool returns.
Uses exec.approval.request with expectFinal:false for registration, then
fire-and-forget exec.approval.waitDecision for the decision phase.
Fixes#2402
* test(gateway): update exec-approval test for two-phase response
Add assertion for immediate 'accepted' response before final decision.
* test(exec): update approval-id test mocks for new two-phase flow
Mock both exec.approval.request (registration) and exec.approval.waitDecision
(decision) calls to match the new internal implementation.
* fix(lint): add cause to errors, use generics instead of type assertions
* fix(exec-approval): guard register() against duplicate IDs
* fix: remove unused timeoutMs param, guard register() against duplicates
* fix(exec-approval): throw on duplicate ID, capture entry in closure
* fix: return error on timeout, remove stale test mock branch
* fix: wrap register() in try/catch, make timeout handling consistent
* fix: update snapshot on timeout, make two-phase response opt-in
* fix: extend grace period to 15s, return 'expired' status
* fix: prevent double-resolve after timeout
* fix: make register() idempotent, capture snapshot before await
* fix(gateway): complete two-phase exec approval wiring
* fix: finalize exec approval race fix (openclaw#3357) thanks @ramin-shirali
* fix(protocol): regenerate exec approval request models (openclaw#3357) thanks @ramin-shirali
* fix(test): remove unused callCount in discord threading test
---------
Co-authored-by: rshirali <rshirali@rshirali-haga.local>
Co-authored-by: rshirali <rshirali@rshirali-haga-1.home>
Co-authored-by: Peter Steinberger <steipete@gmail.com>