---
summary: "Doctor command: health checks, config migrations, and repair steps"
read_when:
- Adding or modifying doctor migrations
- Introducing breaking config changes
title: "Doctor"
---
# Doctor
`openclaw doctor` is the repair + migration tool for OpenClaw. It fixes stale
config/state, checks health, and provides actionable repair steps.
## Quick start
```bash
openclaw doctor
```
### Headless / automation
```bash
openclaw doctor --yes
```
Accept defaults without prompting (including restart/service/sandbox repair steps when applicable).
```bash
openclaw doctor --repair
```
Apply recommended repairs without prompting (repairs + restarts where safe).
```bash
openclaw doctor --repair --force
```
Apply aggressive repairs too (overwrites custom supervisor configs).
```bash
openclaw doctor --non-interactive
```
Run without prompts and only apply safe migrations (config normalization + on-disk state moves). Skips restart/service/sandbox actions that require human confirmation.
Legacy state migrations run automatically when detected.
```bash
openclaw doctor --deep
```
Scan system services for extra gateway installs (launchd/systemd/schtasks).

If you want to review changes before writing, open the config file first:
```bash
cat ~/.openclaw/openclaw.json
```
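
To compare the config before and after a doctor run, one option is to keep a copy first. This is a minimal sketch; `backup_config` is a hypothetical helper name, not an `openclaw` subcommand, and the `.bak` suffix is an arbitrary choice.

```bash
# Hypothetical helper (not part of the openclaw CLI): copy the config to a
# sibling .bak file and print the backup path.
backup_config() {
  local config="${1:-$HOME/.openclaw/openclaw.json}"
  if [ ! -f "$config" ]; then
    echo "no config at $config" >&2
    return 1
  fi
  cp -p "$config" "$config.bak" && printf '%s\n' "$config.bak"
}

# Usage:
#   backup_config && openclaw doctor
#   diff ~/.openclaw/openclaw.json.bak ~/.openclaw/openclaw.json
```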
## What it does (summary)
- Optional pre-flight update for git installs (interactive only).
- UI protocol freshness check (rebuilds Control UI when the protocol schema is newer).
- Health check + restart prompt.
- Skills status summary (eligible/missing/blocked).
- Config normalization for legacy values.
- OpenCode provider override warnings (`models.providers.opencode` / `models.providers.opencode-go`).
- Legacy on-disk state migration (sessions/agent dir/WhatsApp auth).
- Legacy cron store migration (`jobId`, `schedule.cron`, top-level delivery/payload fields, payload `provider`, simple `notify: true` webhook fallback jobs).
- State integrity and permissions checks (sessions, transcripts, state dir).
- Config file permission checks (chmod 600) when running locally.
- Model auth health: checks OAuth expiry, can refresh expiring tokens, and reports auth-profile cooldown/disabled states.
- Extra workspace dir detection (`~/openclaw`).
- Sandbox image repair when sandboxing is enabled.
- Legacy service migration and extra gateway detection.
- Gateway runtime checks (service installed but not running; cached launchd label).
- Channel status warnings (probed from the running gateway).
- Supervisor config audit (launchd/systemd/schtasks) with optional repair.
- Gateway runtime best-practice checks (Node vs Bun, version-manager paths).
- Gateway port collision diagnostics (default `18789`).
- Security warnings for open DM policies.
- Gateway auth checks for local token mode (offers token generation when no token source exists; does not overwrite token SecretRef configs).
- systemd linger check on Linux.
- Source install checks (pnpm workspace mismatch, missing UI assets, missing tsx binary).
- Writes updated config + wizard metadata.
## Detailed behavior and rationale
### 0) Optional update (git installs)
If this is a git checkout and doctor is running interactively, it offers to
update (fetch/rebase/build) before running doctor.
### 1) Config normalization
If the config contains legacy value shapes (for example `messages.ackReaction`
without a channel-specific override), doctor normalizes them into the current
schema.
### 2) Legacy config key migrations
When the config contains deprecated keys, other commands refuse to run and ask
you to run `openclaw doctor`.

Doctor will:

- Explain which legacy keys were found.
- Show the migration it applied.
- Rewrite `~/.openclaw/openclaw.json` with the updated schema.

The Gateway also auto-runs doctor migrations on startup when it detects a
legacy config format, so stale configs are repaired without manual intervention.

Current migrations:
- `routing.allowFrom` → `channels.whatsapp.allowFrom`
- `routing.groupChat.requireMention` → `channels.whatsapp/telegram/imessage.groups."*".requireMention`
- `routing.groupChat.historyLimit` → `messages.groupChat.historyLimit`
- `routing.groupChat.mentionPatterns` → `messages.groupChat.mentionPatterns`
- `routing.queue` → `messages.queue`
- `routing.bindings` → top-level `bindings`
- `routing.agents`/`routing.defaultAgentId` → `agents.list` + `agents.list[].default`
- `routing.agentToAgent` → `tools.agentToAgent`
- `routing.transcribeAudio` → `tools.media.audio.models`
- `bindings[].match.accountID` → `bindings[].match.accountId`
- For channels with named `accounts` but missing `accounts.default`, move account-scoped top-level single-account channel values into `channels.<channel>.accounts.default` when present
- `identity` → `agents.list[].identity`
- `agent.*` → `agents.defaults` + `tools.*` (tools/elevated/exec/sandbox/subagents)
- `agent.model`/`allowedModels`/`modelAliases`/`modelFallbacks`/`imageModelFallbacks` →
  `agents.defaults.models` + `agents.defaults.model.primary/fallbacks` + `agents.defaults.imageModel.primary/fallbacks`
- `browser.ssrfPolicy.allowPrivateNetwork` → `browser.ssrfPolicy.dangerouslyAllowPrivateNetwork`

Doctor warnings also include account-default guidance for multi-account channels:
- If two or more `channels.<channel>.accounts` entries are configured without `channels.<channel>.defaultAccount` or `accounts.default`, doctor warns that fallback routing can pick an unexpected account.
- If `channels.<channel>.defaultAccount` is set to an unknown account ID, doctor warns and lists configured account IDs.
### 2b) OpenCode provider overrides
If you've added `models.providers.opencode`, `opencode-zen`, or `opencode-go`
manually, it overrides the built-in OpenCode catalog from `@mariozechner/pi-ai`.
That can force models onto the wrong API or zero out costs. Doctor warns so you
can remove the override and restore per-model API routing + costs.
### 3) Legacy state migrations (disk layout)
Doctor can migrate older on-disk layouts into the current structure:

- Sessions store + transcripts:
  - from `~/.openclaw/sessions/` to `~/.openclaw/agents/<agentId>/sessions/`
- Agent dir:
  - from `~/.openclaw/agent/` to `~/.openclaw/agents/<agentId>/agent/`
- WhatsApp auth state (Baileys):
  - from legacy `~/.openclaw/credentials/*.json` (except `oauth.json`)
  - to `~/.openclaw/credentials/whatsapp/<accountId>/...` (default account id: `default`)

These migrations are best-effort and idempotent; doctor will emit warnings when
it leaves any legacy folders behind as backups. The Gateway/CLI also auto-migrates
the legacy sessions + agent dir on startup so history/auth/models land in the
per-agent path without a manual doctor run. WhatsApp auth is intentionally only
migrated via `openclaw doctor`.
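
To see whether anything still sits at the legacy locations named above, a quick manual check might look like the sketch below. `legacy_state_dirs` is a hypothetical helper, not a doctor subcommand. A directory at one of these paths is not necessarily stale (doctor may have left it as a backup), so treat the output as a prompt to look, not as a verdict.

```bash
# Hypothetical helper: list directories that still exist at the legacy
# top-level locations (sessions/, agent/) under a state dir.
legacy_state_dirs() {
  local state_dir="${1:-$HOME/.openclaw}" name
  for name in sessions agent; do
    if [ -d "$state_dir/$name" ]; then
      printf '%s\n' "$state_dir/$name"
    fi
  done
}

# Usage:
#   legacy_state_dirs            # checks ~/.openclaw
#   legacy_state_dirs /some/dir  # checks another state dir
```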
### 3b) Legacy cron store migrations
Doctor also checks the cron job store (`~/.openclaw/cron/jobs.json` by default,
or `cron.store` when overridden) for old job shapes that the scheduler still
accepts for compatibility.
Current cron cleanups include:
- `jobId` → `id`
- `schedule.cron` → `schedule.expr`
- top-level payload fields (`message`, `model`, `thinking`, ...) → `payload`
- top-level delivery fields (`deliver`, `channel`, `to`, `provider`, ...) → `delivery`
- payload `provider` delivery aliases → explicit `delivery.channel`
- simple legacy `notify: true` webhook fallback jobs → explicit `delivery.mode="webhook"` with `delivery.to=cron.webhook`

Doctor only auto-migrates `notify: true` jobs when it can do so without
changing behavior. If a job combines legacy notify fallback with an existing
non-webhook delivery mode, doctor warns and leaves that job for manual review.
### 4) State integrity checks (session persistence, routing, and safety)
The state directory is the operational brainstem. If it vanishes, you lose
sessions, credentials, logs, and config (unless you have backups elsewhere).
Doctor checks:
- **State dir missing**: warns about catastrophic state loss, prompts to recreate
the directory, and reminds you that it cannot recover missing data.
- **State dir permissions**: verifies writability; offers to repair permissions
(and emits a `chown` hint when owner/group mismatch is detected).
- **macOS cloud-synced state dir**: warns when state resolves under iCloud Drive
(`~/Library/Mobile Documents/com~apple~CloudDocs/...`) or
`~/Library/CloudStorage/...` because sync-backed paths can cause slower I/O
and lock/sync races.
- **Linux SD or eMMC state dir**: warns when state resolves to an `mmcblk*`
mount source, because SD or eMMC-backed random I/O can be slower and wear
faster under session and credential writes.
- **Session dirs missing**: `sessions/` and the session store directory are
required to persist history and avoid `ENOENT` crashes.
- **Transcript mismatch**: warns when recent session entries have missing
transcript files.
- **Main session “1-line JSONL”**: flags when the main transcript has only one
line (history is not accumulating).
- **Multiple state dirs**: warns when multiple `~/.openclaw` folders exist across
home directories or when `OPENCLAW_STATE_DIR` points elsewhere (history can
split between installs).
- **Remote mode reminder**: if `gateway.mode=remote`, doctor reminds you to run
it on the remote host (the state lives there).
- **Config file permissions**: warns if `~/.openclaw/openclaw.json` is
group/world readable and offers to tighten to `600`.
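
The config file permission check can be reproduced by hand. This is a minimal sketch, assuming GNU or BSD `stat`; `file_mode` and `tighten_config` are hypothetical helper names, and doctor's own logic may differ.

```bash
# Print the octal permission bits of a file (GNU stat first, BSD stat fallback).
file_mode() {
  stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1"
}

# Tighten to 600 when group or others have any access bits set.
tighten_config() {
  local file="$1" mode
  mode="$(file_mode "$file")" || return 1
  case "$mode" in
    *00|0) echo "ok: $file is $mode" ;;
    *)     chmod 600 "$file" && echo "tightened: $file $mode -> 600" ;;
  esac
}

# Usage:
#   tighten_config ~/.openclaw/openclaw.json
```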
### 5) Model auth health (OAuth expiry)
Doctor inspects OAuth profiles in the auth store, warns when tokens are
expiring/expired, and can refresh them when safe. If the Anthropic Claude Code
profile is stale, it suggests running `claude setup-token` (or pasting a setup-token).
Refresh prompts only appear when running interactively (TTY); `--non-interactive`
skips refresh attempts.
Doctor also reports auth profiles that are temporarily unusable due to:
- short cooldowns (rate limits/timeouts/auth failures)
- longer disables (billing/credit failures)
### 6) Hooks model validation
If `hooks.gmail.model` is set, doctor validates the model reference against the
catalog and allowlist and warns when it won't resolve or is disallowed.
### 7) Sandbox image repair
When sandboxing is enabled, doctor checks Docker images and offers to build or
switch to legacy names if the current image is missing.
### 8) Gateway service migrations and cleanup hints
Doctor detects legacy gateway services (launchd/systemd/schtasks) and
offers to remove them and install the OpenClaw service using the current gateway
port. It can also scan for extra gateway-like services and print cleanup hints.
Profile-named OpenClaw gateway services are considered first-class and are not
flagged as "extra."
### 9) Security warnings
Doctor emits warnings when a provider is open to DMs without an allowlist, or
when a policy is configured in a dangerous way.
### 10) systemd linger (Linux)
If running as a systemd user service, doctor ensures lingering is enabled so the
gateway stays alive after logout.
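
To inspect the linger setting yourself, `loginctl` exposes it as a user property. The sketch below is one way to read it; `linger_enabled` and `report_linger` are hypothetical helper names, and this is not necessarily how doctor performs the check.

```bash
# True when systemd reports Linger=yes for the user (default: current user).
linger_enabled() {
  local user="${1:-$(id -un)}"
  [ "$(loginctl show-user "$user" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}

# Print a one-line status; falls back gracefully on hosts without loginctl.
report_linger() {
  if ! command -v loginctl >/dev/null 2>&1; then
    echo "linger: loginctl not found (probably not a systemd host)"
  elif linger_enabled; then
    echo "linger: enabled"
  else
    echo "linger: not enabled (enable with: loginctl enable-linger $(id -un))"
  fi
}

# Usage:
#   report_linger
```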
### 11) Skills status
Doctor prints a quick summary of eligible/missing/blocked skills for the current
workspace.
### 12) Gateway auth checks (local token)
Doctor checks local gateway token auth readiness.
- If token mode needs a token and no token source exists, doctor offers to generate one.
- If `gateway.auth.token` is SecretRef-managed but unavailable, doctor warns and does not overwrite it with plaintext.
- `openclaw doctor --generate-gateway-token` forces generation only when no token SecretRef is configured.
### 12b) Read-only SecretRef-aware repairs
Some repair flows need to inspect configured credentials without weakening runtime fail-fast behavior.
- `openclaw doctor --fix` now uses the same read-only SecretRef summary model as status-family commands for targeted config repairs.
- Example: Telegram `allowFrom` / `groupAllowFrom` `@username` repair tries to use configured bot credentials when available.
- If the Telegram bot token is configured via SecretRef but unavailable in the current command path, doctor reports that the credential is configured-but-unavailable and skips auto-resolution instead of crashing or misreporting the token as missing.
### 13) Gateway health check + restart
Doctor runs a health check and offers to restart the gateway when it looks
unhealthy.
### 14) Channel status warnings
If the gateway is healthy, doctor runs a channel status probe and reports
warnings with suggested fixes.
### 15) Supervisor config audit + repair
Doctor checks the installed supervisor config (launchd/systemd/schtasks) for
missing or outdated defaults (e.g., systemd network-online dependencies and
restart delay). When it finds a mismatch, it recommends an update and can
rewrite the service file/task to the current defaults.

Notes:
- `openclaw doctor` prompts before rewriting supervisor config.
- `openclaw doctor --yes` accepts the default repair prompts.
- `openclaw doctor --repair` applies recommended fixes without prompts.
- `openclaw doctor --repair --force` overwrites custom supervisor configs.
- If token auth requires a token and `gateway.auth.token` is SecretRef-managed, doctor service install/repair validates the SecretRef but does not persist resolved plaintext token values into supervisor service environment metadata.
- If token auth requires a token and the configured token SecretRef is unresolved, doctor blocks the install/repair path with actionable guidance.
- If both `gateway.auth.token` and `gateway.auth.password` are configured and `gateway.auth.mode` is unset, doctor blocks install/repair until mode is set explicitly.
- For Linux user-systemd units, doctor token drift checks now include both `Environment=` and `EnvironmentFile=` sources when comparing service auth metadata.
- You can always force a full rewrite via `openclaw gateway install --force`.
### 16) Gateway runtime + port diagnostics
Doctor inspects the service runtime (PID, last exit status) and warns when the
service is installed but not actually running. It also checks for port collisions
on the gateway port (default `18789`) and reports likely causes (gateway already
running, SSH tunnel).
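
To check by hand whether something already answers on the gateway port, a rough sketch follows. It assumes bash (for its `/dev/tcp` redirection) and only probes `127.0.0.1`, so a listener bound to another interface would be missed. `port_in_use` is a hypothetical helper, not how doctor itself detects collisions.

```bash
# True when a TCP connection to 127.0.0.1:<port> succeeds (bash-only probe).
# The connection is opened in a subshell, so it is closed again immediately.
port_in_use() {
  local port="${1:-18789}"
  (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null
}

# Usage:
#   if port_in_use 18789; then echo "port 18789 is taken"; fi
```

If the port is taken, `lsof -nP -iTCP:18789 -sTCP:LISTEN` (where `lsof` is installed) shows which process is listening.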
### 17) Gateway runtime best practices
Doctor warns when the gateway service runs on Bun or a version-managed Node path
(`nvm`, `fnm`, `volta`, `asdf`, etc.). WhatsApp + Telegram channels require Node,
and version-manager paths can break after upgrades because the service does not
load your shell init. Doctor offers to migrate to a system Node install when
available (Homebrew/apt/choco).
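
As an illustration of the kind of path heuristic involved, the sketch below classifies a runtime path by the directory names that common version managers install under. The patterns and the `node_path_kind` name are assumptions for this example, not doctor's actual detection list.

```bash
# Heuristic classification of a runtime binary path. Patterns are illustrative
# assumptions about where nvm, fnm, volta, and asdf typically install.
node_path_kind() {
  case "$1" in
    */.nvm/*|*/.fnm/*|*/fnm/*|*/.volta/*|*/.asdf/*) echo "version-managed" ;;
    */bun) echo "bun" ;;
    *) echo "system" ;;
  esac
}

# Usage:
#   node_path_kind "$(command -v node)"
```

A service unit that points at a "version-managed" path depends on a directory the manager may move or delete on upgrade, which is the failure mode described above.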
### 18) Config write + wizard metadata
Doctor persists any config changes and stamps wizard metadata to record the
doctor run.
### 19) Workspace tips (backup + memory system)
Doctor suggests a workspace memory system when missing and prints a backup tip
if the workspace is not already under git.
See [/concepts/agent-workspace](/concepts/agent-workspace) for a full guide to
workspace structure and git backup (recommended private GitHub or GitLab).