---
summary: "Runbook for the Gateway daemon, lifecycle, and operations"
read_when:
- Running or debugging the gateway process
---
# Gateway (daemon) runbook
Last updated: 2025-12-09
## What it is
- The always-on process that owns the single Baileys/Telegram connection and the control/event plane.
- Replaces the legacy `gateway` command. CLI entry point: `clawdbot gateway`.
- Runs until stopped; exits non-zero on fatal errors so the supervisor restarts it.
## How to run (local)
```bash
pnpm clawdbot gateway --port 18789
# for full debug/trace logs in stdio:
pnpm clawdbot gateway --port 18789 --verbose
# if the port is busy, terminate listeners then start:
pnpm clawdbot gateway --force
# dev loop (auto-reload on TS changes):
pnpm gateway:watch
```
- Config hot reload watches `~/.clawdbot/clawdbot.json` (or `CLAWDBOT_CONFIG_PATH`).
  - Default mode: `gateway.reload.mode="hybrid"` (hot-apply safe changes, restart on critical).
  - Hot reload uses in-process restart via **SIGUSR1** when needed.
  - Disable with `gateway.reload.mode="off"`.
- Binds WebSocket control plane to `127.0.0.1:<port>` (default 18789).
- The same port also serves HTTP (control UI, hooks, A2UI). Single-port multiplex.
- Starts a Canvas file server by default on `canvasHost.port` (default `18793`), serving `http://<gateway-host>:18793/__clawdbot__/canvas/` from `~/clawd/canvas`. Disable with `canvasHost.enabled=false` or `CLAWDBOT_SKIP_CANVAS_HOST=1`.
- Logs to stdout; use launchd/systemd to keep it alive and rotate logs.
- Pass `--verbose` to mirror debug logging (handshakes, req/res, events) from the log file into stdio when troubleshooting.
- `--force` uses `lsof` to find listeners on the chosen port, sends SIGTERM, logs what it killed, then starts the gateway (fails fast if `lsof` is missing).
- If you run under a supervisor (launchd/systemd/mac app child-process mode), a stop/restart typically sends **SIGTERM**; older builds may surface this as `pnpm` `ELIFECYCLE` exit code **143** (SIGTERM), which is a normal shutdown, not a crash.
- **SIGUSR1** triggers an in-process restart (no external supervisor required). This is what the `gateway` agent tool uses.
- Optional shared secret: pass `--token <value>` or set `CLAWDBOT_GATEWAY_TOKEN` to require clients to send `connect.params.auth.token`.
- Port precedence: `--port` > `CLAWDBOT_GATEWAY_PORT` > `gateway.port` > default `18789`.
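
The port precedence above can be sketched as a small resolver. This is an illustrative TypeScript sketch, not the gateway's actual code: `resolveGatewayPort`, `parsePort`, and `PortSources` are hypothetical names, and skipping an invalid value in favor of the next source is an assumption (the real CLI may reject it instead).

```typescript
// Hypothetical helper; names are illustrative, not Clawdbot's actual API.
const DEFAULT_GATEWAY_PORT = 18789;

interface PortSources {
  cliPort?: number;                          // value of --port, if given
  env?: Record<string, string | undefined>;  // e.g. process.env
  configPort?: number;                       // gateway.port from the config file
}

// Accept only integers in the valid TCP port range; everything else is ignored.
function parsePort(value: unknown): number | undefined {
  const n = typeof value === "string" ? Number(value) : value;
  return typeof n === "number" && Number.isInteger(n) && n > 0 && n <= 65535
    ? n
    : undefined;
}

// --port > CLAWDBOT_GATEWAY_PORT > gateway.port > default 18789
function resolveGatewayPort(src: PortSources): number {
  return (
    parsePort(src.cliPort) ??
    parsePort(src.env?.CLAWDBOT_GATEWAY_PORT) ??
    parsePort(src.configPort) ??
    DEFAULT_GATEWAY_PORT
  );
}
```

For example, `resolveGatewayPort({ env: { CLAWDBOT_GATEWAY_PORT: "19500" }, configPort: 19900 })` picks the environment value because no `--port` was given.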
## Remote access
- Tailscale/VPN preferred; otherwise SSH tunnel:
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- Clients then connect to `ws://127.0.0.1:18789` through the tunnel.
- If a token is configured, clients must include it in `connect.params.auth.token` even over the tunnel.
## Multiple gateways (same host)
Supported if you isolate state + config and use unique ports.
### Dev profile (`--dev`)
Fast path: run a fully-isolated dev instance (config/state/workspace) without touching your primary setup.
```bash
clawdbot --dev setup
clawdbot --dev gateway --allow-unconfigured
# then target the dev instance:
clawdbot --dev status
clawdbot --dev health
```
Defaults (can be overridden via env/flags/config):
- `CLAWDBOT_STATE_DIR=~/.clawdbot-dev`
- `CLAWDBOT_CONFIG_PATH=~/.clawdbot-dev/clawdbot.json`
- `CLAWDBOT_GATEWAY_PORT=19001` (Gateway WS + HTTP)
- `bridge.port=19002` (derived: `gateway.port+1`)
- `browser.controlUrl=http://127.0.0.1:19003` (derived: `gateway.port+2`)
- `canvasHost.port=19005` (derived: `gateway.port+4`)
- `agent.workspace` default becomes `~/clawd-dev` when you run `setup`/`onboard` under `--dev`.
Derived ports (rules of thumb):
- Base port = `gateway.port` (or `CLAWDBOT_GATEWAY_PORT` / `--port`)
- `bridge.port = base + 1` (or `CLAWDBOT_BRIDGE_PORT` / config override)
- `browser.controlUrl port = base + 2` (or `CLAWDBOT_BROWSER_CONTROL_URL` / config override)
- `canvasHost.port = base + 4` (or `CLAWDBOT_CANVAS_HOST_PORT` / config override)
- Browser profile CDP ports auto-allocate from `browser.controlPort + 9 .. + 108` (persisted per profile).
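
The derived-port rules can be checked with a few lines of arithmetic. The sketch below is illustrative only: `derivePorts` and `portsOverlap` are hypothetical helpers, it assumes `browser.controlPort` is the port of `browser.controlUrl`, and it ignores env/config overrides.

```typescript
// Hypothetical helpers mirroring the rules of thumb above (no overrides applied).
interface DerivedPorts {
  gateway: number;            // base
  bridge: number;             // base + 1
  browserControl: number;     // base + 2
  canvasHost: number;         // base + 4
  cdpRange: [number, number]; // browserControl + 9 .. browserControl + 108
}

function derivePorts(base: number): DerivedPorts {
  const browserControl = base + 2;
  return {
    gateway: base,
    bridge: base + 1,
    browserControl,
    canvasHost: base + 4,
    cdpRange: [browserControl + 9, browserControl + 108],
  };
}

// Conservative check: treats everything from the base port up to the end of
// the CDP range as reserved by one instance.
function portsOverlap(a: DerivedPorts, b: DerivedPorts): boolean {
  const aLo = a.gateway, aHi = a.cdpRange[1];
  const bLo = b.gateway, bHi = b.cdpRange[1];
  return aLo <= bHi && bLo <= aHi;
}
```

With a base of 19001 this yields the dev-profile defaults (19002, 19003, 19005). Under these rules, two instances whose base ports are adjacent would overlap unless the derived ports are overridden, so leaving a wide gap between base ports is the simpler option.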
Checklist per instance:
- unique `gateway.port`
- unique `CLAWDBOT_CONFIG_PATH`
- unique `CLAWDBOT_STATE_DIR`
- unique `agent.workspace`
- separate WhatsApp numbers (if using WA)
Example:
```bash
CLAWDBOT_CONFIG_PATH=~/.clawdbot/a.json CLAWDBOT_STATE_DIR=~/.clawdbot-a clawdbot gateway --port 19001
CLAWDBOT_CONFIG_PATH=~/.clawdbot/b.json CLAWDBOT_STATE_DIR=~/.clawdbot-b clawdbot gateway --port 19002
```
## Protocol (operator view)
- Mandatory first frame from client: `req {type:"req", id, method:"connect", params:{minProtocol,maxProtocol,client:{name,version,platform,deviceFamily?,modelIdentifier?,mode,instanceId}, caps, auth?, locale?, userAgent? } }`.
- Gateway replies `res {type:"res", id, ok:true, payload:hello-ok }` (or `ok:false` with an error, then closes).
- After handshake:
  - Requests: `{type:"req", id, method, params}` → `{type:"res", id, ok, payload|error}`
  - Events: `{type:"event", event, payload, seq?, stateVersion?}`
- Structured presence entries: `{host, ip, version, platform?, deviceFamily?, modelIdentifier?, mode, lastInputSeconds?, ts, reason?, tags?[], instanceId? }`.
- `agent` responses are two-stage: first `res` ack `{runId,status:"accepted"}`, then a final `res` `{runId,status:"ok"|"error",summary}` after the run finishes; streamed output arrives as `event:"agent"`.
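
To make the frame shapes concrete, here is a minimal TypeScript sketch of building the `connect` frame and classifying inbound frames. It is not the generated protocol code: `buildConnectFrame` and `parseFrame` are hypothetical names, the field types (string `id`, numeric `seq`, array `caps`) are assumptions, and every client value is a placeholder. The authoritative shapes are the schemas generated from `src/gateway/protocol/*.ts`.

```typescript
// Frame shapes follow the operator view above; field types are assumptions.
type ReqFrame = { type: "req"; id: string; method: string; params?: unknown };
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number; stateVersion?: unknown };
type Frame = ReqFrame | ResFrame | EventFrame;

// Build the mandatory first frame. All client values here are placeholders.
function buildConnectFrame(id: string, token?: string): ReqFrame {
  return {
    type: "req",
    id,
    method: "connect",
    params: {
      minProtocol: 1, // placeholder protocol range
      maxProtocol: 1,
      client: {
        name: "example-client",
        version: "0.0.0",
        platform: "node",
        mode: "cli", // placeholder; valid modes are not listed in this runbook
        instanceId: "example-instance-1",
      },
      caps: [],
      // Only include auth when a gateway token is configured.
      ...(token ? { auth: { token } } : {}),
    },
  };
}

// Classify an inbound text frame; returns undefined for anything unrecognized.
function parseFrame(raw: string): Frame | undefined {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof value !== "object" || value === null) return undefined;
  const f = value as Record<string, unknown>;
  if (f.type === "req" && typeof f.id === "string" && typeof f.method === "string") return value as ReqFrame;
  if (f.type === "res" && typeof f.id === "string" && typeof f.ok === "boolean") return value as ResFrame;
  if (f.type === "event" && typeof f.event === "string") return value as EventFrame;
  return undefined;
}
```

A client would send `JSON.stringify(buildConnectFrame(id, token))` as its first WebSocket message and pass each incoming message through `parseFrame` before dispatching on `type`.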
## Methods (initial set)
- `health` — full health snapshot (same shape as `clawdbot health --json`).
- `status` — short summary.
- `system-presence` — current presence list.
- `system-event` — post a presence/system note (structured).
- `send` — send a message via the active provider(s).
- `agent` — run an agent turn (streams events back on same connection).
- `node.list` — list paired + currently-connected bridge nodes (includes `caps`, `deviceFamily`, `modelIdentifier`, `paired`, `connected`, and advertised `commands`).
- `node.describe` — describe a node (capabilities + supported `node.invoke` commands; works for paired nodes and for currently-connected unpaired nodes).
- `node.invoke` — invoke a command on a node (e.g. `canvas.*`, `camera.*`).
- `node.pair.*` — pairing lifecycle (`request`, `list`, `approve`, `reject`, `verify`).
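
Because `agent` answers twice (an `accepted` ack, then a final `ok`/`error`), a client should not treat the first `res` as completion. The following is a minimal sketch of that bookkeeping; `AgentRunTracker` is a hypothetical class, and keying runs by `runId` is an assumption about how a client might correlate the two responses.

```typescript
// Payload shape taken from the two-stage description; the tracker is hypothetical.
type AgentStatus = "accepted" | "ok" | "error";
interface AgentResPayload { runId: string; status: AgentStatus; summary?: string }

type RunState =
  | { phase: "accepted" }
  | { phase: "finished"; ok: boolean; summary?: string };

class AgentRunTracker {
  private runs = new Map<string, RunState>();

  // Feed every `res` payload for an `agent` request.
  // Returns the final state once the run has finished, otherwise undefined.
  onResponse(p: AgentResPayload): RunState | undefined {
    if (p.status === "accepted") {
      if (!this.runs.has(p.runId)) this.runs.set(p.runId, { phase: "accepted" });
      return undefined; // ack only, keep waiting for the final res
    }
    const done: RunState = { phase: "finished", ok: p.status === "ok", summary: p.summary };
    this.runs.set(p.runId, done);
    return done;
  }

  isPending(runId: string): boolean {
    return this.runs.get(runId)?.phase === "accepted";
  }
}
```

Streamed `event:"agent"` frames can be rendered while `isPending(runId)` is true; the final `res` is what closes the run.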

See also: [`docs/presence.md`](https://docs.clawd.bot/presence) for how presence is produced/deduped and why `instanceId` matters.
## Events
- `agent` — streamed tool/output events from the agent run (seq-tagged).
- `presence` — presence updates (deltas with stateVersion) pushed to all connected clients.
- `tick` — periodic keepalive/no-op to confirm liveness.
- `shutdown` — Gateway is exiting; payload includes `reason` and optional `restartExpectedMs`. Clients should reconnect.
## WebChat integration
- WebChat is a native SwiftUI UI that talks directly to the Gateway WebSocket for history, sends, abort, and events.
- Remote use goes through the same SSH/Tailscale tunnel; if a gateway token is configured, the client includes it during `connect` .
- macOS app connects via a single WS (shared connection); it hydrates presence from the initial snapshot and listens for `presence` events to update the UI.
## Typing and validation
- Server validates every inbound frame with AJV against JSON Schema emitted from the protocol definitions.
- Clients (TS/Swift) consume generated types (TS directly; Swift via the repo's generator).
- Types live in `src/gateway/protocol/*.ts`; regenerate schemas/models with `pnpm protocol:gen` (writes `dist/protocol.schema.json`) and `pnpm protocol:gen:swift` (writes `apps/macos/Sources/ClawdbotProtocol/GatewayModels.swift`).
## Connection snapshot
- `hello-ok` includes a `snapshot` with `presence`, `health`, `stateVersion`, and `uptimeMs` plus `policy {maxPayload,maxBufferedBytes,tickIntervalMs}` so clients can render immediately without extra requests.
- `health`/`system-presence` remain available for manual refresh, but are not required at connect time.
## Error codes (res.error shape)
- Errors use `{ code, message, details?, retryable?, retryAfterMs? }`.
- Standard codes:
  - `NOT_LINKED` — WhatsApp not authenticated.
  - `AGENT_TIMEOUT` — agent did not respond within the configured deadline.
  - `INVALID_REQUEST` — schema/param validation failed.
  - `UNAVAILABLE` — Gateway is shutting down or a dependency is unavailable.
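
One way a client might act on `retryable` and `retryAfterMs` is sketched below. `retryDelayMs` is a hypothetical helper; treating a missing `retryable` as "do not retry" and falling back to capped exponential backoff are assumptions, not documented gateway behavior.

```typescript
// Error shape from res.error; the retry policy itself is an assumption.
interface GatewayError {
  code: string;
  message: string;
  details?: unknown;
  retryable?: boolean;
  retryAfterMs?: number;
}

// Returns how long to wait before retrying, or undefined when the call
// should not be retried. `attempt` is zero-based.
function retryDelayMs(
  err: GatewayError,
  attempt: number,
  baseMs = 500,
  maxMs = 30_000,
): number | undefined {
  if (!err.retryable) return undefined;
  // Prefer the server's hint when it provides one.
  if (typeof err.retryAfterMs === "number" && err.retryAfterMs >= 0) {
    return err.retryAfterMs;
  }
  // Otherwise use capped exponential backoff.
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

With this policy an `INVALID_REQUEST` without `retryable` is never retried, while an `UNAVAILABLE` error carrying `retryAfterMs` is retried after exactly that delay.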
## Keepalive behavior
- `tick` events (or WS ping/pong) are emitted periodically so clients know the Gateway is alive even when no traffic occurs.
- Send/agent acknowledgements remain separate responses; do not overload ticks for sends.
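
A client can turn ticks into a liveness watchdog using `policy.tickIntervalMs` from `hello-ok`. The sketch below is illustrative: `isGatewayStale` is a hypothetical helper, and the tolerance of two missed intervals is an arbitrary assumption.

```typescript
// Hypothetical watchdog check. Any inbound frame (tick, event, or res) should
// update lastFrameAtMs; tickIntervalMs comes from hello-ok's policy.
function isGatewayStale(
  lastFrameAtMs: number,
  nowMs: number,
  tickIntervalMs: number,
  missedTicksAllowed = 2, // assumed tolerance, not a documented value
): boolean {
  return nowMs - lastFrameAtMs > tickIntervalMs * missedTicksAllowed;
}
```

When the check returns true, closing the socket and reconnecting (which yields a fresh snapshot in `hello-ok`) is a reasonable reaction.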
## Replay / gaps
- Events are not replayed. Clients detect seq gaps and should refresh (`health` + `system-presence`) before continuing. WebChat and macOS clients now auto-refresh on gap.
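
Gap detection can be as small as remembering the last `seq` seen. This sketch assumes `seq` is a single counter that increases by one per event on a connection; if the gateway numbers events per stream (for example per agent run), one tracker per stream would be needed. `SeqTracker` is a hypothetical name.

```typescript
// Hypothetical gap detector; assumes seq increases by exactly 1 per event.
class SeqTracker {
  private last: number | undefined;

  // Returns true when a gap is detected and the client should refresh
  // (`health` + `system-presence`) before continuing.
  observe(seq: number | undefined): boolean {
    if (seq === undefined) return false; // events without seq are ignored
    const gap = this.last !== undefined && seq > this.last + 1;
    if (this.last === undefined || seq > this.last) this.last = seq;
    return gap;
  }
}
```

The first observed event only establishes a baseline; a fresh tracker should be created after every reconnect.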
## Supervision (macOS example)
- Use launchd to keep the daemon alive:
  - Program: path to `clawdbot`
  - Arguments: `gateway`
  - KeepAlive: true
  - StandardOut/Err: file paths or `syslog`
- On failure, launchd restarts; fatal misconfig should keep exiting so the operator notices.
- LaunchAgents are per-user and require a logged-in session; for headless setups use a custom LaunchDaemon (not shipped).
Bundled mac app:
- Clawdbot.app can bundle a bun-compiled gateway binary and install a per-user LaunchAgent labeled `com.clawdbot.gateway`.
- To stop it cleanly, use `clawdbot gateway stop` (or `launchctl bootout gui/$UID/com.clawdbot.gateway`).
- To restart, use `clawdbot gateway restart` (or `launchctl kickstart -k gui/$UID/com.clawdbot.gateway`).
## Supervision (systemd user unit)
Create `~/.config/systemd/user/clawdbot-gateway.service`:
```
[Unit]
Description=Clawdbot Gateway
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/local/bin/clawdbot gateway --port 18789
Restart=always
RestartSec=5
Environment=CLAWDBOT_GATEWAY_TOKEN=
WorkingDirectory=/home/youruser

[Install]
WantedBy=default.target
```
Enable lingering (required so the user service survives logout/idle):
```
sudo loginctl enable-linger youruser
```
Onboarding runs this on Linux (may prompt for sudo; writes `/var/lib/systemd/linger`).
Then enable the service:
```
systemctl --user enable --now clawdbot-gateway.service
```
**Alternative (system service)** - for always-on or multi-user servers, you can
install a systemd **system** unit instead of a user unit (no lingering needed).
Create `/etc/systemd/system/clawdbot-gateway.service` (copy the unit above,
switch to `WantedBy=multi-user.target`, set `User=` + `WorkingDirectory=`), then:
```
sudo systemctl daemon-reload
sudo systemctl enable --now clawdbot-gateway.service
```
## Supervision (Windows scheduled task)
- Onboarding installs a Scheduled Task named `Clawdbot Gateway` (runs on user logon).
- Requires a logged-in user session; for headless setups use a system service or a task configured to run without a logged-in user (not shipped).
## Operational checks
- Liveness: open WS and send `req:connect` → expect `res` with `payload.type="hello-ok"` (with snapshot).
- Readiness: call `health` → expect `ok: true` and `web.linked=true` .
- Debug: subscribe to `tick` and `presence` events; ensure `status` shows linked/auth age; presence entries show Gateway host and connected clients.
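
The liveness and readiness checks reduce to two predicates over `res` frames. This is a sketch under assumptions: `isLive` and `isReady` are hypothetical helpers, `ok` is read from the response envelope, and `web.linked` is assumed to sit at the top level of the `health` payload.

```typescript
// Minimal view of a `res` frame; payload field locations are assumptions.
interface ResLike { ok: boolean; payload?: unknown }

// Liveness: the reply to `connect` carries a hello-ok payload.
function isLive(connectRes: ResLike): boolean {
  const p = connectRes.payload as { type?: unknown } | undefined;
  return connectRes.ok === true && p?.type === "hello-ok";
}

// Readiness: `health` succeeded and reports the web provider as linked.
function isReady(healthRes: ResLike): boolean {
  const p = healthRes.payload as { web?: { linked?: unknown } } | undefined;
  return healthRes.ok === true && p?.web?.linked === true;
}
```

A monitoring probe could connect, assert `isLive` on the first response, call `health`, and alert when `isReady` stays false.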
## Safety guarantees
- One Gateway per host by default (additional instances require isolated state, config, and ports; see "Multiple gateways"); all sends/agent calls must go through it.
- No fallback to direct Baileys connections; if the Gateway is down, sends fail fast.
- Non-connect first frames or malformed JSON are rejected and the socket is closed.
- Graceful shutdown: emit `shutdown` event before closing; clients must handle close + reconnect.
## CLI helpers
- `clawdbot gateway health|status` — request health/status over the Gateway WS.
- `clawdbot gateway send --to <num> --message "hi" [--media-url ...]` — send via Gateway (idempotent).
- `clawdbot gateway agent --message "hi" [--to ...]` — run an agent turn (waits for final by default).
- `clawdbot gateway call <method> --params '{"k":"v"}'` — raw method invoker for debugging.
- `clawdbot gateway stop|restart` — stop/restart the supervised gateway service (launchd/systemd/schtasks).
- Gateway helper subcommands assume a running gateway on `--url`; they no longer auto-spawn one.
## Migration guidance
- Retire uses of the legacy `gateway` command and the legacy TCP control port.
- Update clients to speak the WS protocol with mandatory connect and structured presence.