---
summary: "RFC: Cron jobs + wakeups for Clawd/Clawdbot (main vs isolated sessions)"
read_when:
- Designing scheduled jobs, alarms, or wakeups
- Adding Gateway methods or CLI commands for automation
- Adjusting heartbeat behavior or session routing
---
# RFC: Cron jobs + wakeups for Clawd
Status: Draft
Last updated: 2025-12-13
## Context
Clawdbot already has:
- A **gateway heartbeat runner** that runs the agent with `HEARTBEAT` and suppresses `HEARTBEAT_OK` (`src/infra/heartbeat-runner.ts`).
- A lightweight, in-memory **system event queue** (`enqueueSystemEvent`) that is injected into the next **main session** turn (`drainSystemEvents` in `src/auto-reply/reply.ts`).
- A WebSocket **Gateway** daemon that is intended to be always-on ([`docs/gateway.md`](https://docs.clawd.bot/gateway)).

This RFC adds a small “cron job system” so Clawd can schedule future work and reliably wake itself up:
- **Delayed**: run on the *next* normal heartbeat tick
- **Immediate**: run *now* (trigger a heartbeat immediately)
- **Isolated jobs**: optionally run in their own session that does not pollute the main session and can run concurrently (within configured limits).
## Goals
- Provide a **persistent job store** and an **in-process scheduler** owned by the Gateway.
- Allow each job to target either:
  - `sessionTarget: "main"`: inject as `System:` lines and rely on the main heartbeat (or trigger it immediately).
  - `sessionTarget: "isolated"`: run an agent turn in a dedicated session key (job session), optionally delivering a message and/or posting a summary back to main.
- Expose a stable control surface:
  - **Gateway methods** (`cron.*`, `wake`) for programmatic usage (mac app, CLI, agents).
  - **CLI commands** (`clawdbot cron ...`) to add/remove/edit/list and to debug `run`.
- Produce clear, structured **logs** for job lifecycle and execution outcomes.
## Non-goals (v1)
- Multi-host distributed scheduling.
- Exactly-once semantics across crashes (we aim for “at-least-once with idempotency hooks”).
- A full Unix-cron parser as the only schedule format (we can support it, but v1 should not require complex cron features to be useful).
## Terminology
- **Wake**: a request to ensure the agent gets a turn soon (either right now or next heartbeat).
- **Main session**: the canonical session bucket (default key `"main"`) that receives `System:` events.
- **Isolated session**: a per-job session key (e.g. `cron:<jobId>`) with its own session id / session file.
## User stories
- “Remind me in 20 minutes” → add a one-shot job that triggers an immediate heartbeat at T+20m.
- “Every weekday at 7:30, wake me up and start music” → recurring job, isolated session, deliver to WhatsApp.
- “Every hour, check battery; only interrupt me if < 20%” → isolated job that decides whether to deliver; may also post a brief status to main.
- “Next heartbeat, please check calendar” → delayed wake targeting main session.
## Job model
### Storage schema (v1)
Each job is a JSON object with stable keys (unknown keys ignored for forward compatibility):
- `id: string` (UUID)
- `name: string` (required)
- `description?: string` (optional)
- `enabled: boolean`
- `createdAtMs: number`
- `updatedAtMs: number`
- `schedule` (one of)
  - `{"kind":"at","atMs":number}` (one-shot)
  - `{"kind":"every","everyMs":number,"anchorMs"?:number}` (simple interval)
  - `{"kind":"cron","expr":string,"tz"?:string}` (optional; see “Schedule parsing”)
- `sessionTarget: "main" | "isolated"`
- `wakeMode: "next-heartbeat" | "now"`
  - For `sessionTarget:"isolated"`, `wakeMode:"now"` means “run immediately when due”.
  - For `sessionTarget:"main"`, `wakeMode` controls whether we trigger the heartbeat immediately or just enqueue and wait.
- `payload` (one of)
  - `{"kind":"systemEvent","text":string}` (enqueue as `System:`)
  - `{"kind":"agentTurn","message":string,"deliver"?:boolean,"provider"?: "last"|"whatsapp"|"telegram"|"discord"|"signal"|"imessage","to"?:string,"timeoutSeconds"?:number}`
- `isolation` (optional; only meaningful for isolated jobs)
  - `{"postToMainPrefix"?: string}`
- `runtime` (optional)
  - `{"maxAttempts"?:number,"retryBackoffMs"?:number}` (best-effort retries; defaults off)
- `state` (runtime-maintained)
  - `{"nextRunAtMs":number,"lastRunAtMs"?:number,"lastStatus"?: "ok"|"error"|"skipped","lastError"?:string,"lastDurationMs"?:number}`
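
For readers who prefer types to prose, the field list above can be transcribed roughly as follows. This is only a sketch: `CronJob` is the name used by the Gateway API section, while `CronSchedule` and `CronPayload` are illustrative names that do not come from this RFC.

```typescript
// Sketch of the v1 job shape, transcribed from the field list above.
// CronSchedule / CronPayload are illustrative names, not part of the RFC.
type CronSchedule =
  | { kind: "at"; atMs: number }
  | { kind: "every"; everyMs: number; anchorMs?: number }
  | { kind: "cron"; expr: string; tz?: string };

type CronPayload =
  | { kind: "systemEvent"; text: string }
  | {
      kind: "agentTurn";
      message: string;
      deliver?: boolean;
      provider?: "last" | "whatsapp" | "telegram" | "discord" | "signal" | "imessage";
      to?: string;
      timeoutSeconds?: number;
    };

interface CronJob {
  id: string;
  name: string;
  description?: string;
  enabled: boolean;
  createdAtMs: number;
  updatedAtMs: number;
  schedule: CronSchedule;
  sessionTarget: "main" | "isolated";
  wakeMode: "next-heartbeat" | "now";
  payload: CronPayload;
  isolation?: { postToMainPrefix?: string };
  runtime?: { maxAttempts?: number; retryBackoffMs?: number };
  state: {
    nextRunAtMs: number;
    lastRunAtMs?: number;
    lastStatus?: "ok" | "error" | "skipped";
    lastError?: string;
    lastDurationMs?: number;
  };
}

// Example value: a one-shot reminder that targets the main session.
const reminder: CronJob = {
  id: "00000000-0000-4000-8000-000000000000", // placeholder UUID
  name: "Meeting reminder",
  enabled: true,
  createdAtMs: 0,
  updatedAtMs: 0,
  schedule: { kind: "at", atMs: 1_200_000 },
  sessionTarget: "main",
  wakeMode: "now",
  payload: { kind: "systemEvent", text: "Reminder: meeting starts soon." },
  state: { nextRunAtMs: 1_200_000 },
};
```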
### Key behavior
- `sessionTarget:"main"` jobs always enqueue `payload.kind:"systemEvent"` (directly or derived from `agentTurn` results; see below).
- `sessionTarget:"isolated"` jobs create/use a stable session key: `cron:<jobId>`.
## Storage location
Cron persists everything under `~/.clawdbot/cron/`:
- Job store: `~/.clawdbot/cron/jobs.json`
- Run history: `~/.clawdbot/cron/runs/<jobId>.jsonl`

You can override the job store path via `cron.store` in config.
The scheduler should never require additional configuration for the base directory (Clawdbot already treats `~/.clawdbot` as fixed).
## Enabling
Cron execution is enabled by default inside the Gateway.
To disable it, set:
```json5
{
  cron: {
    enabled: false,
    // optional:
    store: "~/.clawdbot/cron/jobs.json",
    maxConcurrentRuns: 1
  }
}
```
You can also disable scheduling via the environment variable `CLAWDBOT_SKIP_CRON=1`.
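
The two switches could be combined as in the sketch below. It assumes the environment variable takes precedence over config and that only the literal value `1` counts as set; neither detail is specified here, and `isCronEnabled` is a hypothetical helper name.

```typescript
// Hypothetical helper: decides whether the Gateway should start the scheduler.
// Assumptions (not stated in this RFC): the env var wins over config, and
// only the exact value "1" disables cron.
interface CronConfig {
  enabled?: boolean;
  store?: string;
  maxConcurrentRuns?: number;
}

function isCronEnabled(
  config: { cron?: CronConfig },
  env: Record<string, string | undefined>,
): boolean {
  if (env.CLAWDBOT_SKIP_CRON === "1") return false;
  // Enabled by default when the key is absent.
  return config.cron?.enabled ?? true;
}
```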
## Scheduler design
### Ownership
The Gateway owns:
- the scheduler timer,
- job store reads/writes,
- job execution (enqueue system events and/or agent turns).
This keeps scheduling unified with the always-on process and prevents “two schedulers” when multiple CLIs run.
### Timer strategy
- Maintain an in-memory heap/array of enabled jobs keyed by `state.nextRunAtMs`.
- Use a **single `setTimeout`** to wake at the earliest next run.
- On wake:
  - compute all due jobs (now >= nextRunAtMs),
  - mark them “in flight” (in memory),
  - persist updated `state` (at least bump `nextRunAtMs` / `lastRunAtMs`) before starting execution to minimize duplicate runs on crash,
  - execute jobs (with concurrency limits),
  - persist final `lastStatus/lastError/lastDurationMs`,
  - re-arm timer for the next earliest run.
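
A minimal sketch of the timer bookkeeping described above, using a plain array rather than a heap. The helper names (`collectDueJobs`, `nextWakeDelayMs`, `armTimer`) and the flattened `ScheduledJob` shape are illustrative; persistence and execution are left out.

```typescript
// Illustrative, flattened view of a job: only what the timer needs.
interface ScheduledJob {
  id: string;
  enabled: boolean;
  nextRunAtMs: number;
}

// setTimeout stores its delay as a signed 32-bit integer, so clamp long waits.
const MAX_TIMEOUT_MS = 2_147_483_647;

// Enabled jobs whose run time has passed, earliest first.
function collectDueJobs(jobs: ScheduledJob[], nowMs: number): ScheduledJob[] {
  return jobs
    .filter((job) => job.enabled && nowMs >= job.nextRunAtMs)
    .sort((a, b) => a.nextRunAtMs - b.nextRunAtMs);
}

// Delay until the earliest enabled job, or null when nothing is scheduled.
function nextWakeDelayMs(jobs: ScheduledJob[], nowMs: number): number | null {
  const enabled = jobs.filter((job) => job.enabled);
  if (enabled.length === 0) return null;
  const earliest = Math.min(...enabled.map((job) => job.nextRunAtMs));
  return Math.min(Math.max(0, earliest - nowMs), MAX_TIMEOUT_MS);
}

let timer: ReturnType<typeof setTimeout> | undefined;

// Re-arm the single timer; call after every store change and after each wake.
function armTimer(jobs: ScheduledJob[], onWake: () => void, nowMs: number = Date.now()): void {
  if (timer !== undefined) clearTimeout(timer);
  timer = undefined;
  const delay = nextWakeDelayMs(jobs, nowMs);
  if (delay !== null) timer = setTimeout(onWake, delay);
}
```

If the delay was clamped, the wake simply finds no due jobs and re-arms, so far-future jobs still fire at the right time.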
### Schedule parsing
V1 can ship with `at` + `every` without extra deps.
If we add `"kind":"cron"`:
- Use a well-maintained parser (we use `croner`) and support:
  - 5-field cron (`min hour dom mon dow`) at minimum
  - optional `tz`
- Store `nextRunAtMs` computed by the parser; re-compute after each run.
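
For the dependency-free `at` and `every` kinds, computing the next run could look like the sketch below. The anchor semantics are an assumption, since this RFC does not define them: runs are aligned to `anchorMs + k * everyMs`, the anchor defaults to "now", and a one-shot whose time has passed yields no further run. `computeNextRunAtMs` is a hypothetical name.

```typescript
type SimpleSchedule =
  | { kind: "at"; atMs: number }
  | { kind: "every"; everyMs: number; anchorMs?: number };

// Returns the next run time strictly after nowMs, or undefined when a
// one-shot schedule has nothing left to run. Anchor handling is an assumption.
function computeNextRunAtMs(schedule: SimpleSchedule, nowMs: number): number | undefined {
  if (schedule.kind === "at") {
    return schedule.atMs > nowMs ? schedule.atMs : undefined;
  }
  const everyMs = Math.max(1, Math.floor(schedule.everyMs)); // guard against 0 or negative
  const anchorMs = schedule.anchorMs ?? nowMs;
  if (nowMs < anchorMs) return anchorMs;
  // Number of whole intervals already elapsed, plus one for the next slot.
  const steps = Math.floor((nowMs - anchorMs) / everyMs) + 1;
  return anchorMs + steps * everyMs;
}
```

Aligning to the anchor rather than to "last run + interval" keeps a slow job from pushing every later run back.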
## Execution semantics
### Main session jobs
Main session jobs do not run the agent directly by default.
When due:
1) `enqueueSystemEvent(job.payload.text)` (or a derived message)
2) If `wakeMode:"now"`, trigger an immediate heartbeat run (see “Heartbeat wake hook”).
3) Otherwise do nothing else (the next scheduled heartbeat will pick up the system event).

Why: This keeps the main session’s “proactive” behavior centralized in the heartbeat rules and avoids ad-hoc agent turns that might fight with inbound message processing.
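
The three steps are small enough to sketch directly. The dependencies are injected so the snippet stands alone; `runMainJob`, the `MainJobDeps` shape, and the `"cron"` reason string are illustrative assumptions rather than the actual signatures in the codebase.

```typescript
interface MainJob {
  wakeMode: "next-heartbeat" | "now";
  payload: { kind: "systemEvent"; text: string };
}

// Injected so the sketch does not depend on real Gateway modules.
interface MainJobDeps {
  enqueueSystemEvent: (text: string) => void;
  requestHeartbeatNow: (opts: { reason: string }) => void;
}

function runMainJob(job: MainJob, deps: MainJobDeps): void {
  // 1) Always enqueue; the heartbeat (immediate or scheduled) drains the queue.
  deps.enqueueSystemEvent(job.payload.text);
  // 2) Only wake the heartbeat when asked; 3) otherwise wait for the next tick.
  if (job.wakeMode === "now") {
    deps.requestHeartbeatNow({ reason: "cron" }); // reason value is an assumption
  }
}
```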
### Isolated session jobs
Isolated jobs run an agent turn in a dedicated session key, intended to be separate from main.
When due:
- Build a message body that includes schedule metadata, e.g.:
  - `"[cron:<jobId>] <job.name>: <payload.message>"`
- Execute via the same agent runner path as other command-mode runs, but pinned to:
  - `sessionKey = cron:<jobId>`
  - `sessionId = store[sessionKey].sessionId` (create if missing)
- Optionally deliver output (`payload.deliver === true`) to the configured provider/to.
- Isolated jobs always enqueue a summary system event to the main session when they finish (derived from the last agent text output).
  - Prefix defaults to `Cron`, and can be customized via `isolation.postToMainPrefix`.
- If `deliver` is omitted/false, nothing is sent to external providers; you still get the main-session summary and can inspect the full isolated transcript in `cron:<jobId>`.
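
The string-building parts of this flow can be sketched as pure helpers. The message body follows the example format above; the summary format (`<prefix>: <text>`) is an assumption, since only the prefix default is specified. All function names are hypothetical.

```typescript
interface IsolatedJob {
  id: string;
  name: string;
  payload: { message: string };
  isolation?: { postToMainPrefix?: string };
}

// Stable per-job session key, as described above.
function isolatedSessionKey(jobId: string): string {
  return `cron:${jobId}`;
}

// Message body with schedule metadata, following the example format.
function buildIsolatedMessage(job: IsolatedJob): string {
  return `[cron:${job.id}] ${job.name}: ${job.payload.message}`;
}

// Summary posted back to main. The "<prefix>: <text>" layout is an assumption.
function buildMainSummary(job: IsolatedJob, lastAgentText: string): string {
  const prefix = job.isolation?.postToMainPrefix ?? "Cron";
  return `${prefix}: ${lastAgentText.trim()}`;
}
```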
### “Run in parallel to main”
Clawdbot currently serializes command execution through a global in-process queue (`src/process/command-queue.ts`) to avoid collisions.
To support isolated cron jobs running “in parallel”, we should introduce **lanes** (keyed queues) plus a global concurrency cap:
- Lane `"main"`: inbound auto-replies + main heartbeat.
- Lane `"cron"` (or `cron:<jobId>`): isolated jobs.
- Configurable `cron.maxConcurrentRuns` (default 1 or 2).
This yields:
- isolated jobs can overlap with the main lane (up to cap),
- each lane still preserves ordering for its own work (optional),
- we retain safety knobs to prevent runaway resource contention.
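
One possible shape for lanes is sketched below: each lane is a FIFO that runs one task at a time, and a single cap bounds how many lanes run at once. This is an illustration only. `LaneScheduler` is a hypothetical class, not the existing command queue, and a real design might apply the cap to cron lanes only so that the main lane is never starved.

```typescript
type Task = () => Promise<void>;

// Hypothetical keyed queue: per-lane ordering plus a global concurrency cap.
class LaneScheduler {
  private readonly maxConcurrent: number;
  private readonly queues = new Map<string, Task[]>();
  private readonly busyLanes = new Set<string>();
  private running = 0;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = Math.max(1, maxConcurrent);
  }

  get runningCount(): number {
    return this.running;
  }

  enqueue(lane: string, task: Task): void {
    const queue = this.queues.get(lane) ?? [];
    queue.push(task);
    this.queues.set(lane, queue);
    this.pump();
  }

  // Start as many queued tasks as the cap and per-lane ordering allow.
  private pump(): void {
    this.queues.forEach((queue, lane) => {
      if (this.running >= this.maxConcurrent) return;
      if (this.busyLanes.has(lane)) return;
      const task = queue.shift();
      if (!task) return;
      if (queue.length === 0) this.queues.delete(lane);
      this.busyLanes.add(lane);
      this.running += 1;
      void this.runTask(lane, task);
    });
  }

  private async runTask(lane: string, task: Task): Promise<void> {
    try {
      await task();
    } catch {
      // A real implementation would log the failure; the lane must still be released.
    } finally {
      this.busyLanes.delete(lane);
      this.running -= 1;
      this.pump();
    }
  }
}
```

With a cap of 2, a long main-lane turn and one isolated job can overlap, while a second isolated job waits for a free slot.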
## Heartbeat wake hook (immediate vs next heartbeat)
We need a way for the Gateway (or the scheduler) to request an immediate heartbeat without duplicating heartbeat logic.
Design:
- `startHeartbeatRunner` owns the real heartbeat execution and installs a wake handler.
- Wake hook lives in `src/infra/heartbeat-wake.ts`:
  - `setHeartbeatWakeHandler(fn | null)` installed by the heartbeat runner
  - `requestHeartbeatNow({ reason, coalesceMs? })`
- If the handler is absent, the request is stored as “pending”; the next time the handler is installed, it runs once.
- Coalesce rapid calls and respect the “skip when queue busy” behavior (retry soon vs dropping).
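
The pending behavior can be illustrated with a few lines of module-level state. The two function names come from the design above, but the handler signature is an assumption, and this sketch deliberately leaves out `coalesceMs` and the busy-queue retry, which would need a timer.

```typescript
// Assumed handler signature; the real one may differ.
type WakeHandler = (opts: { reason: string }) => void;

let handler: WakeHandler | null = null;
let pendingReason: string | null = null;

// Installed (or cleared with null) by the heartbeat runner.
function setHeartbeatWakeHandler(fn: WakeHandler | null): void {
  handler = fn;
  if (handler && pendingReason !== null) {
    // Flush the stored request exactly once, however many calls were made.
    const reason = pendingReason;
    pendingReason = null;
    handler({ reason });
  }
}

// Called by the scheduler or the `wake` method.
function requestHeartbeatNow(opts: { reason: string }): void {
  if (!handler) {
    pendingReason = opts.reason; // later requests overwrite earlier ones
    return;
  }
  handler(opts);
}
```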
## Run history log (JSONL)
In addition to normal structured logs, the Gateway writes an append-only run history “ledger” (JSONL) whenever a job finishes. This is intended for quick debugging (“did the job run, when, and what happened?”).
Path rules:
- Run logs are stored per job next to the store: `.../runs/<jobId>.jsonl`.

Retention:
- Best-effort pruning when the file grows beyond ~2MB; keep the newest ~2000 lines.

Each log line includes (at minimum) job id, status/error, timing, and a `summary` string (systemEvent text for main jobs, and the last agent text output for isolated jobs).
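
The retention rule can be sketched as a pure function over the file contents. `pruneRunLog` is a hypothetical helper; it approximates the file size by string length (UTF-16 code units), which is close to the byte size only for mostly-ASCII logs, and it leaves reading and writing the file to the caller.

```typescript
// Hypothetical helper: returns the content unchanged while it is small,
// otherwise keeps only the newest `keepLines` non-empty JSONL lines.
function pruneRunLog(content: string, maxSize: number = 2_000_000, keepLines: number = 2_000): string {
  // String length stands in for byte size here (an approximation).
  if (content.length <= maxSize) return content;
  const lines = content.split("\n").filter((line) => line.trim() !== "");
  const kept = lines.slice(-keepLines);
  return kept.length > 0 ? kept.join("\n") + "\n" : "";
}
```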
## Compatibility policy (cron.add/cron.update)
To keep older clients working, the Gateway applies **best-effort normalization** for `cron.add` and `cron.update`:
- Accepts wrapped payloads under `data` or `job` and unwraps them.
- Infers `schedule.kind` from `atMs`, `everyMs`, or `expr` if missing.
- Infers `payload.kind` from `text` (systemEvent) or `message` (agentTurn) if missing.
- Defaults `wakeMode` to `"next-heartbeat"` when omitted.
- Defaults `sessionTarget` based on payload kind (`systemEvent` → `"main"`, `agentTurn` → `"isolated"`).

Normalization is **compat-only**. New clients should send the full schema (including `kind`, `sessionTarget`, and `wakeMode`) to avoid ambiguity. Unknown fields are still rejected by schema validation.
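
The normalization rules listed above could be applied roughly as follows before schema validation. This is a sketch over untyped input: `normalizeCronJobInput` is a hypothetical name, and how the real Gateway orders or combines these steps is not specified here.

```typescript
type Raw = Record<string, unknown>;

function isRecord(value: unknown): value is Raw {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical normalizer applying the compat rules in order.
function normalizeCronJobInput(input: Raw): Raw {
  // Unwrap `{ data: {...} }` or `{ job: {...} }`.
  let source: Raw = input;
  if (isRecord(input.data)) source = input.data;
  else if (isRecord(input.job)) source = input.job;
  const job: Raw = { ...source };

  const schedule = job.schedule;
  if (isRecord(schedule) && schedule.kind === undefined) {
    const kind =
      "atMs" in schedule ? "at" : "everyMs" in schedule ? "every" : "expr" in schedule ? "cron" : undefined;
    if (kind) job.schedule = { ...schedule, kind };
  }

  let payload = job.payload;
  if (isRecord(payload) && payload.kind === undefined) {
    const kind = "text" in payload ? "systemEvent" : "message" in payload ? "agentTurn" : undefined;
    if (kind) {
      payload = { ...payload, kind };
      job.payload = payload;
    }
  }

  if (job.wakeMode === undefined) job.wakeMode = "next-heartbeat";

  if (job.sessionTarget === undefined && isRecord(payload)) {
    if (payload.kind === "systemEvent") job.sessionTarget = "main";
    else if (payload.kind === "agentTurn") job.sessionTarget = "isolated";
  }
  return job;
}
```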
## Gateway API
New methods (names can be bikeshed; `cron.*` is suggested):
- `wake`
  - params: `{ mode: "now" | "next-heartbeat", text: string }`
  - effect: `enqueueSystemEvent(text)`, plus optional immediate heartbeat trigger
- `cron.list`
  - params: optional `{ includeDisabled?: boolean }`
  - returns: `{ jobs: CronJob[] }`
- `cron.add`
  - params: job payload without `id/state` (server generates and returns created job)
- `cron.update`
  - params: `{ id: string, patch: Partial<CronJobWritableFields> }`
- `cron.remove`
  - params: `{ id: string }`
- `cron.run`
  - params: `{ id: string, mode?: "due" | "force" }` (debugging; does not change schedule unless `force` requires it)
- `cron.runs`
  - params: `{ id: string, limit?: number }`
  - returns: `{ entries: CronRunLogEntry[] }`
  - note: `id` is required (runs are stored per-job).

The Gateway should broadcast a `cron` event for UI/debug:
- event: `cron`
- payload: `{ jobId, action: "added"|"updated"|"removed"|"started"|"finished", status?, error?, nextRunAtMs? }`
## CLI surface
Add a `cron` command group (all commands should also support `--json` where sensible):
- `clawdbot cron list [--json] [--all]`
- `clawdbot cron add ...`
  - schedule flags:
    - `--at <iso8601|ms|relative>` (one-shot)
    - `--every <duration>` (e.g. `10m`, `1h`)
    - `--cron "<expr>" [--tz "<tz>"]`
  - target flags:
    - `--session main|isolated`
    - `--wake now|next-heartbeat`
  - payload flags (choose one):
    - `--system-event "<text>"`
    - `--message "<agent message>" [--deliver] [--provider last|whatsapp|telegram|discord|slack|signal|imessage] [--to <dest>]`
- `clawdbot cron edit <id> ...` (patch-by-flags, non-interactive)
- `clawdbot cron rm <id>`
- `clawdbot cron enable <id>` / `clawdbot cron disable <id>`
- `clawdbot cron run <id> [--force]` (debug)
- `clawdbot cron runs --id <id> [--limit <n>]` (run history)
- `clawdbot cron status` (scheduler enabled + next wake)

Additionally:
- `clawdbot wake --mode now|next-heartbeat --text "<text>"` as a thin wrapper around `wake` for agents to call.
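
The `--every <duration>` flag needs a small parser to turn values such as `10m` or `1h` into `everyMs`. The sketch below is one way to do it. Only `10m` and `1h` appear in this RFC, so the accepted unit set (`ms`, `s`, `m`, `h`, `d`) and the name `parseDurationMs` are assumptions.

```typescript
// Assumed unit table; only "m" and "h" are shown in the RFC's examples.
const UNIT_MS: Record<string, number> = {
  ms: 1,
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Parses "<number><unit>" into milliseconds; throws on anything else.
function parseDurationMs(input: string): number {
  const match = /^(\d+(?:\.\d+)?)(ms|s|m|h|d)$/.exec(input.trim().toLowerCase());
  if (!match) throw new Error(`invalid duration: ${input}`);
  const ms = Number(match[1]) * UNIT_MS[match[2]];
  if (!Number.isFinite(ms) || ms <= 0) throw new Error(`invalid duration: ${input}`);
  return Math.round(ms);
}
```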
## Examples
### Run once at a specific time
One-shot reminder that targets the main session and triggers a heartbeat immediately at the scheduled time:
```bash
clawdbot cron add \
  --at "2025-12-14T07:00:00-08:00" \
  --session main \
  --wake now \
  --system-event "Alarm: wake up (meeting in 30 minutes)."
```
### Run daily (calendar-accurate)
Daily at 07:00 in a specific timezone (preferred over “every 24h” to avoid DST drift):
```bash
clawdbot cron add \
  --cron "0 7 * * *" \
  --tz "America/Los_Angeles" \
  --session isolated \
  --wake now \
  --message "Daily check: scan calendar + inbox; deliver only if urgent." \
  --deliver \
  --provider last
```
### Run weekly (every Wednesday)
Every Wednesday at 09:00:
```bash
clawdbot cron add \
  --cron "0 9 * * 3" \
  --tz "America/Los_Angeles" \
  --session isolated \
  --wake now \
  --message "Weekly: summarize status and remind me of goals." \
  --deliver \
  --provider last
```
### “Next heartbeat”
Enqueue a note for the main session but let the existing heartbeat cadence pick it up:
```bash
clawdbot wake --mode next-heartbeat --text "Next heartbeat: check battery + upcoming meetings."
```
## Logging & observability
Logging requirements:
- Use `getChildLogger({ module: "cron", jobId, runId, name })` for every run.
- Log lifecycle:
  - store load/save (debug; include job count)
  - schedule recompute (debug; include nextRunAt)
  - job start/end (info)
  - job skipped (info; include reason)
  - job error (warn; include error + stack where available)
- Emit a concise user-facing line to stdout when running in CLI mode (similar to heartbeat logs).
Suggested log events:
- `cron: scheduler started` (jobCount, nextWakeAt)
- `cron: job started` (jobId, scheduleKind, sessionTarget, wakeMode)
- `cron: job finished` (status, durationMs, nextRunAtMs)
- When `cron.enabled` is false, the Gateway logs `cron: disabled` and jobs will not run automatically (the CLI warns on `cron add`/`cron edit`).
- Use `clawdbot cron status` to confirm the scheduler is enabled and see the next wake time.
## Safety & security
- Respect existing allowlists/routing rules: delivery defaults should not send to arbitrary destinations unless explicitly configured.
- Provide a global “kill switch”:
  - `cron.enabled: boolean` (default `true`).
  - A `set-heartbeats` Gateway method already exists; cron should have a similar one.
- Avoid persistence of sensitive payloads unless requested; job text may contain private content.
## Testing plan (v1)
- Unit tests:
  - schedule computation for `at` and `every`
  - job store read/write + migration behavior
  - lane concurrency: main vs cron overlap is bounded
  - “wake now” coalescing and pending behavior when provider not ready
- Integration tests:
  - start Gateway with `CLAWDBOT_SKIP_PROVIDERS=1`, add jobs, list/edit/remove
  - simulate due jobs and assert `enqueueSystemEvent` called + cron events broadcast
## Rollout plan
1) Add the `wake` primitive + heartbeat wake hook (no persistent jobs yet).
2) Add `cron.*` API and CLI wrappers with `at` + `every` .
3) Add optional cron expression parsing (`kind:"cron"`) if needed.
4) Add UI surfacing in WebChat/macOS app (optional).