summary: "RFC: Cron jobs + wakeups for Clawd/Clawdis (main vs isolated sessions)"
read_when:
- Designing scheduled jobs, alarms, or wakeups
- Adding Gateway methods or CLI commands for automation
- Adjusting heartbeat behavior or session routing
---
# RFC: Cron jobs + wakeups for Clawd
Status: Draft
Last updated: 2025-12-13
## Context
Clawdis already has:
- A **periodic reply heartbeat** that runs the agent with `HEARTBEAT /think:high` and suppresses `HEARTBEAT_OK` (`src/web/auto-reply.ts`).
- A lightweight, in-memory **system event queue** (`enqueueSystemEvent`) that is injected into the next **main session** turn (`drainSystemEvents` in `src/auto-reply/reply.ts`).
- A WebSocket **Gateway** daemon that is intended to be always-on (`docs/gateway.md`).
This RFC adds a small “cron job system” so Clawd can schedule future work and reliably wake itself up:
- **Delayed**: run on the *next* normal heartbeat tick
- **Immediate**: run *now* (trigger a heartbeat immediately)
- **Isolated jobs**: optionally run in their own session that does not pollute the main session and can run concurrently (within configured limits).
## Goals
- Provide a **persistent job store** and an **in-process scheduler** owned by the Gateway.
- Allow each job to target either:
  - `sessionTarget: "main"`: inject as `System:` lines and rely on the main heartbeat (or trigger it immediately).
  - `sessionTarget: "isolated"`: run an agent turn in a dedicated session key (job session), optionally delivering a message and/or posting a summary back to main.
- job execution (enqueue system events and/or agent turns).
This keeps scheduling unified with the always-on process and prevents “two schedulers” when multiple CLIs run.
### Timer strategy
- Maintain an in-memory heap/array of enabled jobs keyed by `state.nextRunAtMs`.
- Use a **single `setTimeout`** to wake at the earliest next run.
- On wake:
  - compute all due jobs (`now >= nextRunAtMs`),
  - mark them “in flight” (in memory),
  - persist updated `state` (at least bump `nextRunAtMs` / `lastRunAtMs`) before starting execution to minimize duplicate runs on crash,
  - execute jobs (with concurrency limits),
  - persist final `lastStatus/lastError/lastDurationMs`,
  - re-arm timer for the next earliest run.
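
The wake sequence above can be sketched roughly as follows. This is a minimal illustration, not the Gateway's actual code: `createScheduler`, `dueJobs`, `nextDelayMs`, the `everyMs` field, and the `persist`/`run` callbacks are hypothetical names, and the status values `"ok"`/`"error"` are assumed. Concurrency limits and error handling around `persist` are left out for brevity.

```typescript
// Hypothetical shapes: only state.nextRunAtMs, lastRunAtMs, lastStatus,
// lastError and lastDurationMs are named in this RFC; the rest is illustrative.
interface JobState {
  nextRunAtMs: number;
  lastRunAtMs?: number;
  lastStatus?: "ok" | "error";
  lastError?: string;
  lastDurationMs?: number;
}
interface Job { id: string; enabled: boolean; everyMs: number; state: JobState }
interface SchedulerDeps {
  persist: (jobs: Job[]) => Promise<void>;
  run: (job: Job) => Promise<void>;
  now?: () => number;
}

// Node treats delays above 2^31-1 ms as 1 ms, so long waits are clamped.
const MAX_TIMEOUT_MS = 2 ** 31 - 1;

/** Enabled jobs that are due at `nowMs` (now >= nextRunAtMs). */
function dueJobs(jobs: Job[], nowMs: number): Job[] {
  return jobs.filter((j) => j.enabled && nowMs >= j.state.nextRunAtMs);
}

/** Delay until the earliest enabled job, or undefined if nothing is scheduled. */
function nextDelayMs(jobs: Job[], nowMs: number): number | undefined {
  const enabled = jobs.filter((j) => j.enabled);
  if (enabled.length === 0) return undefined;
  const earliest = Math.min(...enabled.map((j) => j.state.nextRunAtMs));
  return Math.max(0, earliest - nowMs);
}

function createScheduler(jobs: Job[], deps: SchedulerDeps) {
  const now = deps.now ?? Date.now;
  const inFlight = new Set<string>();
  let timer: ReturnType<typeof setTimeout> | undefined;

  function stop(): void {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  }

  // Single timer: always re-armed for the earliest next run.
  function arm(): void {
    stop();
    const delay = nextDelayMs(jobs, now());
    if (delay === undefined) return;
    timer = setTimeout(() => void onWake(), Math.min(delay, MAX_TIMEOUT_MS));
  }

  async function onWake(): Promise<void> {
    const startedAt = now();
    const due = dueJobs(jobs, startedAt).filter((j) => !inFlight.has(j.id));
    for (const job of due) {
      inFlight.add(job.id);
      job.state.lastRunAtMs = startedAt;
      job.state.nextRunAtMs = startedAt + job.everyMs; // bump before running
    }
    await deps.persist(jobs); // persisted before execution to limit duplicate runs
    await Promise.all(due.map(async (job) => {
      try {
        await deps.run(job);
        job.state.lastStatus = "ok";
        delete job.state.lastError;
      } catch (err) {
        job.state.lastStatus = "error";
        job.state.lastError = String(err);
      } finally {
        job.state.lastDurationMs = now() - startedAt;
        inFlight.delete(job.id);
      }
    }));
    await deps.persist(jobs); // final status
    arm();
  }

  return { arm, onWake, stop };
}
```

Because the timer is clamped, a wake may find nothing due; in that case `onWake` simply persists unchanged state and re-arms.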
### Schedule parsing
V1 can ship with `at` + `every` without extra deps.
If we add `"kind":"cron"`:
- Use a well-maintained parser (we use `croner`) and support:
  - 5-field cron (`min hour dom mon dow`) at minimum
  - optional `tz`
- Store `nextRunAtMs` computed by the parser; re-compute after each run.
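
For the dependency-free `at` and `every` kinds, the next-run computation might look like the sketch below. The field names `atMs`, `everyMs`, and `anchorMs` are assumptions made for illustration, as is the choice to treat a one-shot `at` schedule whose time has already passed as having no next run.

```typescript
// Hypothetical schedule shapes; only the `kind` discriminator appears in this RFC.
type Schedule =
  | { kind: "at"; atMs: number }
  | { kind: "every"; everyMs: number; anchorMs?: number };

/**
 * Returns the next run time strictly after `nowMs`,
 * or undefined when the schedule has no further runs.
 */
function computeNextRunAtMs(schedule: Schedule, nowMs: number): number | undefined {
  if (schedule.kind === "at") {
    // Assumption: a one-shot in the past is considered done.
    return schedule.atMs > nowMs ? schedule.atMs : undefined;
  }
  const everyMs = Math.max(1, Math.floor(schedule.everyMs));
  const anchor = schedule.anchorMs ?? nowMs;
  if (nowMs < anchor) return anchor;
  // Snap to the anchor grid so that late wakeups do not accumulate drift.
  const steps = Math.floor((nowMs - anchor) / everyMs) + 1;
  return anchor + steps * everyMs;
}
```

Anchoring `every` to a fixed origin, rather than adding `everyMs` to the actual finish time, keeps runs on a stable grid even when a job starts late.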
## Execution semantics
### Main session jobs
Main session jobs do not run the agent directly by default.
When due:
1) `enqueueSystemEvent(job.payload.text)` (or a derived message)
2) If `wakeMode:"now"`, trigger an immediate heartbeat run (see “Heartbeat wake hook”).
3) Otherwise do nothing else (the next scheduled heartbeat will pick up the system event).
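
As a rough sketch of the three steps above: `runMainJob`, the `requestHeartbeatNow` callback, and the `"next-heartbeat"` wake mode value are hypothetical names chosen for illustration; only `enqueueSystemEvent`, `wakeMode: "now"`, and `payload.text` come from this RFC. Dependencies are injected so the function stays testable.

```typescript
// "next-heartbeat" is an assumed name for the non-immediate mode.
type WakeMode = "now" | "next-heartbeat";

interface MainJob {
  id: string;
  wakeMode: WakeMode;
  payload: { text: string };
}

interface MainJobDeps {
  enqueueSystemEvent: (text: string) => void;
  requestHeartbeatNow: () => void; // hypothetical wake hook entry point
}

/** Enqueue the job's text for the main session; wake immediately only if asked. */
function runMainJob(job: MainJob, deps: MainJobDeps): void {
  deps.enqueueSystemEvent(job.payload.text);
  if (job.wakeMode === "now") {
    deps.requestHeartbeatNow();
  }
  // Otherwise the next scheduled heartbeat drains the event.
}
```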
Why: This keeps the main session’s “proactive” behavior centralized in the heartbeat rules and avoids ad-hoc agent turns that might fight with inbound message processing.
### Isolated session jobs
Isolated jobs run an agent turn in a dedicated session key, intended to be separate from main.
When due:
- Build a message body that includes schedule metadata, e.g.:
  - `"[cron:<jobId>] <job.name>: <payload.message>"`
- Execute via the same agent runner path as other command-mode runs, but pinned to:
  - `sessionKey = cron:<jobId>`
  - `sessionId = store[sessionKey].sessionId` (create if missing)
- Optionally deliver output (`payload.deliver === true`) to the configured channel/to.
  - If `deliver` is omitted/false, nothing is sent to external providers; you still get the main-session summary and can inspect the full isolated transcript in `cron:<jobId>`.
- If the handler is absent (provider not connected), the request is stored as “pending”; the next time the handler is installed, it runs once.
- Coalesce rapid calls and respect the existing “skip when queue busy” behavior (prefer retrying soon vs dropping).
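
The last two bullets (a pending request when no handler is installed, and retrying rather than dropping when the queue is busy) can be sketched with a single pending flag. All names here (`createWakeHook`, `request`, `setHandler`, `retry`, and the `"skipped-busy"` result) are hypothetical. Time-based coalescing of rapid calls is not shown; a real implementation would probably debounce `retry` with a short timer.

```typescript
type WakeResult = "ran" | "skipped-busy";
type WakeHandler = () => WakeResult;

function createWakeHook() {
  let handler: WakeHandler | undefined;
  let pending = false; // any number of requests collapse into one flag

  function flush(): void {
    if (!pending || handler === undefined) return;
    pending = false;
    // Keep the request pending when the queue is busy, so it is retried, not dropped.
    if (handler() === "skipped-busy") pending = true;
  }

  return {
    /** Ask for a heartbeat; stays pending if no handler is installed or it is busy. */
    request(): void {
      pending = true;
      flush();
    },
    /** Install (or remove) the handler; a pending request runs once on install. */
    setHandler(next: WakeHandler | undefined): void {
      handler = next;
      flush();
    },
    /** Retry a pending request, e.g. from a short timer. */
    retry: flush,
    isPending: (): boolean => pending,
  };
}
```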
## Run history log (JSONL)
In addition to normal structured logs, the Gateway writes an append-only run history “ledger” (JSONL) whenever a job finishes. This is intended for quick debugging (“did the job run, when, and what happened?”).
Each log line includes (at minimum) job id, status/error, timing, and a `summary` string (systemEvent text for main jobs, and the last agent text output for isolated jobs).
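
A ledger entry and its serialization could look like the sketch below. The exact field names (`ts`, `jobId`, `runAtMs`, `durationMs`) and the status values are assumptions; the RFC only requires job id, status/error, timing, and a `summary` string.

```typescript
// Hypothetical entry shape covering the minimum fields listed above.
interface RunLogEntry {
  ts: number;          // when the entry was written (ms since epoch)
  jobId: string;
  status: "ok" | "error";
  error?: string;
  runAtMs: number;     // when the run started
  durationMs: number;
  summary: string;     // systemEvent text (main) or last agent output (isolated)
}

/** One JSON object per line; JSON.stringify escapes newlines inside strings. */
function formatRunLogLine(entry: RunLogEntry): string {
  return JSON.stringify(entry) + "\n";
}

/** Parse ledger text back into entries, ignoring blank lines. */
function parseRunLog(text: string): RunLogEntry[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as RunLogEntry);
}
```

Appending each formatted line with something like `appendFile` from `node:fs/promises` keeps the file append-only, and the ledger stays easy to inspect with line-oriented tools.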