# @openclaw/voice-call

Official Voice Call plugin for **OpenClaw**.

Providers:

- **Twilio** (Programmable Voice + Media Streams)
- **Telnyx** (Call Control v2)
- **Plivo** (Voice API + XML transfer + GetInput speech)
- **Mock** (dev/no network)

Docs: `https://docs.openclaw.ai/plugins/voice-call`
Plugin system: `https://docs.openclaw.ai/plugin`

## Install (local dev)

### Option A: install via OpenClaw (recommended)

```bash
openclaw plugins install @openclaw/voice-call
```

Restart the Gateway afterwards.

### Option B: copy into your global extensions folder (dev)

```bash
mkdir -p ~/.openclaw/extensions
cp -R extensions/voice-call ~/.openclaw/extensions/voice-call
cd ~/.openclaw/extensions/voice-call && pnpm install
```

## Config

Put the following under `plugins.entries.voice-call.config`:

```json5
{
  provider: "twilio", // or "telnyx" | "plivo" | "mock"
  fromNumber: "+15550001234",
  toNumber: "+15550005678",

  twilio: {
    accountSid: "ACxxxxxxxx",
    authToken: "your_token",
  },

  telnyx: {
    apiKey: "KEYxxxx",
    connectionId: "CONNxxxx",
    // Telnyx webhook public key from the Telnyx Mission Control Portal
    // (Base64 string; can also be set via TELNYX_PUBLIC_KEY).
    publicKey: "...",
  },

  plivo: {
    authId: "MAxxxxxxxxxxxxxxxxxxxx",
    authToken: "your_token",
  },

  // Webhook server
  serve: {
    port: 3334,
    path: "/voice/webhook",
  },

  // Public exposure (pick one):
  // publicUrl: "https://example.ngrok.app/voice/webhook",
  // tunnel: { provider: "ngrok" },
  // tailscale: { mode: "funnel", path: "/voice/webhook" }

  outbound: {
    defaultMode: "notify", // or "conversation"
  },

  streaming: {
    enabled: true,
    streamPath: "/voice/stream",
    preStartTimeoutMs: 5000,
    maxPendingConnections: 32,
    maxPendingConnectionsPerIp: 4,
    maxConnections: 128,
  },
}
```

Notes:

- Twilio/Telnyx/Plivo require a **publicly reachable** webhook URL.
- `mock` is a local dev provider (no network calls).
- Telnyx requires `telnyx.publicKey` (or `TELNYX_PUBLIC_KEY`) unless `skipSignatureVerification` is true.
- `tunnel.allowNgrokFreeTierLoopbackBypass: true` allows Twilio webhooks with invalid signatures **only** when `tunnel.provider="ngrok"` and `serve.bind` is loopback (ngrok local agent). Use for local dev only.
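
The Telnyx key rule in the notes above can be read as a fail-closed check. The following is a minimal sketch of that reading, not the plugin's code: the function name, the argument shapes, and the choice to prefer the config value over the environment variable are all assumptions.

```typescript
// Hypothetical helper: names and precedence are illustrative assumptions.
interface TelnyxKeyConfig {
  publicKey?: string;
}

function resolveTelnyxPublicKey(
  telnyx: TelnyxKeyConfig,
  skipSignatureVerification: boolean,
  env: Record<string, string | undefined>,
): string | null {
  // Assumption: a key in the plugin config wins over TELNYX_PUBLIC_KEY.
  const key = telnyx.publicKey?.trim() || env["TELNYX_PUBLIC_KEY"]?.trim();
  if (key) return key;

  // No key is only acceptable when verification is explicitly skipped.
  if (skipSignatureVerification) return null;

  // Fail closed: refuse to run without a way to verify webhooks.
  throw new Error(
    "Telnyx requires telnyx.publicKey or TELNYX_PUBLIC_KEY unless skipSignatureVerification is true",
  );
}
```

In a Node process the third argument would typically be `process.env`.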

Streaming security defaults:

- `streaming.preStartTimeoutMs` closes sockets that never send a valid `start` frame.
- `streaming.maxPendingConnections` caps total unauthenticated pre-start sockets.
- `streaming.maxPendingConnectionsPerIp` caps unauthenticated pre-start sockets per source IP.
- `streaming.maxConnections` caps total open media stream sockets (pending + active).
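
To make the interaction between the three caps concrete, here is a rough admission-control sketch. It only models the counting described above; the class and method names are hypothetical, and the real plugin may track sockets differently.

```typescript
// Hypothetical sketch: none of these names come from the plugin source.
interface StreamLimits {
  maxPendingConnections: number;
  maxPendingConnectionsPerIp: number;
  maxConnections: number;
}

class StreamAdmission {
  private readonly limits: StreamLimits;
  private readonly pendingByIp = new Map<string, number>();
  private pendingTotal = 0;
  private activeTotal = 0;

  constructor(limits: StreamLimits) {
    this.limits = limits;
  }

  /** Try to admit a new pre-start socket; false means a cap was hit. */
  admit(ip: string): boolean {
    const perIp = this.pendingByIp.get(ip) ?? 0;
    if (this.pendingTotal + this.activeTotal >= this.limits.maxConnections) return false;
    if (this.pendingTotal >= this.limits.maxPendingConnections) return false;
    if (perIp >= this.limits.maxPendingConnectionsPerIp) return false;
    this.pendingByIp.set(ip, perIp + 1);
    this.pendingTotal += 1;
    return true;
  }

  /** A valid `start` frame arrived: the socket moves from pending to active. */
  promote(ip: string): void {
    if (this.releasePending(ip)) this.activeTotal += 1;
  }

  /** A pending socket closed or hit the pre-start timeout. */
  dropPending(ip: string): void {
    this.releasePending(ip);
  }

  private releasePending(ip: string): boolean {
    const perIp = this.pendingByIp.get(ip) ?? 0;
    if (perIp <= 0) return false;
    if (perIp === 1) this.pendingByIp.delete(ip);
    else this.pendingByIp.set(ip, perIp - 1);
    this.pendingTotal -= 1;
    return true;
  }
}
```

In this model `preStartTimeoutMs` would be a timer that calls `dropPending` for any socket that has not been promoted in time.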

## Stale call reaper

Use `staleCallReaperSeconds` to end calls that never receive a terminal webhook
(for example, notify-mode calls that never complete). The default is `0`
(disabled).

Recommended ranges:

- **Production:** `120`–`300` seconds for notify-style flows.
- Keep this value **higher than `maxDurationSeconds`** so normal calls can
  finish. A good starting point is `maxDurationSeconds + 30–60` seconds.

Example:

```json5
{
  staleCallReaperSeconds: 360,
}
```
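
The "higher than `maxDurationSeconds`" guidance is simple arithmetic. As an illustration only, a hypothetical helper (not part of the plugin) that derives a reaper value from a call duration limit plus a margin could look like this:

```typescript
// Hypothetical helper: the name and default margin are illustrative.
function suggestStaleCallReaperSeconds(
  maxDurationSeconds: number,
  marginSeconds: number = 60,
): number {
  if (!Number.isFinite(maxDurationSeconds) || maxDurationSeconds <= 0) {
    throw new RangeError("maxDurationSeconds must be a positive number");
  }
  if (!Number.isFinite(marginSeconds) || marginSeconds <= 0) {
    throw new RangeError("marginSeconds must be a positive number");
  }
  // Reap only after the longest normal call would already have ended.
  return Math.ceil(maxDurationSeconds + marginSeconds);
}
```

For example, `suggestStaleCallReaperSeconds(300, 60)` returns `360`.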

## TTS for calls

Voice Call uses the core `messages.tts` configuration (OpenAI or ElevenLabs) for
streaming speech on calls. You can override it under the plugin config using the
same shape; overrides are deep-merged with `messages.tts`.

```json5
{
  tts: {
    provider: "openai",
    openai: {
      voice: "alloy",
    },
  },
}
```
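
For readers unfamiliar with deep-merge semantics, the sketch below shows one common interpretation: nested objects are merged key by key and every other value in the override replaces the base value. It is an assumption about the behavior, not the plugin's merge routine, and details such as array handling may differ.

```typescript
// Illustrative deep merge; the plugin's actual merge rules may differ.
type Plain = { [key: string]: unknown };

function isPlainObject(value: unknown): value is Plain {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function deepMerge(base: Plain, override: Plain): Plain {
  const out: Plain = { ...base };
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key];
    // Recurse only when both sides are plain objects; otherwise override wins.
    out[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? deepMerge(existing, value)
        : value;
  }
  return out;
}
```

Under this interpretation, a plugin override that sets only `openai.voice` keeps any other `openai` keys defined in the core `messages.tts` block.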

Notes:

- Edge TTS is ignored for voice calls (telephony audio needs PCM; Edge output is unreliable).
- Core TTS is used when Twilio media streaming is enabled; otherwise calls fall back to provider native voices.

## CLI

```bash
openclaw voicecall call --to "+15555550123" --message "Hello from OpenClaw"
openclaw voicecall continue --call-id <id> --message "Any questions?"
openclaw voicecall speak --call-id <id> --message "One moment"
openclaw voicecall end --call-id <id>
openclaw voicecall status --call-id <id>
openclaw voicecall tail
openclaw voicecall expose --mode funnel
```

## Tool

Tool name: `voice_call`

Actions:

- `initiate_call` (message, to?, mode?)
- `continue_call` (callId, message)
- `speak_to_user` (callId, message)
- `end_call` (callId)
- `get_status` (callId)

## Gateway RPC

- `voicecall.initiate` (to?, message, mode?)
- `voicecall.continue` (callId, message)
- `voicecall.speak` (callId, message)
- `voicecall.end` (callId)
- `voicecall.status` (callId)
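
The parameter lists above give only names and optionality. As a hedged illustration, a client could encode the required parameters in a lookup table and check a request before sending it. The helper below is hypothetical and client-side; it does not describe how the Gateway validates requests.

```typescript
// Hypothetical client-side check built from the method list above.
type VoiceCallMethod =
  | "voicecall.initiate"
  | "voicecall.continue"
  | "voicecall.speak"
  | "voicecall.end"
  | "voicecall.status";

// Optional parameters (`to?`, `mode?`) are intentionally not listed.
const REQUIRED_PARAMS: Record<VoiceCallMethod, readonly string[]> = {
  "voicecall.initiate": ["message"],
  "voicecall.continue": ["callId", "message"],
  "voicecall.speak": ["callId", "message"],
  "voicecall.end": ["callId"],
  "voicecall.status": ["callId"],
};

/** Returns the names of required parameters that are absent or empty. */
function missingParams(
  method: VoiceCallMethod,
  params: Record<string, unknown>,
): string[] {
  return REQUIRED_PARAMS[method].filter(
    (name) => params[name] === undefined || params[name] === "",
  );
}
```

A caller would send the request only when `missingParams(...)` returns an empty array.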

## Notes

- Uses webhook signature verification for Twilio/Telnyx/Plivo.
- Adds replay protection for Twilio and Plivo webhooks (valid duplicate callbacks are ignored safely).
- Twilio speech turns include a per-turn token so stale/replayed callbacks cannot complete a newer turn.
- `responseModel` / `responseSystemPrompt` control AI auto-responses.
- Media streaming requires the `ws` package and an OpenAI Realtime API key.
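
The replay and per-turn token notes describe two separate checks: ignore a callback that was already processed, and reject a callback whose token belongs to an older turn. The following sketch shows that idea in isolation. The class, the event-ID parameter, and the in-memory set are assumptions for illustration; the plugin's real bookkeeping is not shown in this README.

```typescript
import { randomUUID } from "node:crypto";

type CallbackVerdict = "accept" | "duplicate" | "stale";

// Hypothetical sketch: not the plugin's actual data structures.
class TurnGate {
  private currentToken: string | null = null;
  private readonly seenEvents = new Set<string>();

  /** Start a new speech turn and return the token to embed in its callback URL. */
  beginTurn(): string {
    this.currentToken = randomUUID();
    return this.currentToken;
  }

  /** Classify an incoming callback by event ID and turn token. */
  check(eventId: string, turnToken: string): CallbackVerdict {
    // Replay protection: a callback seen before is ignored.
    if (this.seenEvents.has(eventId)) return "duplicate";
    this.seenEvents.add(eventId);

    // Per-turn token: only the newest turn may be completed.
    if (turnToken !== this.currentToken) return "stale";

    this.currentToken = null; // a turn completes at most once
    return "accept";
  }
}
```

A production version would also need to expire old event IDs so the set does not grow without bound.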