# RPC Mode
RPC mode enables headless operation of the coding agent via a JSON protocol over stdin/stdout. This is useful for embedding the agent in other applications, IDEs, or custom UIs.
**Note for Node.js/TypeScript users**: If you're building a Node.js application, consider using `AgentSession` directly from `@mariozechner/pi-coding-agent` instead of spawning a subprocess. See [`src/core/agent-session.ts`](../src/core/agent-session.ts) for the API. For a subprocess-based TypeScript client, see [`src/modes/rpc/rpc-client.ts`](../src/modes/rpc/rpc-client.ts).
## Starting RPC Mode
```bash
pi --mode rpc [options]
```
Common options:
- `--provider <name>`: Set the LLM provider (anthropic, openai, google, etc.)
- `--model <id>`: Set the model ID
- `--no-session`: Disable session persistence
- `--session-dir <path>`: Custom session storage directory
## Protocol Overview
- **Commands**: JSON objects sent to stdin, one per line
- **Responses**: JSON objects with `type: "response"` indicating command success/failure
- **Events**: Agent events streamed to stdout as JSON lines
All commands support an optional `id` field for request/response correlation. If provided, the corresponding response will include the same `id`.
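For example, a client can generate a fresh `id` per command and treat anything that comes back without a matching `id` as an event. A minimal sketch (the `request` helper and its names are illustrative, not part of the protocol):
```python
import itertools
import json

_ids = itertools.count(1)

def request(proc, cmd, on_event):
    """Send one command and block until its response arrives.

    Anything else read in the meantime is an event and is handed to
    on_event. Assumes proc is a pi RPC subprocess opened in text mode.
    """
    cmd["id"] = f"req-{next(_ids)}"
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()
    for line in proc.stdout:
        msg = json.loads(line)
        if msg.get("type") == "response" and msg.get("id") == cmd["id"]:
            return msg
        on_event(msg)  # events never carry an id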
## Commands
### Prompting
#### prompt
Send a user prompt to the agent. Returns immediately; events stream asynchronously.
```json
{"id": "req-1", "type": "prompt", "message": "Hello, world!"}
```
With images:
```json
{"type": "prompt", "message": "What's in this image?", "images": [{"type": "image", "source": {"type": "base64", "mediaType": "image/png", "data": "..."}}]}
```
**During streaming**: If the agent is already streaming, you must specify `streamingBehavior` to queue the message:
```json
{"type": "prompt", "message": "New instruction", "streamingBehavior": "steer"}
```
- `"steer"`: Interrupt the agent mid-run. Message is delivered after current tool execution, remaining tools are skipped.
- `"followUp"`: Wait until the agent finishes. Message is delivered only when agent stops.
If the agent is streaming and no `streamingBehavior` is specified, the command returns an error.
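One way to decide whether `streamingBehavior` is needed is to check `isStreaming` via [get_state](#get_state) first. A sketch, reusing the illustrative `request` helper from above (the check is inherently racy, so be prepared to handle the error response anyway):
```python
# Prompt normally when idle; steer when the agent is mid-run.
state = request(proc, {"type": "get_state"}, on_event)["data"]
cmd = {"type": "prompt", "message": "New instruction"}
if state["isStreaming"]:
    cmd["streamingBehavior"] = "steer"  # or "followUp" to wait for the run to end
response = request(proc, cmd, on_event)
```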
**Extension commands**: If the message is an extension command (e.g., `/mycommand`), it executes immediately, even during streaming. Extension commands manage their own LLM interaction via `pi.sendMessage()`.
**Prompt templates**: File-based prompt templates (from `.md` files) are expanded before sending/queueing.
Response:
```json
{"id": "req-1", "type": "response", "command": "prompt", "success": true}
```
The `images` field is optional. Each image uses `ImageContent` format with base64 or URL source.
#### steer
Queue a steering message to interrupt the agent mid-run. The message is delivered after the current tool execution finishes; remaining tool calls are skipped. File-based prompt templates are expanded. Extension commands are not allowed (use `prompt` instead).
```json
{"type": "steer", "message": "Stop and do this instead"}
```
Response:
```json
{"type": "response", "command": "steer", "success": true}
```
See [set_steering_mode](#set_steering_mode) for controlling how steering messages are processed.
#### follow_up
Queue a follow-up message to be processed after the agent finishes. The message is delivered only once the agent has no more tool calls or steering messages to process. File-based prompt templates are expanded. Extension commands are not allowed (use `prompt` instead).
```json
{"type": "follow_up", "message": "After you're done, also do this"}
```
Response:
```json
{"type": "response", "command": "follow_up", "success": true}
```
See [set_follow_up_mode](#set_follow_up_mode) for controlling how follow-up messages are processed.
#### abort
Abort the current agent operation.
```json
{"type": "abort"}
```
Response:
```json
{"type": "response", "command": "abort", "success": true}
```
#### new_session
Start a fresh session. Can be cancelled by a `session_before_switch` hook.
```json
{"type": "new_session"}
```
With optional parent session tracking:
```json
{"type": "new_session", "parentSession": "/path/to/parent-session.jsonl"}
```
Response:
```json
{"type": "response", "command": "new_session", "success": true, "data": {"cancelled": false}}
```
If a hook cancelled the operation:
```json
{"type": "response", "command": "new_session", "success": true, "data": {"cancelled": true}}
```
### State
#### get_state
Get current session state.
```json
{"type": "get_state"}
```
Response:
```json
{
  "type": "response",
  "command": "get_state",
  "success": true,
  "data": {
    "model": {...},
    "thinkingLevel": "medium",
    "isStreaming": false,
    "isCompacting": false,
    "steeringMode": "all",
    "followUpMode": "one-at-a-time",
    "sessionFile": "/path/to/session.jsonl",
    "sessionId": "abc123",
    "autoCompactionEnabled": true,
    "messageCount": 5,
    "pendingMessageCount": 0
  }
}
```
The `model` field is a full [Model](#model) object or `null`.
#### get_messages
Get all messages in the conversation.
```json
{"type": "get_messages"}
```
Response:
```json
{
  "type": "response",
  "command": "get_messages",
  "success": true,
  "data": {"messages": [...]}
}
```
Messages are `AgentMessage` objects (see [Message Types](#message-types)).
### Model
#### set_model
Switch to a specific model.
```json
{"type": "set_model", "provider": "anthropic", "modelId": "claude-sonnet-4-20250514"}
```
Response contains the full [Model](#model) object:
```json
{
  "type": "response",
  "command": "set_model",
  "success": true,
  "data": {...}
}
```
#### cycle_model
Cycle to the next available model. Returns `null` data if only one model is available.
```json
{"type": "cycle_model"}
```
Response:
```json
{
  "type": "response",
  "command": "cycle_model",
  "success": true,
  "data": {
    "model": {...},
    "thinkingLevel": "medium",
    "isScoped": false
  }
}
```
The `model` field is a full [Model](#model) object.
#### get_available_models
List all configured models.
```json
{"type": "get_available_models"}
```
Response contains an array of full [Model](#model) objects:
```json
{
  "type": "response",
  "command": "get_available_models",
  "success": true,
  "data": {
    "models": [...]
  }
}
```
### Thinking
#### set_thinking_level
Set the reasoning/thinking level for models that support it.
```json
{"type": "set_thinking_level", "level": "high"}
```
Levels: `"off"`, `"minimal"`, `"low"`, `"medium"`, `"high"`, `"xhigh"`
Note: `"xhigh"` is only supported by OpenAI codex-max models.
Response:
```json
{"type": "response", "command": "set_thinking_level", "success": true}
```
#### cycle_thinking_level
Cycle through available thinking levels. Returns `null` data if the model doesn't support thinking.
```json
{"type": "cycle_thinking_level"}
```
Response:
```json
{
  "type": "response",
  "command": "cycle_thinking_level",
  "success": true,
  "data": {"level": "high"}
}
```
### Queue Modes
#### set_steering_mode
Control how steering messages (from `steer`) are delivered.
```json
{"type": "set_steering_mode", "mode": "one-at-a-time"}
```
Modes:
- `"all"`: Deliver all steering messages at the next interruption point
- `"one-at-a-time"`: Deliver one steering message per interruption (default)
Response:
```json
{"type": "response", "command": "set_steering_mode", "success": true}
```
#### set_follow_up_mode
Control how follow-up messages (from `follow_up`) are delivered.
```json
{"type": "set_follow_up_mode", "mode": "one-at-a-time"}
```
Modes:
- `"all"`: Deliver all follow-up messages when agent finishes
- `"one-at-a-time"`: Deliver one follow-up message per agent completion (default)
Response:
```json
{"type": "response", "command": "set_follow_up_mode", "success": true}
```
### Compaction
#### compact
Manually compact conversation context to reduce token usage.
```json
{"type": "compact"}
```
With custom instructions:
```json
{"type": "compact", "customInstructions": "Focus on code changes"}
```
Response:
```json
{
  "type": "response",
  "command": "compact",
  "success": true,
  "data": {
    "summary": "Summary of conversation...",
    "firstKeptEntryId": "abc123",
    "tokensBefore": 150000,
    "details": {}
  }
}
```
#### set_auto_compaction
Enable or disable automatic compaction when context is nearly full.
```json
{"type": "set_auto_compaction", "enabled": true}
```
Response:
```json
{"type": "response", "command": "set_auto_compaction", "success": true}
```
### Retry
#### set_auto_retry
Enable or disable automatic retry on transient errors (overloaded, rate limit, 5xx).
```json
{"type": "set_auto_retry", "enabled": true}
```
Response:
```json
{"type": "response", "command": "set_auto_retry", "success": true}
```
#### abort_retry
Abort an in-progress retry (cancel the delay and stop retrying).
```json
{"type": "abort_retry"}
```
Response:
```json
{"type": "response", "command": "abort_retry", "success": true}
```
### Bash
#### bash
Execute a shell command and add output to conversation context.
```json
{"type": "bash", "command": "ls -la"}
```
Response:
```json
{
  "type": "response",
  "command": "bash",
  "success": true,
  "data": {
    "output": "total 48\ndrwxr-xr-x ...",
    "exitCode": 0,
    "cancelled": false,
    "truncated": false
  }
}
```
If the output was truncated, the response includes `fullOutputPath`:
```json
{
  "type": "response",
  "command": "bash",
  "success": true,
  "data": {
    "output": "truncated output...",
    "exitCode": 0,
    "cancelled": false,
    "truncated": true,
    "fullOutputPath": "/tmp/pi-bash-abc123.log"
  }
}
```
**How bash results reach the LLM:**
The `bash` command executes immediately and returns a `BashResult`. Internally, a `BashExecutionMessage` is created and stored in the agent's message state. No event is emitted for this message.
When the next `prompt` command is sent, all messages (including `BashExecutionMessage`) are transformed before being sent to the LLM. The `BashExecutionMessage` is converted to a `UserMessage` with this format:
```
Ran `ls -la`
\`\`\`
total 48
drwxr-xr-x ...
\`\`\`
```
This means:
1. Bash output is included in the LLM context on the **next prompt**, not immediately
2. Multiple bash commands can be executed before a prompt; all outputs will be included
3. No event is emitted for the `BashExecutionMessage` itself
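For example, with a `send` helper like the one in the [Basic Client example](#example-basic-client-python), both outputs below land in the context of the final prompt (a careful client would still wait for each response before sending the next command):
```python
# Both command outputs are included in the LLM context of the next prompt.
send({"type": "bash", "command": "git status --short"})
send({"type": "bash", "command": "git log --oneline -3"})
send({"type": "prompt", "message": "Summarize the current repo state."})
```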
#### abort_bash
Abort a running bash command.
```json
{"type": "abort_bash"}
```
Response:
```json
{"type": "response", "command": "abort_bash", "success": true}
```
### Session
#### get_session_stats
Get token usage and cost statistics.
```json
{"type": "get_session_stats"}
```
Response:
```json
{
  "type": "response",
  "command": "get_session_stats",
  "success": true,
  "data": {
    "sessionFile": "/path/to/session.jsonl",
    "sessionId": "abc123",
    "userMessages": 5,
    "assistantMessages": 5,
    "toolCalls": 12,
    "toolResults": 12,
    "totalMessages": 22,
    "tokens": {
      "input": 50000,
      "output": 10000,
      "cacheRead": 40000,
      "cacheWrite": 5000,
      "total": 105000
    },
    "cost": 0.45
  }
}
```
#### export_html
Export session to an HTML file.
```json
{"type": "export_html"}
```
With custom path:
```json
{"type": "export_html", "outputPath": "/tmp/session.html"}
```
Response:
```json
{
  "type": "response",
  "command": "export_html",
  "success": true,
  "data": {"path": "/tmp/session.html"}
}
```
#### switch_session
Load a different session file. Can be cancelled by a `session_before_switch` hook.
```json
{"type": "switch_session", "sessionPath": "/path/to/session.jsonl"}
```
Response:
```json
{"type": "response", "command": "switch_session", "success": true, "data": {"cancelled": false}}
```
If a hook cancelled the switch:
```json
{"type": "response", "command": "switch_session", "success": true, "data": {"cancelled": true}}
```
#### fork
Create a new fork from a previous user message. Can be cancelled by a `session_before_fork` hook. Returns the text of the message being forked from.
```json
{"type": "fork", "entryId": "abc123"}
```
Response:
```json
{
  "type": "response",
  "command": "fork",
  "success": true,
  "data": {"text": "The original prompt text...", "cancelled": false}
}
```
If a hook cancelled the fork:
```json
{
  "type": "response",
  "command": "fork",
  "success": true,
  "data": {"text": "The original prompt text...", "cancelled": true}
}
```
#### get_fork_messages
Get user messages available for forking.
```json
{"type": "get_fork_messages"}
```
Response:
```json
{
  "type": "response",
  "command": "get_fork_messages",
  "success": true,
  "data": {
    "messages": [
      {"entryId": "abc123", "text": "First prompt..."},
      {"entryId": "def456", "text": "Second prompt..."}
    ]
  }
}
```
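A typical rewind flow chains the two commands: list candidates, then fork from a chosen `entryId`. A sketch, again using the illustrative `request` helper from the [Protocol Overview](#protocol-overview):
```python
# Fork from the first user message and reuse its text as a starting point.
candidates = request(proc, {"type": "get_fork_messages"}, on_event)["data"]["messages"]
if candidates:
    result = request(proc, {"type": "fork", "entryId": candidates[0]["entryId"]}, on_event)
    if not result["data"]["cancelled"]:
        original_text = result["data"]["text"]  # edit and re-prompt from here
```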
#### get_last_assistant_text
Get the text content of the last assistant message.
```json
{"type": "get_last_assistant_text"}
```
Response:
```json
{
  "type": "response",
  "command": "get_last_assistant_text",
  "success": true,
  "data": {"text": "The assistant's response..."}
}
```
Returns `{"text": null}` if no assistant messages exist.
## Events
Events are streamed to stdout as JSON lines during agent operation. Events do NOT include an `id` field (only responses do).
### Event Types
| Event | Description |
|-------|-------------|
| `agent_start` | Agent begins processing |
| `agent_end` | Agent completes (includes all generated messages) |
| `turn_start` | New turn begins |
| `turn_end` | Turn completes (includes assistant message and tool results) |
| `message_start` | Message begins |
| `message_update` | Streaming update (text/thinking/toolcall deltas) |
| `message_end` | Message completes |
| `tool_execution_start` | Tool begins execution |
| `tool_execution_update` | Tool execution progress (streaming output) |
| `tool_execution_end` | Tool completes |
| `auto_compaction_start` | Auto-compaction begins |
| `auto_compaction_end` | Auto-compaction completes |
| `auto_retry_start` | Auto-retry begins (after transient error) |
| `auto_retry_end` | Auto-retry completes (success or final failure) |
| `hook_error` | Hook threw an error |
### agent_start
Emitted when the agent begins processing a prompt.
```json
{"type": "agent_start"}
```
### agent_end
Emitted when the agent completes. Contains all messages generated during this run.
```json
{
  "type": "agent_end",
  "messages": [...]
}
```
### turn_start / turn_end
A turn consists of one assistant response plus any resulting tool calls and results.
```json
{"type": "turn_start"}
```
```json
{
  "type": "turn_end",
  "message": {...},
  "toolResults": [...]
}
```
### message_start / message_end
Emitted when a message begins and completes. The `message` field contains an `AgentMessage`.
```json
{"type": "message_start", "message": {...}}
{"type": "message_end", "message": {...}}
```
### message_update (Streaming)
Emitted during streaming of assistant messages. Contains both the partial message and a streaming delta event.
```json
{
  "type": "message_update",
  "message": {...},
  "assistantMessageEvent": {
    "type": "text_delta",
    "contentIndex": 0,
    "delta": "Hello ",
    "partial": {...}
  }
}
```
The `assistantMessageEvent` field contains one of these delta types:
| Type | Description |
|------|-------------|
| `start` | Message generation started |
| `text_start` | Text content block started |
| `text_delta` | Text content chunk |
| `text_end` | Text content block ended |
| `thinking_start` | Thinking block started |
| `thinking_delta` | Thinking content chunk |
| `thinking_end` | Thinking block ended |
| `toolcall_start` | Tool call started |
| `toolcall_delta` | Tool call arguments chunk |
| `toolcall_end` | Tool call ended (includes full `toolCall` object) |
| `done` | Message complete (reason: `"stop"`, `"length"`, `"toolUse"`) |
| `error` | Error occurred (reason: `"aborted"`, `"error"`) |
Example streaming a text response:
```json
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_start","contentIndex":0,"partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":"Hello","partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":" world","partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_end","contentIndex":0,"content":"Hello world","partial":{...}}}
```
### tool_execution_start / tool_execution_update / tool_execution_end
Emitted when a tool begins, streams progress, and completes execution.
```json
{
  "type": "tool_execution_start",
  "toolCallId": "call_abc123",
  "toolName": "bash",
  "args": {"command": "ls -la"}
}
```
During execution, `tool_execution_update` events stream partial results (e.g., bash output as it arrives):
```json
{
  "type": "tool_execution_update",
  "toolCallId": "call_abc123",
  "toolName": "bash",
  "args": {"command": "ls -la"},
  "partialResult": {
    "content": [{"type": "text", "text": "partial output so far..."}],
    "details": {"truncation": null, "fullOutputPath": null}
  }
}
```
When complete:
```json
{
  "type": "tool_execution_end",
  "toolCallId": "call_abc123",
  "toolName": "bash",
  "result": {
    "content": [{"type": "text", "text": "total 48\n..."}],
    "details": {...}
  },
  "isError": false
}
```
Use `toolCallId` to correlate events. The `partialResult` in `tool_execution_update` contains the accumulated output so far (not just the delta), allowing clients to simply replace their display on each update.
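In practice, a client keeps one buffer per `toolCallId` and overwrites it on every update instead of appending. A minimal rendering sketch:
```python
# partialResult is cumulative: replace, don't append.
tool_output = {}

def handle_tool_event(event):
    if event["type"] == "tool_execution_update":
        parts = event["partialResult"]["content"]
        tool_output[event["toolCallId"]] = "".join(
            p["text"] for p in parts if p["type"] == "text"
        )
    elif event["type"] == "tool_execution_end":
        tool_output.pop(event["toolCallId"], None)  # final result is in event["result"]
```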
### auto_compaction_start / auto_compaction_end
Emitted when automatic compaction runs (when context is nearly full).
```json
{"type": "auto_compaction_start", "reason": "threshold"}
```
The `reason` field is `"threshold"` (context getting large) or `"overflow"` (context exceeded limit).
```json
{
  "type": "auto_compaction_end",
  "result": {
    "summary": "Summary of conversation...",
    "firstKeptEntryId": "abc123",
    "tokensBefore": 150000,
    "details": {}
  },
  "aborted": false,
  "willRetry": false
}
```
If `reason` was `"overflow"` and compaction succeeds, `willRetry` is `true` and the agent will automatically retry the prompt.
If compaction was aborted, `result` is `null` and `aborted` is `true`.
### auto_retry_start / auto_retry_end
Emitted when automatic retry is triggered after a transient error (overloaded, rate limit, 5xx).
```json
{
  "type": "auto_retry_start",
  "attempt": 1,
  "maxAttempts": 3,
  "delayMs": 2000,
  "errorMessage": "529 {\"type\":\"error\",\"error\":{\"type\":\"overloaded_error\",\"message\":\"Overloaded\"}}"
}
```
```json
{
  "type": "auto_retry_end",
  "success": true,
  "attempt": 2
}
```
On final failure (max retries exceeded):
```json
{
  "type": "auto_retry_end",
  "success": false,
  "attempt": 3,
  "finalError": "529 overloaded_error: Overloaded"
}
```
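A client that shows progress can surface these events directly; a small sketch (assumes events are already parsed into dicts):
```python
import sys

def handle_retry_event(event):
    # Show the wait before each retry attempt, and the final error if retries run out.
    if event["type"] == "auto_retry_start":
        print(f"Transient error, retry {event['attempt']}/{event['maxAttempts']} "
              f"in {event['delayMs'] / 1000:.1f}s...", file=sys.stderr)
    elif event["type"] == "auto_retry_end" and not event["success"]:
        print(f"Giving up after {event['attempt']} attempts: {event['finalError']}",
              file=sys.stderr)
```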
### hook_error
Emitted when a hook throws an error.
```json
{
  "type": "hook_error",
  "hookPath": "/path/to/hook.ts",
  "event": "tool_call",
  "error": "Error message..."
}
```
## Error Handling
Failed commands return a response with `success: false`:
```json
{
  "type": "response",
  "command": "set_model",
  "success": false,
  "error": "Model not found: invalid/model"
}
```
Parse errors:
```json
{
  "type": "response",
  "command": "parse",
  "success": false,
  "error": "Failed to parse command: Unexpected token..."
}
```
## Types
Source files:
- [`packages/ai/src/types.ts`](../../ai/src/types.ts) - `Model`, `UserMessage`, `AssistantMessage`, `ToolResultMessage`
- [`packages/agent/src/types.ts`](../../agent/src/types.ts) - `AgentMessage`, `AgentEvent`
- [`src/core/messages.ts`](../src/core/messages.ts) - `BashExecutionMessage`
- [`src/modes/rpc/rpc-types.ts`](../src/modes/rpc/rpc-types.ts) - RPC command/response types
### Model
```json
{
  "id": "claude-sonnet-4-20250514",
  "name": "Claude Sonnet 4",
  "api": "anthropic-messages",
  "provider": "anthropic",
  "baseUrl": "https://api.anthropic.com",
  "reasoning": true,
  "input": ["text", "image"],
  "contextWindow": 200000,
  "maxTokens": 16384,
  "cost": {
    "input": 3.0,
    "output": 15.0,
    "cacheRead": 0.3,
    "cacheWrite": 3.75
  }
}
```
### UserMessage
```json
{
  "role": "user",
  "content": "Hello!",
  "timestamp": 1733234567890,
  "attachments": []
}
```
The `content` field can be a string or an array of `TextContent`/`ImageContent` blocks.
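For example, a prompt sent with an image (see [prompt](#prompt)) yields the array form; the block shapes follow `TextContent` and `ImageContent`:
```json
{
  "role": "user",
  "content": [
    {"type": "text", "text": "What's in this image?"},
    {"type": "image", "source": {"type": "base64", "mediaType": "image/png", "data": "..."}}
  ],
  "timestamp": 1733234567890,
  "attachments": []
}
```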
### AssistantMessage
```json
{
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help?"},
    {"type": "thinking", "thinking": "User is greeting me..."},
    {"type": "toolCall", "id": "call_123", "name": "bash", "arguments": {"command": "ls"}}
  ],
  "api": "anthropic-messages",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input": 100,
    "output": 50,
    "cacheRead": 0,
    "cacheWrite": 0,
    "cost": {"input": 0.0003, "output": 0.00075, "cacheRead": 0, "cacheWrite": 0, "total": 0.00105}
  },
  "stopReason": "stop",
  "timestamp": 1733234567890
}
```
Stop reasons: `"stop"`, `"length"`, `"toolUse"`, `"error"`, `"aborted"`
### ToolResultMessage
```json
{
  "role": "toolResult",
  "toolCallId": "call_123",
  "toolName": "bash",
  "content": [{"type": "text", "text": "total 48\ndrwxr-xr-x ..."}],
  "isError": false,
  "timestamp": 1733234567890
}
```
### BashExecutionMessage
Created by the `bash` RPC command (not by LLM tool calls):
```json
{
  "role": "bashExecution",
  "command": "ls -la",
  "output": "total 48\ndrwxr-xr-x ...",
  "exitCode": 0,
  "cancelled": false,
  "truncated": false,
  "fullOutputPath": null,
  "timestamp": 1733234567890
}
```
### Attachment
```json
{
  "id": "img1",
  "type": "image",
  "fileName": "photo.jpg",
  "mimeType": "image/jpeg",
  "size": 102400,
  "content": "base64-encoded-data...",
  "extractedText": null,
  "preview": null
}
```
## Example: Basic Client (Python)
```python
import subprocess
import json

proc = subprocess.Popen(
    ["pi", "--mode", "rpc", "--no-session"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def send(cmd):
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()

def read_events():
    for line in proc.stdout:
        yield json.loads(line)

# Send prompt
send({"type": "prompt", "message": "Hello!"})

# Process events
for event in read_events():
    if event.get("type") == "message_update":
        delta = event.get("assistantMessageEvent", {})
        if delta.get("type") == "text_delta":
            print(delta["delta"], end="", flush=True)
    if event.get("type") == "agent_end":
        print()
        break
```
## Example: Interactive Client (Node.js)
See [`test/rpc-example.ts`](../test/rpc-example.ts) for a complete interactive example, or [`src/modes/rpc/rpc-client.ts`](../src/modes/rpc/rpc-client.ts) for a typed client implementation.
```javascript
const { spawn } = require("child_process");
const readline = require("readline");

const agent = spawn("pi", ["--mode", "rpc", "--no-session"]);

readline.createInterface({ input: agent.stdout }).on("line", (line) => {
  const event = JSON.parse(line);
  if (event.type === "message_update") {
    const { assistantMessageEvent } = event;
    if (assistantMessageEvent.type === "text_delta") {
      process.stdout.write(assistantMessageEvent.delta);
    }
  }
});

// Send prompt
agent.stdin.write(JSON.stringify({ type: "prompt", message: "Hello" }) + "\n");

// Abort on Ctrl+C
process.on("SIGINT", () => {
  agent.stdin.write(JSON.stringify({ type: "abort" }) + "\n");
});
```