# RPC Mode

RPC mode enables headless operation of the coding agent via a JSON protocol over stdin/stdout. This is useful for embedding the agent in other applications, IDEs, or custom UIs.

**Note for Node.js/TypeScript users**: If you're building a Node.js application, consider using `AgentSession` directly from `@mariozechner/pi-coding-agent` instead of spawning a subprocess. See [`src/core/agent-session.ts`](../src/core/agent-session.ts) for the API. For a subprocess-based TypeScript client, see [`src/modes/rpc/rpc-client.ts`](../src/modes/rpc/rpc-client.ts).

## Starting RPC Mode

```bash
pi --mode rpc [options]
```

Common options:

- `--provider <provider>`: Set the LLM provider (anthropic, openai, google, etc.)
- `--model <model>`: Set the model ID
- `--no-session`: Disable session persistence
- `--session-dir <dir>`: Custom session storage directory

## Protocol Overview

- **Commands**: JSON objects sent to stdin, one per line
- **Responses**: JSON objects with `type: "response"` indicating command success/failure
- **Events**: Agent events streamed to stdout as JSON lines

All commands support an optional `id` field for request/response correlation. If provided, the corresponding response will include the same `id`.

## Commands

### Prompting

#### prompt

Send a user prompt to the agent. Returns immediately; events stream asynchronously.

```json
{"id": "req-1", "type": "prompt", "message": "Hello, world!"}
```

With images:

```json
{"type": "prompt", "message": "What's in this image?", "images": [{"type": "image", "source": {"type": "base64", "mediaType": "image/png", "data": "..."}}]}
```

**During streaming**: If the agent is already streaming, you must specify `streamingBehavior` to queue the message:

```json
{"type": "prompt", "message": "New instruction", "streamingBehavior": "steer"}
```

- `"steer"`: Interrupt the agent mid-run. The message is delivered after the current tool execution; remaining tools are skipped.
- `"followUp"`: Wait until the agent finishes. The message is delivered only when the agent stops.

If the agent is streaming and no `streamingBehavior` is specified, the command returns an error.

**Extension commands**: If the message is an extension command (e.g., `/mycommand`), it executes immediately, even during streaming. Extension commands manage their own LLM interaction via `pi.sendMessage()`.

**Input expansion**: Skill commands (`/skill:name`) and prompt templates (`/template`) are expanded before sending/queueing.

Response:

```json
{"id": "req-1", "type": "response", "command": "prompt", "success": true}
```

The `images` field is optional. Each image uses the `ImageContent` format with a base64 or URL source.

#### steer

Queue a steering message to interrupt the agent mid-run. The message is delivered after the current tool execution; remaining tools are skipped. Skill commands and prompt templates are expanded. Extension commands are not allowed (use `prompt` instead).

```json
{"type": "steer", "message": "Stop and do this instead"}
```

Response:

```json
{"type": "response", "command": "steer", "success": true}
```

See [set_steering_mode](#set_steering_mode) for controlling how steering messages are processed.

#### follow_up

Queue a follow-up message to be processed after the agent finishes. The message is delivered only when the agent has no more tool calls or steering messages. Skill commands and prompt templates are expanded. Extension commands are not allowed (use `prompt` instead).

```json
{"type": "follow_up", "message": "After you're done, also do this"}
```

Response:

```json
{"type": "response", "command": "follow_up", "success": true}
```

See [set_follow_up_mode](#set_follow_up_mode) for controlling how follow-up messages are processed.
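The difference between the two queues is easiest to see in a small client. Here is a minimal sketch in Python, assuming `pi` is on your PATH; the prompt texts are placeholders:

```python
import json
import subprocess

# Spawn the agent in RPC mode.
proc = subprocess.Popen(
    ["pi", "--mode", "rpc", "--no-session"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def send(cmd):
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()

# Kick off a long-running task.
send({"type": "prompt", "message": "Refactor the utils module"})

# Steering interrupts the run: delivered after the current tool
# execution finishes; remaining tool calls are skipped.
send({"type": "steer", "message": "Skip the tests directory"})

# A follow-up waits until the agent has fully finished.
send({"type": "follow_up", "message": "Now summarize what you changed"})

# Print streamed text as it arrives (Ctrl+C to stop).
for line in proc.stdout:
    event = json.loads(line)
    if event.get("type") == "message_update":
        delta = event["assistantMessageEvent"]
        if delta["type"] == "text_delta":
            print(delta["delta"], end="", flush=True)
```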
```json {"type": "follow_up", "message": "After you're done, also do this"} ``` Response: ```json {"type": "response", "command": "follow_up", "success": true} ``` See [set_follow_up_mode](#set_follow_up_mode) for controlling how follow-up messages are processed. #### abort Abort the current agent operation. ```json {"type": "abort"} ``` Response: ```json {"type": "response", "command": "abort", "success": true} ``` #### new_session Start a fresh session. Can be cancelled by a `session_before_switch` extension event handler. ```json {"type": "new_session"} ``` With optional parent session tracking: ```json {"type": "new_session", "parentSession": "/path/to/parent-session.jsonl"} ``` Response: ```json {"type": "response", "command": "new_session", "success": true, "data": {"cancelled": false}} ``` If an extension cancelled: ```json {"type": "response", "command": "new_session", "success": true, "data": {"cancelled": true}} ``` ### State #### get_state Get current session state. ```json {"type": "get_state"} ``` Response: ```json { "type": "response", "command": "get_state", "success": true, "data": { "model": {...}, "thinkingLevel": "medium", "isStreaming": false, "isCompacting": false, "steeringMode": "all", "followUpMode": "one-at-a-time", "sessionFile": "/path/to/session.jsonl", "sessionId": "abc123", "autoCompactionEnabled": true, "messageCount": 5, "pendingMessageCount": 0 } } ``` The `model` field is a full [Model](#model) object or `null`. #### get_messages Get all messages in the conversation. ```json {"type": "get_messages"} ``` Response: ```json { "type": "response", "command": "get_messages", "success": true, "data": {"messages": [...]} } ``` Messages are `AgentMessage` objects (see [Message Types](#message-types)). ### Model #### set_model Switch to a specific model. ```json {"type": "set_model", "provider": "anthropic", "modelId": "claude-sonnet-4-20250514"} ``` Response contains the full [Model](#model) object: ```json { "type": "response", "command": "set_model", "success": true, "data": {...} } ``` #### cycle_model Cycle to the next available model. Returns `null` data if only one model available. ```json {"type": "cycle_model"} ``` Response: ```json { "type": "response", "command": "cycle_model", "success": true, "data": { "model": {...}, "thinkingLevel": "medium", "isScoped": false } } ``` The `model` field is a full [Model](#model) object. #### get_available_models List all configured models. ```json {"type": "get_available_models"} ``` Response contains an array of full [Model](#model) objects: ```json { "type": "response", "command": "get_available_models", "success": true, "data": { "models": [...] } } ``` ### Thinking #### set_thinking_level Set the reasoning/thinking level for models that support it. ```json {"type": "set_thinking_level", "level": "high"} ``` Levels: `"off"`, `"minimal"`, `"low"`, `"medium"`, `"high"`, `"xhigh"` Note: `"xhigh"` is only supported by OpenAI codex-max models. Response: ```json {"type": "response", "command": "set_thinking_level", "success": true} ``` #### cycle_thinking_level Cycle through available thinking levels. Returns `null` data if model doesn't support thinking. ```json {"type": "cycle_thinking_level"} ``` Response: ```json { "type": "response", "command": "cycle_thinking_level", "success": true, "data": {"level": "high"} } ``` ### Queue Modes #### set_steering_mode Control how steering messages (from `steer`) are delivered. 
```json {"type": "set_steering_mode", "mode": "one-at-a-time"} ``` Modes: - `"all"`: Deliver all steering messages at the next interruption point - `"one-at-a-time"`: Deliver one steering message per interruption (default) Response: ```json {"type": "response", "command": "set_steering_mode", "success": true} ``` #### set_follow_up_mode Control how follow-up messages (from `follow_up`) are delivered. ```json {"type": "set_follow_up_mode", "mode": "one-at-a-time"} ``` Modes: - `"all"`: Deliver all follow-up messages when agent finishes - `"one-at-a-time"`: Deliver one follow-up message per agent completion (default) Response: ```json {"type": "response", "command": "set_follow_up_mode", "success": true} ``` ### Compaction #### compact Manually compact conversation context to reduce token usage. ```json {"type": "compact"} ``` With custom instructions: ```json {"type": "compact", "customInstructions": "Focus on code changes"} ``` Response: ```json { "type": "response", "command": "compact", "success": true, "data": { "summary": "Summary of conversation...", "firstKeptEntryId": "abc123", "tokensBefore": 150000, "details": {} } } ``` #### set_auto_compaction Enable or disable automatic compaction when context is nearly full. ```json {"type": "set_auto_compaction", "enabled": true} ``` Response: ```json {"type": "response", "command": "set_auto_compaction", "success": true} ``` ### Retry #### set_auto_retry Enable or disable automatic retry on transient errors (overloaded, rate limit, 5xx). ```json {"type": "set_auto_retry", "enabled": true} ``` Response: ```json {"type": "response", "command": "set_auto_retry", "success": true} ``` #### abort_retry Abort an in-progress retry (cancel the delay and stop retrying). ```json {"type": "abort_retry"} ``` Response: ```json {"type": "response", "command": "abort_retry", "success": true} ``` ### Bash #### bash Execute a shell command and add output to conversation context. ```json {"type": "bash", "command": "ls -la"} ``` Response: ```json { "type": "response", "command": "bash", "success": true, "data": { "output": "total 48\ndrwxr-xr-x ...", "exitCode": 0, "cancelled": false, "truncated": false } } ``` If output was truncated, includes `fullOutputPath`: ```json { "type": "response", "command": "bash", "success": true, "data": { "output": "truncated output...", "exitCode": 0, "cancelled": false, "truncated": true, "fullOutputPath": "/tmp/pi-bash-abc123.log" } } ``` **How bash results reach the LLM:** The `bash` command executes immediately and returns a `BashResult`. Internally, a `BashExecutionMessage` is created and stored in the agent's message state. This message does NOT emit an event. When the next `prompt` command is sent, all messages (including `BashExecutionMessage`) are transformed before being sent to the LLM. The `BashExecutionMessage` is converted to a `UserMessage` with this format: ``` Ran `ls -la` \`\`\` total 48 drwxr-xr-x ... \`\`\` ``` This means: 1. Bash output is included in the LLM context on the **next prompt**, not immediately 2. Multiple bash commands can be executed before a prompt; all outputs will be included 3. No event is emitted for the `BashExecutionMessage` itself #### abort_bash Abort a running bash command. ```json {"type": "abort_bash"} ``` Response: ```json {"type": "response", "command": "abort_bash", "success": true} ``` ### Session #### get_session_stats Get token usage and cost statistics. 
```json {"type": "get_session_stats"} ``` Response: ```json { "type": "response", "command": "get_session_stats", "success": true, "data": { "sessionFile": "/path/to/session.jsonl", "sessionId": "abc123", "userMessages": 5, "assistantMessages": 5, "toolCalls": 12, "toolResults": 12, "totalMessages": 22, "tokens": { "input": 50000, "output": 10000, "cacheRead": 40000, "cacheWrite": 5000, "total": 105000 }, "cost": 0.45 } } ``` #### export_html Export session to an HTML file. ```json {"type": "export_html"} ``` With custom path: ```json {"type": "export_html", "outputPath": "/tmp/session.html"} ``` Response: ```json { "type": "response", "command": "export_html", "success": true, "data": {"path": "/tmp/session.html"} } ``` #### switch_session Load a different session file. Can be cancelled by a `session_before_switch` extension event handler. ```json {"type": "switch_session", "sessionPath": "/path/to/session.jsonl"} ``` Response: ```json {"type": "response", "command": "switch_session", "success": true, "data": {"cancelled": false}} ``` If an extension cancelled the switch: ```json {"type": "response", "command": "switch_session", "success": true, "data": {"cancelled": true}} ``` #### fork Create a new fork from a previous user message. Can be cancelled by a `session_before_fork` extension event handler. Returns the text of the message being forked from. ```json {"type": "fork", "entryId": "abc123"} ``` Response: ```json { "type": "response", "command": "fork", "success": true, "data": {"text": "The original prompt text...", "cancelled": false} } ``` If an extension cancelled the fork: ```json { "type": "response", "command": "fork", "success": true, "data": {"text": "The original prompt text...", "cancelled": true} } ``` #### get_fork_messages Get user messages available for forking. ```json {"type": "get_fork_messages"} ``` Response: ```json { "type": "response", "command": "get_fork_messages", "success": true, "data": { "messages": [ {"entryId": "abc123", "text": "First prompt..."}, {"entryId": "def456", "text": "Second prompt..."} ] } } ``` #### get_last_assistant_text Get the text content of the last assistant message. ```json {"type": "get_last_assistant_text"} ``` Response: ```json { "type": "response", "command": "get_last_assistant_text", "success": true, "data": {"text": "The assistant's response..."} } ``` Returns `{"text": null}` if no assistant messages exist. ## Events Events are streamed to stdout as JSON lines during agent operation. Events do NOT include an `id` field (only responses do). ### Event Types | Event | Description | |-------|-------------| | `agent_start` | Agent begins processing | | `agent_end` | Agent completes (includes all generated messages) | | `turn_start` | New turn begins | | `turn_end` | Turn completes (includes assistant message and tool results) | | `message_start` | Message begins | | `message_update` | Streaming update (text/thinking/toolcall deltas) | | `message_end` | Message completes | | `tool_execution_start` | Tool begins execution | | `tool_execution_update` | Tool execution progress (streaming output) | | `tool_execution_end` | Tool completes | | `auto_compaction_start` | Auto-compaction begins | | `auto_compaction_end` | Auto-compaction completes | | `auto_retry_start` | Auto-retry begins (after transient error) | | `auto_retry_end` | Auto-retry completes (success or final failure) | | `extension_error` | Extension threw an error | ### agent_start Emitted when the agent begins processing a prompt. 
```json {"type": "agent_start"} ``` ### agent_end Emitted when the agent completes. Contains all messages generated during this run. ```json { "type": "agent_end", "messages": [...] } ``` ### turn_start / turn_end A turn consists of one assistant response plus any resulting tool calls and results. ```json {"type": "turn_start"} ``` ```json { "type": "turn_end", "message": {...}, "toolResults": [...] } ``` ### message_start / message_end Emitted when a message begins and completes. The `message` field contains an `AgentMessage`. ```json {"type": "message_start", "message": {...}} {"type": "message_end", "message": {...}} ``` ### message_update (Streaming) Emitted during streaming of assistant messages. Contains both the partial message and a streaming delta event. ```json { "type": "message_update", "message": {...}, "assistantMessageEvent": { "type": "text_delta", "contentIndex": 0, "delta": "Hello ", "partial": {...} } } ``` The `assistantMessageEvent` field contains one of these delta types: | Type | Description | |------|-------------| | `start` | Message generation started | | `text_start` | Text content block started | | `text_delta` | Text content chunk | | `text_end` | Text content block ended | | `thinking_start` | Thinking block started | | `thinking_delta` | Thinking content chunk | | `thinking_end` | Thinking block ended | | `toolcall_start` | Tool call started | | `toolcall_delta` | Tool call arguments chunk | | `toolcall_end` | Tool call ended (includes full `toolCall` object) | | `done` | Message complete (reason: `"stop"`, `"length"`, `"toolUse"`) | | `error` | Error occurred (reason: `"aborted"`, `"error"`) | Example streaming a text response: ```json {"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_start","contentIndex":0,"partial":{...}}} {"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":"Hello","partial":{...}}} {"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":" world","partial":{...}}} {"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_end","contentIndex":0,"content":"Hello world","partial":{...}}} ``` ### tool_execution_start / tool_execution_update / tool_execution_end Emitted when a tool begins, streams progress, and completes execution. ```json { "type": "tool_execution_start", "toolCallId": "call_abc123", "toolName": "bash", "args": {"command": "ls -la"} } ``` During execution, `tool_execution_update` events stream partial results (e.g., bash output as it arrives): ```json { "type": "tool_execution_update", "toolCallId": "call_abc123", "toolName": "bash", "args": {"command": "ls -la"}, "partialResult": { "content": [{"type": "text", "text": "partial output so far..."}], "details": {"truncation": null, "fullOutputPath": null} } } ``` When complete: ```json { "type": "tool_execution_end", "toolCallId": "call_abc123", "toolName": "bash", "result": { "content": [{"type": "text", "text": "total 48\n..."}], "details": {...} }, "isError": false } ``` Use `toolCallId` to correlate events. The `partialResult` in `tool_execution_update` contains the accumulated output so far (not just the delta), allowing clients to simply replace their display on each update. ### auto_compaction_start / auto_compaction_end Emitted when automatic compaction runs (when context is nearly full). 
```json {"type": "auto_compaction_start", "reason": "threshold"} ``` The `reason` field is `"threshold"` (context getting large) or `"overflow"` (context exceeded limit). ```json { "type": "auto_compaction_end", "result": { "summary": "Summary of conversation...", "firstKeptEntryId": "abc123", "tokensBefore": 150000, "details": {} }, "aborted": false, "willRetry": false } ``` If `reason` was `"overflow"` and compaction succeeds, `willRetry` is `true` and the agent will automatically retry the prompt. If compaction was aborted, `result` is `null` and `aborted` is `true`. If compaction failed (e.g., API quota exceeded), `result` is `null`, `aborted` is `false`, and `errorMessage` contains the error description. ### auto_retry_start / auto_retry_end Emitted when automatic retry is triggered after a transient error (overloaded, rate limit, 5xx). ```json { "type": "auto_retry_start", "attempt": 1, "maxAttempts": 3, "delayMs": 2000, "errorMessage": "529 {\"type\":\"error\",\"error\":{\"type\":\"overloaded_error\",\"message\":\"Overloaded\"}}" } ``` ```json { "type": "auto_retry_end", "success": true, "attempt": 2 } ``` On final failure (max retries exceeded): ```json { "type": "auto_retry_end", "success": false, "attempt": 3, "finalError": "529 overloaded_error: Overloaded" } ``` ### extension_error Emitted when an extension throws an error. ```json { "type": "extension_error", "extensionPath": "/path/to/extension.ts", "event": "tool_call", "error": "Error message..." } ``` ## Error Handling Failed commands return a response with `success: false`: ```json { "type": "response", "command": "set_model", "success": false, "error": "Model not found: invalid/model" } ``` Parse errors: ```json { "type": "response", "command": "parse", "success": false, "error": "Failed to parse command: Unexpected token..." } ``` ## Types Source files: - [`packages/ai/src/types.ts`](../../ai/src/types.ts) - `Model`, `UserMessage`, `AssistantMessage`, `ToolResultMessage` - [`packages/agent/src/types.ts`](../../agent/src/types.ts) - `AgentMessage`, `AgentEvent` - [`src/core/messages.ts`](../src/core/messages.ts) - `BashExecutionMessage` - [`src/modes/rpc/rpc-types.ts`](../src/modes/rpc/rpc-types.ts) - RPC command/response types ### Model ```json { "id": "claude-sonnet-4-20250514", "name": "Claude Sonnet 4", "api": "anthropic-messages", "provider": "anthropic", "baseUrl": "https://api.anthropic.com", "reasoning": true, "input": ["text", "image"], "contextWindow": 200000, "maxTokens": 16384, "cost": { "input": 3.0, "output": 15.0, "cacheRead": 0.3, "cacheWrite": 3.75 } } ``` ### UserMessage ```json { "role": "user", "content": "Hello!", "timestamp": 1733234567890, "attachments": [] } ``` The `content` field can be a string or an array of `TextContent`/`ImageContent` blocks. ### AssistantMessage ```json { "role": "assistant", "content": [ {"type": "text", "text": "Hello! 
## Types

Source files:

- [`packages/ai/src/types.ts`](../../ai/src/types.ts) - `Model`, `UserMessage`, `AssistantMessage`, `ToolResultMessage`
- [`packages/agent/src/types.ts`](../../agent/src/types.ts) - `AgentMessage`, `AgentEvent`
- [`src/core/messages.ts`](../src/core/messages.ts) - `BashExecutionMessage`
- [`src/modes/rpc/rpc-types.ts`](../src/modes/rpc/rpc-types.ts) - RPC command/response types

### Model

```json
{
  "id": "claude-sonnet-4-20250514",
  "name": "Claude Sonnet 4",
  "api": "anthropic-messages",
  "provider": "anthropic",
  "baseUrl": "https://api.anthropic.com",
  "reasoning": true,
  "input": ["text", "image"],
  "contextWindow": 200000,
  "maxTokens": 16384,
  "cost": {
    "input": 3.0,
    "output": 15.0,
    "cacheRead": 0.3,
    "cacheWrite": 3.75
  }
}
```

### UserMessage

```json
{
  "role": "user",
  "content": "Hello!",
  "timestamp": 1733234567890,
  "attachments": []
}
```

The `content` field can be a string or an array of `TextContent`/`ImageContent` blocks.

### AssistantMessage

```json
{
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help?"},
    {"type": "thinking", "thinking": "User is greeting me..."},
    {"type": "toolCall", "id": "call_123", "name": "bash", "arguments": {"command": "ls"}}
  ],
  "api": "anthropic-messages",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input": 100,
    "output": 50,
    "cacheRead": 0,
    "cacheWrite": 0,
    "cost": {"input": 0.0003, "output": 0.00075, "cacheRead": 0, "cacheWrite": 0, "total": 0.00105}
  },
  "stopReason": "stop",
  "timestamp": 1733234567890
}
```

Stop reasons: `"stop"`, `"length"`, `"toolUse"`, `"error"`, `"aborted"`

### ToolResultMessage

```json
{
  "role": "toolResult",
  "toolCallId": "call_123",
  "toolName": "bash",
  "content": [{"type": "text", "text": "total 48\ndrwxr-xr-x ..."}],
  "isError": false,
  "timestamp": 1733234567890
}
```

### BashExecutionMessage

Created by the `bash` RPC command (not by LLM tool calls):

```json
{
  "role": "bashExecution",
  "command": "ls -la",
  "output": "total 48\ndrwxr-xr-x ...",
  "exitCode": 0,
  "cancelled": false,
  "truncated": false,
  "fullOutputPath": null,
  "timestamp": 1733234567890
}
```

### Attachment

```json
{
  "id": "img1",
  "type": "image",
  "fileName": "photo.jpg",
  "mimeType": "image/jpeg",
  "size": 102400,
  "content": "base64-encoded-data...",
  "extractedText": null,
  "preview": null
}
```

## Example: Basic Client (Python)

```python
import subprocess
import json

proc = subprocess.Popen(
    ["pi", "--mode", "rpc", "--no-session"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True
)

def send(cmd):
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()

def read_events():
    for line in proc.stdout:
        yield json.loads(line)

# Send prompt
send({"type": "prompt", "message": "Hello!"})

# Process events
for event in read_events():
    if event.get("type") == "message_update":
        delta = event.get("assistantMessageEvent", {})
        if delta.get("type") == "text_delta":
            print(delta["delta"], end="", flush=True)
    if event.get("type") == "agent_end":
        print()
        break
```

## Example: Interactive Client (Node.js)

See [`test/rpc-example.ts`](../test/rpc-example.ts) for a complete interactive example, or [`src/modes/rpc/rpc-client.ts`](../src/modes/rpc/rpc-client.ts) for a typed client implementation.

```javascript
const { spawn } = require("child_process");
const readline = require("readline");

const agent = spawn("pi", ["--mode", "rpc", "--no-session"]);

readline.createInterface({ input: agent.stdout }).on("line", (line) => {
  const event = JSON.parse(line);
  if (event.type === "message_update") {
    const { assistantMessageEvent } = event;
    if (assistantMessageEvent.type === "text_delta") {
      process.stdout.write(assistantMessageEvent.delta);
    }
  }
});

// Send prompt
agent.stdin.write(JSON.stringify({ type: "prompt", message: "Hello" }) + "\n");

// Abort on Ctrl+C
process.on("SIGINT", () => {
  agent.stdin.write(JSON.stringify({ type: "abort" }) + "\n");
});
```
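## Example: Synchronous Prompt Helper (Python)

For scripted use, a blocking helper that sends a prompt, waits for `agent_end`, and then fetches the final text via `get_last_assistant_text` can be handy. A minimal sketch combining the commands documented above (assumes `pi` is on your PATH; the prompt is a placeholder):

```python
import json
import subprocess

proc = subprocess.Popen(
    ["pi", "--mode", "rpc", "--no-session"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)

def send(cmd):
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()

def ask(message):
    """Send a prompt, wait for the run to finish, and return the final text."""
    send({"type": "prompt", "message": message})
    for line in proc.stdout:
        if json.loads(line).get("type") == "agent_end":
            break
    # Fetch the last assistant message instead of reassembling text deltas.
    send({"id": "last", "type": "get_last_assistant_text"})
    for line in proc.stdout:
        msg = json.loads(line)
        if msg.get("type") == "response" and msg.get("id") == "last":
            return msg["data"]["text"]

print(ask("Summarize the README in one sentence."))
```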