
RPC Mode

RPC mode enables headless operation of the coding agent via a JSON protocol over stdin/stdout. This is useful for embedding the agent in other applications, IDEs, or custom UIs.

Note for Node.js/TypeScript users: If you're building a Node.js application, consider using AgentSession directly from @mariozechner/pi-coding-agent instead of spawning a subprocess. See src/core/agent-session.ts for the API. For a subprocess-based TypeScript client, see src/modes/rpc/rpc-client.ts.

Starting RPC Mode

pi --mode rpc [options]

Common options:

  • --provider <name>: Set the LLM provider (anthropic, openai, google, etc.)
  • --model <id>: Set the model ID
  • --no-session: Disable session persistence
  • --session-dir <path>: Custom session storage directory

Protocol Overview

  • Commands: JSON objects sent to stdin, one per line
  • Responses: JSON objects with type: "response" indicating command success/failure
  • Events: Agent events streamed to stdout as JSON lines

All commands support an optional id field for request/response correlation. If provided, the corresponding response will include the same id.
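Correlating responses with in-flight requests by id might look like this (a minimal client-side sketch; the pending-request bookkeeping and the RpcCorrelator name are the client's own, not part of the protocol):

```python
import json

class RpcCorrelator:
    """Tracks in-flight commands by id and matches responses back to them."""

    def __init__(self):
        self._pending = {}  # id -> original command
        self._next_id = 0

    def make_command(self, cmd: dict) -> str:
        """Assign an id, remember the command, and serialize it as one JSON line."""
        cmd = {**cmd, "id": f"req-{self._next_id}"}
        self._next_id += 1
        self._pending[cmd["id"]] = cmd
        return json.dumps(cmd) + "\n"

    def handle_line(self, line: str):
        """Return (command, response) for a matched response, or None for events."""
        msg = json.loads(line)
        if msg.get("type") == "response" and msg.get("id") in self._pending:
            return self._pending.pop(msg["id"]), msg
        return None  # an event, or a response to a command sent without an id
```

Each line written to stdin goes through make_command; each line read from stdout goes through handle_line, which leaves events untouched.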

Commands

Prompting

prompt

Send a user prompt to the agent. Returns immediately; events stream asynchronously.

{"id": "req-1", "type": "prompt", "message": "Hello, world!"}

With attachments:

{"type": "prompt", "message": "What's in this image?", "attachments": [...]}

Response:

{"id": "req-1", "type": "response", "command": "prompt", "success": true}

The attachments field is optional. See Attachments for the schema.

queue_message

Queue a message to be injected at the next agent turn. Queued messages are added to the conversation without triggering a new prompt. Useful for injecting context mid-conversation.

{"type": "queue_message", "message": "Additional context"}

Response:

{"type": "response", "command": "queue_message", "success": true}

See set_queue_mode for controlling how queued messages are processed.

abort

Abort the current agent operation.

{"type": "abort"}

Response:

{"type": "response", "command": "abort", "success": true}

reset

Clear context and start a fresh session.

{"type": "reset"}

Response:

{"type": "response", "command": "reset", "success": true}

State

get_state

Get current session state.

{"type": "get_state"}

Response:

{
  "type": "response",
  "command": "get_state",
  "success": true,
  "data": {
    "model": {...},
    "thinkingLevel": "medium",
    "isStreaming": false,
    "isCompacting": false,
    "queueMode": "all",
    "sessionFile": "/path/to/session.jsonl",
    "sessionId": "abc123",
    "autoCompactionEnabled": true,
    "messageCount": 5,
    "queuedMessageCount": 0
  }
}

The model field is a full Model object or null.

get_messages

Get all messages in the conversation.

{"type": "get_messages"}

Response:

{
  "type": "response",
  "command": "get_messages",
  "success": true,
  "data": {"messages": [...]}
}

Messages are AppMessage objects (see Message Types).

Model

set_model

Switch to a specific model.

{"type": "set_model", "provider": "anthropic", "modelId": "claude-sonnet-4-20250514"}

Response contains the full Model object:

{
  "type": "response",
  "command": "set_model",
  "success": true,
  "data": {...}
}

cycle_model

Cycle to the next available model. Returns null data if only one model is available.

{"type": "cycle_model"}

Response:

{
  "type": "response",
  "command": "cycle_model",
  "success": true,
  "data": {
    "model": {...},
    "thinkingLevel": "medium",
    "isScoped": false
  }
}

The model field is a full Model object.

get_available_models

List all configured models.

{"type": "get_available_models"}

Response contains an array of full Model objects:

{
  "type": "response",
  "command": "get_available_models",
  "success": true,
  "data": {
    "models": [...]
  }
}

Thinking

set_thinking_level

Set the reasoning/thinking level for models that support it.

{"type": "set_thinking_level", "level": "high"}

Levels: "off", "minimal", "low", "medium", "high", "xhigh"

Note: "xhigh" is only supported by OpenAI codex-max models.

Response:

{"type": "response", "command": "set_thinking_level", "success": true}

cycle_thinking_level

Cycle through available thinking levels. Returns null data if the model doesn't support thinking.

{"type": "cycle_thinking_level"}

Response:

{
  "type": "response",
  "command": "cycle_thinking_level",
  "success": true,
  "data": {"level": "high"}
}

Queue Mode

set_queue_mode

Control how queued messages (from queue_message) are injected into the conversation.

{"type": "set_queue_mode", "mode": "one-at-a-time"}

Modes:

  • "all": Inject all queued messages at the next turn
  • "one-at-a-time": Inject one queued message per turn (default)

Response:

{"type": "response", "command": "set_queue_mode", "success": true}

Compaction

compact

Manually compact conversation context to reduce token usage.

{"type": "compact"}

With custom instructions:

{"type": "compact", "customInstructions": "Focus on code changes"}

Response:

{
  "type": "response",
  "command": "compact",
  "success": true,
  "data": {
    "tokensBefore": 150000,
    "summary": "Summary of conversation..."
  }
}

set_auto_compaction

Enable or disable automatic compaction when context is nearly full.

{"type": "set_auto_compaction", "enabled": true}

Response:

{"type": "response", "command": "set_auto_compaction", "success": true}

Bash

bash

Execute a shell command and add output to conversation context.

{"type": "bash", "command": "ls -la"}

Response:

{
  "type": "response",
  "command": "bash",
  "success": true,
  "data": {
    "output": "total 48\ndrwxr-xr-x ...",
    "exitCode": 0,
    "cancelled": false,
    "truncated": false
  }
}

If the output was truncated, the response includes fullOutputPath:

{
  "type": "response",
  "command": "bash",
  "success": true,
  "data": {
    "output": "truncated output...",
    "exitCode": 0,
    "cancelled": false,
    "truncated": true,
    "fullOutputPath": "/tmp/pi-bash-abc123.log"
  }
}

How bash results reach the LLM:

The bash command executes immediately and returns a BashResult. Internally, a BashExecutionMessage is created and stored in the agent's message state. This message does NOT emit an event.

When the next prompt command is sent, all messages (including BashExecutionMessage) are transformed before being sent to the LLM. The BashExecutionMessage is converted to a UserMessage with this format:

Ran `ls -la`
```
total 48
drwxr-xr-x ...
```

This means:

  1. Bash output is included in the LLM context on the next prompt, not immediately
  2. Multiple bash commands can be executed before a prompt; all outputs will be included
  3. No event is emitted for the BashExecutionMessage itself
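The transformation can be sketched roughly like this (the formatting helper is internal to the agent; this only reproduces the documented shape):

```python
def bash_execution_to_user_text(msg: dict) -> str:
    """Render a BashExecutionMessage as the user-message text sent to the LLM."""
    return f"Ran `{msg['command']}`\n```\n{msg['output']}\n```"
```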

abort_bash

Abort a running bash command.

{"type": "abort_bash"}

Response:

{"type": "response", "command": "abort_bash", "success": true}

Session

get_session_stats

Get token usage and cost statistics.

{"type": "get_session_stats"}

Response:

{
  "type": "response",
  "command": "get_session_stats",
  "success": true,
  "data": {
    "sessionFile": "/path/to/session.jsonl",
    "sessionId": "abc123",
    "userMessages": 5,
    "assistantMessages": 5,
    "toolCalls": 12,
    "toolResults": 12,
    "totalMessages": 22,
    "tokens": {
      "input": 50000,
      "output": 10000,
      "cacheRead": 40000,
      "cacheWrite": 5000,
      "total": 105000
    },
    "cost": 0.45
  }
}

export_html

Export session to an HTML file.

{"type": "export_html"}

With custom path:

{"type": "export_html", "outputPath": "/tmp/session.html"}

Response:

{
  "type": "response",
  "command": "export_html",
  "success": true,
  "data": {"path": "/tmp/session.html"}
}

switch_session

Load a different session file.

{"type": "switch_session", "sessionPath": "/path/to/session.jsonl"}

Response:

{"type": "response", "command": "switch_session", "success": true}

branch

Create a new branch from a previous user message. Returns the text of the message being branched from.

{"type": "branch", "entryIndex": 2}

Response:

{
  "type": "response",
  "command": "branch",
  "success": true,
  "data": {"text": "The original prompt text..."}
}

get_branch_messages

Get user messages available for branching.

{"type": "get_branch_messages"}

Response:

{
  "type": "response",
  "command": "get_branch_messages",
  "success": true,
  "data": {
    "messages": [
      {"entryIndex": 0, "text": "First prompt..."},
      {"entryIndex": 2, "text": "Second prompt..."}
    ]
  }
}

get_last_assistant_text

Get the text content of the last assistant message.

{"type": "get_last_assistant_text"}

Response:

{
  "type": "response",
  "command": "get_last_assistant_text",
  "success": true,
  "data": {"text": "The assistant's response..."}
}

Returns {"text": null} if no assistant messages exist.

Events

Events are streamed to stdout as JSON lines during agent operation. Events do NOT include an id field (only responses do).

Event Types

  • agent_start: Agent begins processing
  • agent_end: Agent completes (includes all generated messages)
  • turn_start: New turn begins
  • turn_end: Turn completes (includes assistant message and tool results)
  • message_start: Message begins
  • message_update: Streaming update (text/thinking/toolcall deltas)
  • message_end: Message completes
  • tool_execution_start: Tool begins execution
  • tool_execution_end: Tool completes
  • auto_compaction_start: Auto-compaction begins
  • auto_compaction_end: Auto-compaction completes

agent_start

Emitted when the agent begins processing a prompt.

{"type": "agent_start"}

agent_end

Emitted when the agent completes. Contains all messages generated during this run.

{
  "type": "agent_end",
  "messages": [...]
}

turn_start / turn_end

A turn consists of one assistant response plus any resulting tool calls and results.

{"type": "turn_start"}
{
  "type": "turn_end",
  "message": {...},
  "toolResults": [...]
}

message_start / message_end

Emitted when a message begins and completes. The message field contains an AppMessage.

{"type": "message_start", "message": {...}}
{"type": "message_end", "message": {...}}

message_update (Streaming)

Emitted during streaming of assistant messages. Contains both the partial message and a streaming delta event.

{
  "type": "message_update",
  "message": {...},
  "assistantMessageEvent": {
    "type": "text_delta",
    "contentIndex": 0,
    "delta": "Hello ",
    "partial": {...}
  }
}

The assistantMessageEvent field contains one of these delta types:

  • start: Message generation started
  • text_start: Text content block started
  • text_delta: Text content chunk
  • text_end: Text content block ended
  • thinking_start: Thinking block started
  • thinking_delta: Thinking content chunk
  • thinking_end: Thinking block ended
  • toolcall_start: Tool call started
  • toolcall_delta: Tool call arguments chunk
  • toolcall_end: Tool call ended (includes the full toolCall object)
  • done: Message complete (reason: "stop", "length", "toolUse")
  • error: Error occurred (reason: "aborted", "error")

Example streaming a text response:

{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_start","contentIndex":0,"partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":"Hello","partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_delta","contentIndex":0,"delta":" world","partial":{...}}}
{"type":"message_update","message":{...},"assistantMessageEvent":{"type":"text_end","contentIndex":0,"content":"Hello world","partial":{...}}}

tool_execution_start / tool_execution_end

Emitted when a tool begins and completes execution.

{
  "type": "tool_execution_start",
  "toolCallId": "call_abc123",
  "toolName": "bash",
  "args": {"command": "ls -la"}
}
{
  "type": "tool_execution_end",
  "toolCallId": "call_abc123",
  "toolName": "bash",
  "result": {
    "content": [{"type": "text", "text": "total 48\n..."}],
    "details": {...}
  },
  "isError": false
}

Use toolCallId to correlate tool_execution_start with tool_execution_end.
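Pairing the two events by toolCallId might look like this (a client-side sketch):

```python
def pair_tool_executions(events: list) -> list:
    """Match each tool_execution_start with its tool_execution_end via toolCallId."""
    starts = {}
    pairs = []
    for ev in events:
        if ev.get("type") == "tool_execution_start":
            starts[ev["toolCallId"]] = ev
        elif ev.get("type") == "tool_execution_end":
            start = starts.pop(ev["toolCallId"], None)
            if start is not None:
                pairs.append((start, ev))
    return pairs
```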

auto_compaction_start / auto_compaction_end

Emitted when automatic compaction runs (when context is nearly full).

{"type": "auto_compaction_start"}
{
  "type": "auto_compaction_end",
  "result": {
    "tokensBefore": 150000,
    "summary": "Summary of conversation..."
  },
  "aborted": false
}

If compaction was aborted, result is null and aborted is true.

Error Handling

Failed commands return a response with success: false:

{
  "type": "response",
  "command": "set_model",
  "success": false,
  "error": "Model not found: invalid/model"
}

Parse errors:

{
  "type": "response",
  "command": "parse",
  "success": false,
  "error": "Failed to parse command: Unexpected token..."
}

Types

Model

{
  "id": "claude-sonnet-4-20250514",
  "name": "Claude Sonnet 4",
  "api": "anthropic-messages",
  "provider": "anthropic",
  "baseUrl": "https://api.anthropic.com",
  "reasoning": true,
  "input": ["text", "image"],
  "contextWindow": 200000,
  "maxTokens": 16384,
  "cost": {
    "input": 3.0,
    "output": 15.0,
    "cacheRead": 0.3,
    "cacheWrite": 3.75
  }
}
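The cost fields are USD per million tokens, so a message's cost follows directly from its usage counts. A sketch (consistent with the usage block in the AssistantMessage example below: 100 input tokens at $3.00/M plus 50 output tokens at $15.00/M gives $0.00105):

```python
def compute_cost(usage: dict, pricing: dict) -> float:
    """Multiply token counts by per-million-token prices from Model.cost."""
    return sum(usage[k] * pricing[k] / 1_000_000
               for k in ("input", "output", "cacheRead", "cacheWrite"))
```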

UserMessage

{
  "role": "user",
  "content": "Hello!",
  "timestamp": 1733234567890,
  "attachments": []
}

The content field can be a string or an array of TextContent/ImageContent blocks.

AssistantMessage

{
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help?"},
    {"type": "thinking", "thinking": "User is greeting me..."},
    {"type": "toolCall", "id": "call_123", "name": "bash", "arguments": {"command": "ls"}}
  ],
  "api": "anthropic-messages",
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "usage": {
    "input": 100,
    "output": 50,
    "cacheRead": 0,
    "cacheWrite": 0,
    "cost": {"input": 0.0003, "output": 0.00075, "cacheRead": 0, "cacheWrite": 0, "total": 0.00105}
  },
  "stopReason": "stop",
  "timestamp": 1733234567890
}

Stop reasons: "stop", "length", "toolUse", "error", "aborted"

ToolResultMessage

{
  "role": "toolResult",
  "toolCallId": "call_123",
  "toolName": "bash",
  "content": [{"type": "text", "text": "total 48\ndrwxr-xr-x ..."}],
  "isError": false,
  "timestamp": 1733234567890
}

BashExecutionMessage

Created by the bash RPC command (not by LLM tool calls):

{
  "role": "bashExecution",
  "command": "ls -la",
  "output": "total 48\ndrwxr-xr-x ...",
  "exitCode": 0,
  "cancelled": false,
  "truncated": false,
  "fullOutputPath": null,
  "timestamp": 1733234567890
}

Attachment

{
  "id": "img1",
  "type": "image",
  "fileName": "photo.jpg",
  "mimeType": "image/jpeg",
  "size": 102400,
  "content": "base64-encoded-data...",
  "extractedText": null,
  "preview": null
}

Example: Basic Client (Python)

import subprocess
import json

proc = subprocess.Popen(
    ["pi", "--mode", "rpc", "--no-session"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True
)

def send(cmd):
    proc.stdin.write(json.dumps(cmd) + "\n")
    proc.stdin.flush()

def read_events():
    for line in proc.stdout:
        yield json.loads(line)

# Send prompt
send({"type": "prompt", "message": "Hello!"})

# Process events
for event in read_events():
    if event.get("type") == "message_update":
        delta = event.get("assistantMessageEvent", {})
        if delta.get("type") == "text_delta":
            print(delta["delta"], end="", flush=True)
    
    if event.get("type") == "agent_end":
        print()
        break

Example: Interactive Client (Node.js)

See test/rpc-example.ts for a complete interactive example, or src/modes/rpc/rpc-client.ts for a typed client implementation.

const { spawn } = require("child_process");
const readline = require("readline");

const agent = spawn("pi", ["--mode", "rpc", "--no-session"]);

readline.createInterface({ input: agent.stdout }).on("line", (line) => {
    const event = JSON.parse(line);
    
    if (event.type === "message_update") {
        const { assistantMessageEvent } = event;
        if (assistantMessageEvent.type === "text_delta") {
            process.stdout.write(assistantMessageEvent.delta);
        }
    }
});

// Send prompt
agent.stdin.write(JSON.stringify({ type: "prompt", message: "Hello" }) + "\n");

// Abort on Ctrl+C
process.on("SIGINT", () => {
    agent.stdin.write(JSON.stringify({ type: "abort" }) + "\n");
});