# OpenCode Research Research notes on OpenCode's configuration, credential discovery, and runtime behavior based on agent-jj implementation. ## Overview - **Provider**: Multi-provider (OpenAI, Anthropic, others) - **Execution Method**: Embedded server via SDK, or CLI binary - **Session Persistence**: Session ID (string) - **SDK**: `@opencode-ai/sdk` (server + client) - **Binary Location**: `~/.opencode/bin/opencode` - **Written in**: Go (with Bubble Tea TUI) ## CLI Usage (Alternative to SDK) OpenCode can be used as a standalone binary instead of embedding the SDK: ### Interactive TUI Mode ```bash opencode # Start TUI in current directory opencode /path/to/project # Start in specific directory opencode -c # Continue last session opencode -s SESSION_ID # Continue specific session ``` ### Non-Interactive Mode (`opencode run`) ```bash opencode run "your prompt here" opencode run --format json "prompt" # Raw JSON events output opencode run -m anthropic/claude-sonnet-4-20250514 "prompt" opencode run --agent plan "analyze this code" opencode run -c "follow up question" # Continue last session opencode run -s SESSION_ID "prompt" # Continue specific session opencode run -f file1.ts -f file2.ts "review these files" ``` ### Key CLI Flags | Flag | Description | |------|-------------| | `--format json` | Output raw JSON events (for parsing) | | `-m, --model PROVIDER/MODEL` | Model in format `provider/model` | | `--agent AGENT` | Agent to use (`build`, `plan`) | | `-c, --continue` | Continue last session | | `-s, --session ID` | Continue specific session | | `-f, --file FILE` | Attach file(s) to message | | `--attach URL` | Attach to running server | | `--port PORT` | Local server port | | `--variant VARIANT` | Reasoning effort (e.g., `high`, `max`) | ### Headless Server Mode ```bash opencode serve # Start headless server opencode serve --port 4096 # Specific port opencode attach http://localhost:4096 # Attach to running server ``` ### Other Commands ```bash opencode models # List available models opencode models anthropic # List models for provider opencode auth # Manage credentials opencode session # Manage sessions opencode export SESSION_ID # Export session as JSON opencode stats # Token usage statistics ``` Sources: [OpenCode GitHub](https://github.com/opencode-ai/opencode), [OpenCode Docs](https://opencode.ai/docs/cli/) ## Architecture OpenCode runs as an embedded HTTP server per workspace/change: ``` ┌─────────────────────┐ │ agent-jj backend │ │ │ │ ┌───────────────┐ │ │ │ OpenCode │ │ │ │ Server │◄─┼── HTTP API │ │ (per change) │ │ │ └───────────────┘ │ └─────────────────────┘ ``` - One server per `changeId` (workspace+repo+change combination) - Multiple sessions can share a server - Server runs on dynamic port (4200-4300 range) ## Credential Discovery ### Priority Order 1. Environment variables: `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY` 2. Environment variables: `OPENAI_API_KEY`, `CODEX_API_KEY` 3. Claude Code config files 4. Codex config files 5. OpenCode config files ### Config File Location | Path | Description | |------|-------------| | `~/.local/share/opencode/auth.json` | Primary auth config | ### Auth File Structure ```json { "anthropic": { "type": "api", "key": "sk-ant-..." }, "openai": { "type": "api", "key": "sk-..." }, "custom-provider": { "type": "oauth", "access": "token...", "refresh": "refresh-token...", "expires": 1704067200000 } } ``` ### Provider Config Types ```typescript interface OpenCodeProviderConfig { type: "api" | "oauth"; key?: string; // For API type access?: string; // For OAuth type refresh?: string; // For OAuth type expires?: number; // Unix timestamp (ms) } ``` OAuth tokens are validated for expiry before use. ## Server Management ### Starting a Server ```typescript import { createOpencodeServer } from "@opencode-ai/sdk/server"; import { createOpencodeClient } from "@opencode-ai/sdk"; const server = await createOpencodeServer({ hostname: "127.0.0.1", port: 4200, config: { logLevel: "DEBUG" } }); const client = createOpencodeClient({ baseUrl: `http://127.0.0.1:${port}` }); ``` ### Server Configuration ```typescript // From config.json { "opencode": { "host": "127.0.0.1", // Bind address "advertisedHost": "127.0.0.1" // External address (for tunnels) } } ``` ### Port Selection Uses `get-port` package to find available port in range 4200-4300. ## Client API ### Session Management ```typescript // Create session const response = await client.session.create({}); const sessionId = response.data.id; // Get session info const session = await client.session.get({ path: { id: sessionId } }); // Get session messages const messages = await client.session.messages({ path: { id: sessionId } }); // Get session todos const todos = await client.session.todo({ path: { id: sessionId } }); ``` ### Sending Prompts #### Synchronous ```typescript const response = await client.session.prompt({ path: { id: sessionId }, body: { model: { providerID: "openai", modelID: "gpt-4o" }, agent: "build", parts: [{ type: "text", text: "prompt text" }] } }); ``` #### Asynchronous (Streaming) ```typescript // Start prompt asynchronously await client.session.promptAsync({ path: { id: sessionId }, body: { model: { providerID: "openai", modelID: "gpt-4o" }, agent: "build", parts: [{ type: "text", text: "prompt text" }] } }); // Subscribe to events const eventStream = await client.event.subscribe({}); for await (const event of eventStream.stream) { // Process events } ``` ## Event Types | Event Type | Description | |------------|-------------| | `message.part.updated` | Message part streamed/updated | | `session.status` | Session status changed | | `session.idle` | Session finished processing | | `session.error` | Session error occurred | | `question.asked` | AI asking user question | | `permission.asked` | AI requesting permission | ### Event Structure ```typescript interface SDKEvent { type: string; properties: { part?: SDKPart & { sessionID?: string }; delta?: string; // Text delta for streaming status?: { type?: string }; sessionID?: string; error?: { data?: { message?: string } }; id?: string; questions?: QuestionInfo[]; permission?: string; patterns?: string[]; metadata?: Record; always?: string[]; tool?: { messageID?: string; callID?: string }; }; } ``` ## Message Parts OpenCode has rich message part types: | Type | Description | |------|-------------| | `text` | Plain text content | | `reasoning` | Model reasoning (chain-of-thought) | | `tool` | Tool invocation with status | | `file` | File reference | | `step-start` | Step boundary start | | `step-finish` | Step boundary end with reason | | `subtask` | Delegated subtask | ### Part Structure ```typescript interface MessagePart { type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other"; id: string; content: string; // Tool-specific toolName?: string; toolStatus?: "pending" | "running" | "completed" | "error"; toolInput?: Record; toolOutput?: string; // File-specific filename?: string; mimeType?: string; // Step-specific stepReason?: string; // Subtask-specific subtaskAgent?: string; subtaskDescription?: string; } ``` ## Questions and Permissions ### Question Request ```typescript interface QuestionRequest { id: string; sessionID: string; questions: Array<{ header?: string; question: string; options: Array<{ label: string; description?: string }>; multiSelect?: boolean; }>; tool?: { messageID: string; callID: string }; } ``` ### Responding to Questions ```typescript // V1 client for question/permission APIs const clientV1 = createOpencodeClientV1({ baseUrl: `http://127.0.0.1:${port}` }); // Reply with answers await clientV1.question.reply({ requestID: requestId, answers: [["selected option"]] // Array of selected labels per question }); // Reject question await clientV1.question.reject({ requestID: requestId }); ``` ### Permission Request ```typescript interface PermissionRequest { id: string; sessionID: string; permission: string; // Permission type (e.g., "file:write") patterns: string[]; // Affected paths/patterns metadata: Record; always: string[]; // Options for "always allow" tool?: { messageID: string; callID: string }; } ``` ### Responding to Permissions ```typescript await clientV1.permission.reply({ requestID: requestId, reply: "once" | "always" | "reject" }); ``` ## Provider/Model Discovery ```typescript // Get available providers and models const providerResponse = await client.provider.list({}); const agentResponse = await client.app.agents({}); interface ProviderInfo { id: string; name: string; models: Array<{ id: string; name: string; reasoning: boolean; toolCall: boolean; }>; } interface AgentInfo { id: string; name: string; primary: boolean; // "build" and "plan" are primary } ``` ### Internal Agents (Hidden from UI) - `compaction` - `title` - `summary` ## Token Usage ```typescript interface TokenUsage { input: number; output: number; reasoning?: number; cache?: { read: number; write: number; }; } ``` Available in message `info` field for assistant messages. ## Agent Modes vs Permission Modes OpenCode properly separates these concepts: ### Agent Modes Agents are first-class concepts with their own system prompts and behavior: | Agent ID | Description | |----------|-------------| | `build` | Default execution agent | | `plan` | Planning/analysis agent | | Custom | User-defined agents in config | ```typescript // Sending a prompt with specific agent await client.session.promptAsync({ body: { agent: "plan", // or "build", or custom agent ID parts: [{ type: "text", text: "..." }] } }); ``` ### Listing Available Agents ```typescript const agents = await client.app.agents({}); // Returns: [{ id: "build", name: "Build", primary: true }, ...] ``` ### Permission Modes Permissions are configured via rulesets on the session, separate from agent selection: ```typescript interface PermissionRuleset { // Tool-specific permission rules } ``` ### Human-in-the-Loop OpenCode has full interactive HITL via SSE events: | Event | Endpoint | |-------|----------| | `question.asked` | `POST /question/{id}/reply` | | `permission.asked` | `POST /permission/{id}/reply` | See `research/human-in-the-loop.md` for full API details. ## Defaults ```typescript const DEFAULT_OPENCODE_MODEL_ID = "gpt-4o"; const DEFAULT_OPENCODE_PROVIDER_ID = "openai"; ``` ## Concurrency Control Server startup uses a lock to prevent race conditions: ```typescript async function withStartLock(fn: () => Promise): Promise { const prior = startLock; let release: () => void; startLock = new Promise((resolve) => { release = resolve; }); await prior; try { return await fn(); } finally { release(); } } ``` ## Working Directory Server must be started in the correct working directory: ```typescript async function withWorkingDir(workingDir: string, fn: () => Promise): Promise { const previous = process.cwd(); process.chdir(workingDir); try { return await fn(); } finally { process.chdir(previous); } } ``` ## Polling Fallback A polling mechanism checks session status every 2 seconds in case SSE events don't arrive: ```typescript const pollInterval = setInterval(async () => { const session = await client.session.get({ path: { id: sessionId } }); if (session.data?.status?.type === "idle") { abortController.abort(); } }, 2000); ``` ## Model Discovery OpenCode has the richest model discovery support with both CLI and HTTP API. ### CLI Commands ```bash opencode models # List all available models opencode models # List models for a specific provider ``` ### HTTP Endpoint ``` GET /provider ``` ### Response Schema ```json { "all": [ { "id": "anthropic", "name": "Anthropic", "api": "string", "env": ["ANTHROPIC_API_KEY"], "npm": "string", "models": { "model-key": { "id": "string", "name": "string", "family": "string", "release_date": "string", "attachment": true, "reasoning": false, "tool_call": true, "cost": { "input": 0.003, "output": 0.015, "cache_read": 0.0003, "cache_write": 0.00375 }, "limit": { "context": 200000, "input": 200000, "output": 8192 }, "modalities": { "input": ["text", "image"], "output": ["text"] }, "experimental": false, "status": "beta" } } } ], "default": { "anthropic": "claude-sonnet-4-20250514" }, "connected": ["anthropic"] } ``` ### SDK Usage ```typescript const client = createOpencodeClient(); const response = await client.provider.list(); ``` ### How to Replicate When an OpenCode server is running, call `GET /provider` on its HTTP port. Returns full model metadata including capabilities, costs, context limits, and modalities. ## Command Execution & Process Management ### Agent Tool Execution The agent executes commands via internal tools (not exposed in the HTTP API). The agent's tool calls are synchronous within its turn. Tool parts have states: `pending`, `running`, `completed`, `error`. ### PTY System (`/pty/*`) - User-Facing Terminals Separate from the agent's command execution. PTYs are server-scoped interactive terminals for the user: - `POST /pty` - Create PTY (command, args, cwd, title, env) - `GET /pty` - List all PTYs - `GET /pty/{ptyID}` - Get PTY info - `PUT /pty/{ptyID}` - Update PTY (title, resize via `size: {rows, cols}`) - `DELETE /pty/{ptyID}` - Kill and remove PTY - `GET /pty/{ptyID}/connect` - WebSocket for bidirectional I/O PTY events (globally broadcast via SSE): `pty.created`, `pty.updated`, `pty.exited`, `pty.deleted`. The agent does NOT use the PTY system. PTYs are for the user's interactive terminal panel, independent of any AI session. ### Session Commands (`/session/{id}/command`, `/session/{id}/shell`) - Context Injection External clients can inject command results into an AI session's conversation context: - `POST /session/{sessionID}/command` - Executes a command and records the result as an `AssistantMessage` in the session. Required fields: `command`, `arguments`. The output becomes part of the AI's context for subsequent turns. - `POST /session/{sessionID}/shell` - Similar but wraps in `sh -c`. Required fields: `command`, `agent`. - `GET /command` - Lists available command definitions (metadata, not execution). Session commands emit `command.executed` events with `sessionID` + `messageID`. **Key distinction**: These endpoints execute commands directly (not via the AI), then inject the output into the session as if the AI produced it. The AI doesn't actively run the command - it just finds the output in its conversation history on the next turn. ### Three Separate Execution Mechanisms | Mechanism | Who uses it | Scoped to | AI sees output? | |-----------|-------------|-----------|----------------| | Agent tools (internal) | AI agent | Session turn | Yes (immediate) | | PTY (`/pty/*`) | User/frontend | Server (global) | No | | Session commands (`/session/{id}/*`) | Frontend/SDK client | Session | Yes (next turn) | The agent has no tool to interact with PTYs and cannot access the session command endpoints. When the agent needs to run a background process, it uses its internal bash-equivalent tool with shell backgrounding (`&`). ### Comparison | Capability | Supported? | Notes | |-----------|-----------|-------| | Agent runs commands | Yes (internal tools) | Synchronous, blocks agent turn | | User runs commands → agent sees output | Yes (`/session/{id}/command`) | HTTP API, first-class | | External API for command injection | Yes | Session-scoped endpoints | | Command source tracking | Implicit | Endpoint implies source (no enum) | | Background process management | No | Shell `&` only for agent | | PTY / interactive terminal | Yes (`/pty/*`) | Server-scoped, WebSocket I/O | ## Notes - OpenCode is the most feature-rich runtime (streaming, questions, permissions) - Server persists for the lifetime of a change (workspace+repo+change) - Parts are streamed incrementally with delta updates - V1 client is needed for question/permission APIs - Working directory affects credential discovery and file operations