sandbox-agent/research/agents/opencode.md
2026-02-11 07:57:02 -08:00

OpenCode Research

Research notes on OpenCode's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.

Overview

  • Provider: Multi-provider (OpenAI, Anthropic, others)
  • Execution Method: Embedded server via SDK, or CLI binary
  • Session Persistence: Session ID (string)
  • SDK: @opencode-ai/sdk (server + client)
  • Binary Location: ~/.opencode/bin/opencode
  • Written in: Go (with Bubble Tea TUI)

CLI Usage (Alternative to SDK)

OpenCode can be used as a standalone binary instead of embedding the SDK:

Interactive TUI Mode

opencode                      # Start TUI in current directory
opencode /path/to/project     # Start in specific directory
opencode -c                   # Continue last session
opencode -s SESSION_ID        # Continue specific session

Non-Interactive Mode (opencode run)

opencode run "your prompt here"
opencode run --format json "prompt"   # Raw JSON events output
opencode run -m anthropic/claude-sonnet-4-20250514 "prompt"
opencode run --agent plan "analyze this code"
opencode run -c "follow up question"  # Continue last session
opencode run -s SESSION_ID "prompt"   # Continue specific session
opencode run -f file1.ts -f file2.ts "review these files"

Key CLI Flags

Flag                        Description
--format json               Output raw JSON events (for parsing)
-m, --model PROVIDER/MODEL  Model in format provider/model
--agent AGENT               Agent to use (build, plan)
-c, --continue              Continue last session
-s, --session ID            Continue specific session
-f, --file FILE             Attach file(s) to message
--attach URL                Attach to running server
--port PORT                 Local server port
--variant VARIANT           Reasoning effort (e.g., high, max)
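
A wrapper that shells out to opencode run can build its argument list from a small options object. The sketch below is illustrative only: RunOptions and buildRunArgs are hypothetical names, and only flags from the table above are used.

```typescript
// Hypothetical options object; field names are illustrative, not part of OpenCode.
interface RunOptions {
  prompt: string;
  model?: string;         // "provider/model", passed to -m
  agent?: string;         // e.g. "build" or "plan", passed to --agent
  sessionId?: string;     // passed to -s
  continueLast?: boolean; // adds -c when no sessionId is given
  files?: string[];       // each file gets its own -f
  json?: boolean;         // adds --format json
}

// Build the argv for `opencode run` using only flags from the table above.
function buildRunArgs(opts: RunOptions): string[] {
  const args: string[] = ["run"];
  if (opts.json) args.push("--format", "json");
  if (opts.model) args.push("-m", opts.model);
  if (opts.agent) args.push("--agent", opts.agent);
  if (opts.sessionId) args.push("-s", opts.sessionId);
  else if (opts.continueLast) args.push("-c");
  for (const file of opts.files ?? []) args.push("-f", file);
  args.push(opts.prompt); // prompt goes last, as in the examples above
  return args;
}
```

Passing the resulting array to child_process.spawn("opencode", args) rather than a shell string avoids quoting problems in the prompt.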

Headless Server Mode

opencode serve                        # Start headless server
opencode serve --port 4096            # Specific port
opencode attach http://localhost:4096 # Attach to running server

Other Commands

opencode models                 # List available models
opencode models anthropic       # List models for provider
opencode auth                   # Manage credentials
opencode session                # Manage sessions
opencode export SESSION_ID      # Export session as JSON
opencode stats                  # Token usage statistics

Sources: OpenCode GitHub, OpenCode Docs

Architecture

OpenCode runs as an embedded HTTP server per workspace/change:

┌─────────────────────┐
│   agent-jj backend  │
│                     │
│  ┌───────────────┐  │
│  │ OpenCode      │  │
│  │ Server        │◄─┼── HTTP API
│  │ (per change)  │  │
│  └───────────────┘  │
└─────────────────────┘
  • One server per changeId (workspace+repo+change combination)
  • Multiple sessions can share a server
  • Server runs on dynamic port (4200-4300 range)
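
The one-server-per-change rule can be pictured as a map keyed by changeId. This is a minimal sketch, not the agent-jj code: ServerRegistry, ServerHandle, and the colon-separated key format are all assumptions made for illustration.

```typescript
// Hypothetical handle: one server per changeId, shared by many sessions.
interface ServerHandle {
  port: number;
  sessions: Set<string>;
}

class ServerRegistry {
  private servers = new Map<string, ServerHandle>();

  // Assumed key format; the notes only say a changeId combines workspace, repo, and change.
  static changeId(workspace: string, repo: string, change: string): string {
    return `${workspace}:${repo}:${change}`;
  }

  // Return the existing server for a change, or register a new one via `start`.
  getOrCreate(changeId: string, start: () => ServerHandle): ServerHandle {
    let handle = this.servers.get(changeId);
    if (!handle) {
      handle = start();
      this.servers.set(changeId, handle);
    }
    return handle;
  }

  // Record that a session runs on the server for this change.
  attachSession(changeId: string, sessionId: string): void {
    const handle = this.servers.get(changeId);
    if (!handle) throw new Error(`no server for ${changeId}`);
    handle.sessions.add(sessionId);
  }
}
```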

Credential Discovery

Priority Order

  1. Environment variables: ANTHROPIC_API_KEY, CLAUDE_API_KEY
  2. Environment variables: OPENAI_API_KEY, CODEX_API_KEY
  3. Claude Code config files
  4. Codex config files
  5. OpenCode config files
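
Steps 1 and 2 of the priority order can be sketched as a first-match-wins scan over environment variables. This assumes the list is a single chain across providers; resolveFromEnv and ResolvedCredential are hypothetical names, and the config-file steps (3 to 5) are not modeled.

```typescript
type Provider = "anthropic" | "openai";

interface ResolvedCredential {
  provider: Provider;
  source: string; // name of the environment variable that supplied the key
  key: string;
}

// Environment variables in the priority order listed above (steps 1 and 2).
const ENV_PRIORITY: Array<{ provider: Provider; name: string }> = [
  { provider: "anthropic", name: "ANTHROPIC_API_KEY" },
  { provider: "anthropic", name: "CLAUDE_API_KEY" },
  { provider: "openai", name: "OPENAI_API_KEY" },
  { provider: "openai", name: "CODEX_API_KEY" },
];

// Return the first non-empty key, or undefined so the caller can fall
// through to the config-file lookups.
function resolveFromEnv(
  env: Record<string, string | undefined>,
): ResolvedCredential | undefined {
  for (const { provider, name } of ENV_PRIORITY) {
    const key = env[name]?.trim();
    if (key) return { provider, source: name, key };
  }
  return undefined;
}
```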

Config File Location

Path                               Description
~/.local/share/opencode/auth.json  Primary auth config

Auth File Structure

{
  "anthropic": {
    "type": "api",
    "key": "sk-ant-..."
  },
  "openai": {
    "type": "api",
    "key": "sk-..."
  },
  "custom-provider": {
    "type": "oauth",
    "access": "token...",
    "refresh": "refresh-token...",
    "expires": 1704067200000
  }
}

Provider Config Types

interface OpenCodeProviderConfig {
  type: "api" | "oauth";
  key?: string;      // For API type
  access?: string;   // For OAuth type
  refresh?: string;  // For OAuth type
  expires?: number;  // Unix timestamp (ms)
}

OAuth tokens are validated for expiry before use.
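
A minimal sketch of the expiry check, assuming a missing expires value means the token does not expire. usableSecret and the skewMs safety margin are hypothetical and not taken from agent-jj.

```typescript
interface OpenCodeProviderConfig {
  type: "api" | "oauth";
  key?: string;
  access?: string;
  refresh?: string;
  expires?: number; // Unix timestamp in milliseconds
}

// Return a usable secret, or undefined if the entry is incomplete or expired.
// skewMs is a hypothetical margin so a token is not used right before it lapses.
function usableSecret(
  config: OpenCodeProviderConfig,
  nowMs: number = Date.now(),
  skewMs: number = 60_000,
): string | undefined {
  if (config.type === "api") return config.key || undefined;
  if (!config.access) return undefined;
  // Assumption: a missing `expires` is treated as non-expiring.
  if (config.expires !== undefined && config.expires - skewMs <= nowMs) {
    return undefined;
  }
  return config.access;
}
```

An undefined result for an OAuth entry is the point where a caller would use the refresh token or fall back to another credential source.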

Server Management

Starting a Server

import { createOpencodeServer } from "@opencode-ai/sdk/server";
import { createOpencodeClient } from "@opencode-ai/sdk";

const port = 4200; // in practice, a free port picked from the 4200-4300 range

const server = await createOpencodeServer({
  hostname: "127.0.0.1",
  port,
  config: { logLevel: "DEBUG" }
});

// Point the client at the server that was just started
const client = createOpencodeClient({
  baseUrl: `http://127.0.0.1:${port}`
});

Server Configuration

// From config.json
{
  "opencode": {
    "host": "127.0.0.1",        // Bind address
    "advertisedHost": "127.0.0.1" // External address (for tunnels)
  }
}

Port Selection

Uses the get-port package to find an available port in the 4200-4300 range.

Client API

Session Management

// Create session
const response = await client.session.create({});
const sessionId = response.data.id;

// Get session info
const session = await client.session.get({ path: { id: sessionId } });

// Get session messages
const messages = await client.session.messages({ path: { id: sessionId } });

// Get session todos
const todos = await client.session.todo({ path: { id: sessionId } });

Sending Prompts

Synchronous

const response = await client.session.prompt({
  path: { id: sessionId },
  body: {
    model: { providerID: "openai", modelID: "gpt-4o" },
    agent: "build",
    parts: [{ type: "text", text: "prompt text" }]
  }
});

Asynchronous (Streaming)

// Start prompt asynchronously
await client.session.promptAsync({
  path: { id: sessionId },
  body: {
    model: { providerID: "openai", modelID: "gpt-4o" },
    agent: "build",
    parts: [{ type: "text", text: "prompt text" }]
  }
});

// Subscribe to events
const eventStream = await client.event.subscribe({});

for await (const event of eventStream.stream) {
  // Process events
}

Event Types

Event Type            Description
message.part.updated  Message part streamed/updated
session.status        Session status changed
session.idle          Session finished processing
session.error         Session error occurred
question.asked        AI asking user question
permission.asked      AI requesting permission

Event Structure

interface SDKEvent {
  type: string;
  properties: {
    part?: SDKPart & { sessionID?: string };
    delta?: string;          // Text delta for streaming
    status?: { type?: string };
    sessionID?: string;
    error?: { data?: { message?: string } };
    id?: string;
    questions?: QuestionInfo[];
    permission?: string;
    patterns?: string[];
    metadata?: Record<string, unknown>;
    always?: string[];
    tool?: { messageID?: string; callID?: string };
  };
}
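
One way to consume the stream is to fold each event into per-session state. The reducer below is a sketch: StreamEvent is a trimmed copy of the structure above, SessionState and reduceEvent are hypothetical, and it assumes only text parts contribute deltas to the transcript.

```typescript
// Minimal subset of the event shape above; only the fields this sketch reads.
interface StreamEvent {
  type: string;
  properties: {
    sessionID?: string;
    delta?: string;
    part?: { sessionID?: string; type?: string };
    error?: { data?: { message?: string } };
  };
}

interface SessionState {
  text: string;
  done: boolean;
  error?: string;
}

// Fold one event into the state for a single session; other sessions are ignored.
function reduceEvent(state: SessionState, event: StreamEvent, sessionId: string): SessionState {
  const p = event.properties;
  const owner = p.sessionID ?? p.part?.sessionID;
  if (owner !== sessionId) return state;
  switch (event.type) {
    case "message.part.updated":
      // Assumption: only text parts add to the visible transcript.
      return p.part?.type === "text" && p.delta
        ? { ...state, text: state.text + p.delta }
        : state;
    case "session.idle":
      return { ...state, done: true };
    case "session.error":
      return { ...state, done: true, error: p.error?.data?.message ?? "unknown error" };
    default:
      return state;
  }
}
```

Filtering on the session ID matters because several sessions can share one server and therefore one event stream.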

Message Parts

OpenCode has rich message part types:

Type         Description
text         Plain text content
reasoning    Model reasoning (chain-of-thought)
tool         Tool invocation with status
file         File reference
step-start   Step boundary start
step-finish  Step boundary end with reason
subtask      Delegated subtask

Part Structure

interface MessagePart {
  type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other";
  id: string;
  content: string;
  // Tool-specific
  toolName?: string;
  toolStatus?: "pending" | "running" | "completed" | "error";
  toolInput?: Record<string, unknown>;
  toolOutput?: string;
  // File-specific
  filename?: string;
  mimeType?: string;
  // Step-specific
  stepReason?: string;
  // Subtask-specific
  subtaskAgent?: string;
  subtaskDescription?: string;
}
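
A sketch of rendering parts into a plain-text transcript, using a trimmed copy of the structure above. describePart, renderParts, and the bracketed output format are hypothetical choices for illustration.

```typescript
// Trimmed copy of the part shape above; only the fields used here.
interface Part {
  type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other";
  content: string;
  toolName?: string;
  toolStatus?: "pending" | "running" | "completed" | "error";
  filename?: string;
  subtaskAgent?: string;
}

// Produce a one-line, human-readable summary for a part.
function describePart(part: Part): string {
  switch (part.type) {
    case "text":
      return part.content;
    case "reasoning":
      return `[reasoning] ${part.content}`;
    case "tool":
      return `[tool ${part.toolName ?? "?"}: ${part.toolStatus ?? "pending"}]`;
    case "file":
      return `[file ${part.filename ?? "?"}]`;
    case "subtask":
      return `[subtask -> ${part.subtaskAgent ?? "?"}]`;
    default:
      return ""; // step boundaries and unknown parts carry no display text
  }
}

// Join the non-empty summaries into a transcript, one part per line.
function renderParts(parts: Part[]): string {
  return parts.map(describePart).filter((line) => line.length > 0).join("\n");
}
```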

Questions and Permissions

Question Request

interface QuestionRequest {
  id: string;
  sessionID: string;
  questions: Array<{
    header?: string;
    question: string;
    options: Array<{ label: string; description?: string }>;
    multiSelect?: boolean;
  }>;
  tool?: { messageID: string; callID: string };
}

Responding to Questions

// V1 client for question/permission APIs
// (`port` is the port the OpenCode server is listening on)
const clientV1 = createOpencodeClientV1({
  baseUrl: `http://127.0.0.1:${port}`
});

// Reply with answers
await clientV1.question.reply({
  requestID: requestId,
  answers: [["selected option"]]  // Array of selected labels per question
});

// Reject question
await clientV1.question.reject({ requestID: requestId });
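
The answers payload holds one array of selected labels per question, in question order. The helper below sketches how a client might assemble and sanity-check it. buildAnswers and the selections map are hypothetical, and rejecting labels that are not among the offered options is an assumption about desired client behavior, not something the API is known to require.

```typescript
interface Question {
  question: string;
  options: Array<{ label: string; description?: string }>;
  multiSelect?: boolean;
}

// Build the `answers` payload: one array of selected labels per question.
// `selections` maps a question index to the chosen labels.
function buildAnswers(questions: Question[], selections: Record<number, string[]>): string[][] {
  return questions.map((q, index) => {
    const chosen = selections[index] ?? [];
    const valid = new Set(q.options.map((o) => o.label));
    for (const label of chosen) {
      if (!valid.has(label)) {
        throw new Error(`unknown option "${label}" for question ${index}`);
      }
    }
    if (!q.multiSelect && chosen.length > 1) {
      throw new Error(`question ${index} accepts a single answer`);
    }
    return chosen;
  });
}
```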

Permission Request

interface PermissionRequest {
  id: string;
  sessionID: string;
  permission: string;     // Permission type (e.g., "file:write")
  patterns: string[];     // Affected paths/patterns
  metadata: Record<string, unknown>;
  always: string[];       // Options for "always allow"
  tool?: { messageID: string; callID: string };
}

Responding to Permissions

await clientV1.permission.reply({
  requestID: requestId,
  reply: "once"  // one of "once", "always", or "reject"
});
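
For unattended runs, the reply value can come from a policy function rather than a user prompt. The policy below is purely hypothetical (decidePermission, the allow-list, and the denied prefixes are not part of OpenCode); it only shows how the three reply values might be chosen from the request fields.

```typescript
type PermissionReply = "once" | "always" | "reject";

interface PermissionAsk {
  permission: string; // e.g. "file:write"
  patterns: string[]; // affected paths/patterns
}

// Hypothetical policy: reject anything touching a denied prefix, always allow
// permissions on an allow-list, and approve everything else once.
function decidePermission(
  ask: PermissionAsk,
  alwaysAllow: ReadonlySet<string>,
  deniedPrefixes: readonly string[],
): PermissionReply {
  const denied = ask.patterns.some((p) => deniedPrefixes.some((d) => p.startsWith(d)));
  if (denied) return "reject";
  if (alwaysAllow.has(ask.permission)) return "always";
  return "once";
}
```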

Provider/Model Discovery

// Get available providers and models
const providerResponse = await client.provider.list({});
const agentResponse = await client.app.agents({});

interface ProviderInfo {
  id: string;
  name: string;
  models: Array<{
    id: string;
    name: string;
    reasoning: boolean;
    toolCall: boolean;
  }>;
}

interface AgentInfo {
  id: string;
  name: string;
  primary: boolean;  // "build" and "plan" are primary
}

Internal Agents (Hidden from UI)

  • compaction
  • title
  • summary

Token Usage

interface TokenUsage {
  input: number;
  output: number;
  reasoning?: number;
  cache?: {
    read: number;
    write: number;
  };
}

Available in the info field of assistant messages.
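
Per-message usage can be summed into a session total. A small sketch, treating absent optional fields as zero; totalUsage is a hypothetical helper.

```typescript
interface TokenUsage {
  input: number;
  output: number;
  reasoning?: number;
  cache?: { read: number; write: number };
}

// Sum usage across messages; optional fields count as zero when absent.
function totalUsage(usages: TokenUsage[]): Required<TokenUsage> {
  const total = { input: 0, output: 0, reasoning: 0, cache: { read: 0, write: 0 } };
  for (const u of usages) {
    total.input += u.input;
    total.output += u.output;
    total.reasoning += u.reasoning ?? 0;
    total.cache.read += u.cache?.read ?? 0;
    total.cache.write += u.cache?.write ?? 0;
  }
  return total;
}
```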

Agent Modes vs Permission Modes

OpenCode properly separates these concepts:

Agent Modes

Agents are first-class concepts with their own system prompts and behavior:

Agent ID  Description
build     Default execution agent
plan      Planning/analysis agent
Custom    User-defined agents in config

// Sending a prompt with a specific agent
await client.session.promptAsync({
  path: { id: sessionId },
  body: {
    agent: "plan",  // or "build", or a custom agent ID
    parts: [{ type: "text", text: "..." }]
  }
});

Listing Available Agents

const agents = await client.app.agents({});
// Returns: [{ id: "build", name: "Build", primary: true }, ...]

Permission Modes

Permissions are configured via rulesets on the session, separate from agent selection:

interface PermissionRuleset {
  // Tool-specific permission rules
}

Human-in-the-Loop

OpenCode has full interactive HITL via SSE events:

Event             Endpoint
question.asked    POST /question/{id}/reply
permission.asked  POST /permission/{id}/reply

See research/human-in-the-loop.md for full API details.
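
The event-to-endpoint mapping above is small enough to capture in one function. replyPath is a hypothetical helper; it only builds the path and does not send anything.

```typescript
type HitlEvent = "question.asked" | "permission.asked";

// Map a HITL event type to the reply endpoint path from the table above.
function replyPath(eventType: HitlEvent, requestId: string): string {
  const resource = eventType === "question.asked" ? "question" : "permission";
  return `/${resource}/${encodeURIComponent(requestId)}/reply`;
}
```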

Defaults

const DEFAULT_OPENCODE_MODEL_ID = "gpt-4o";
const DEFAULT_OPENCODE_PROVIDER_ID = "openai";

Concurrency Control

Server startup uses a lock to prevent race conditions:

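// Tail of the startup queue; each caller waits for the previous one to finish
let startLock: Promise<void> = Promise.resolve();

async function withStartLock<T>(fn: () => Promise<T>): Promise<T> {
  const prior = startLock;
  let release!: () => void; // assigned synchronously by the Promise executor below
  startLock = new Promise<void>((resolve) => { release = resolve; });
  await prior;
  try {
    return await fn();
  } finally {
    release(); // unblock the next caller even if fn() threw
  }
}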

Working Directory

Server must be started in the correct working directory:

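// Note: process.chdir() changes the directory for the whole process, so
// concurrent calls must be serialized (e.g. under the start lock).
async function withWorkingDir<T>(workingDir: string, fn: () => Promise<T>): Promise<T> {
  const previous = process.cwd();
  process.chdir(workingDir);
  try {
    return await fn();
  } finally {
    process.chdir(previous); // restore even if fn() threw
  }
}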

Polling Fallback

A polling mechanism checks session status every 2 seconds in case SSE events don't arrive:

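const pollInterval = setInterval(async () => {
  const session = await client.session.get({ path: { id: sessionId } });
  if (session.data?.status?.type === "idle") {
    clearInterval(pollInterval); // stop polling once the session is idle
    abortController.abort();     // ends the SSE wait; abortController is created by the caller
  }
}, 2000);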

Model Discovery

OpenCode has the richest model discovery support with both CLI and HTTP API.

CLI Commands

opencode models                 # List all available models
opencode models <provider>      # List models for a specific provider

HTTP Endpoint

GET /provider

Response Schema

{
  "all": [
    {
      "id": "anthropic",
      "name": "Anthropic",
      "api": "string",
      "env": ["ANTHROPIC_API_KEY"],
      "npm": "string",
      "models": {
        "model-key": {
          "id": "string",
          "name": "string",
          "family": "string",
          "release_date": "string",
          "attachment": true,
          "reasoning": false,
          "tool_call": true,
          "cost": {
            "input": 0.003,
            "output": 0.015,
            "cache_read": 0.0003,
            "cache_write": 0.00375
          },
          "limit": {
            "context": 200000,
            "input": 200000,
            "output": 8192
          },
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          },
          "experimental": false,
          "status": "beta"
        }
      }
    }
  ],
  "default": {
    "anthropic": "claude-sonnet-4-20250514"
  },
  "connected": ["anthropic"]
}

SDK Usage

const client = createOpencodeClient();
const response = await client.provider.list();

How to Replicate

When an OpenCode server is running, call GET /provider on its HTTP port. The response contains full model metadata, including capabilities, costs, context limits, and modalities.
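
A sketch of turning the GET /provider response into a list of selectable models. The interfaces are trimmed from the schema above. usableModels is a hypothetical helper, and using the model's id field (rather than the map key) in the provider/model string is an assumption.

```typescript
// Trimmed version of the response schema above; only the fields read here.
interface ProviderModel {
  id: string;
  reasoning: boolean;
  tool_call: boolean;
  limit: { context: number; output: number };
}

interface ProviderListResponse {
  all: Array<{ id: string; models: Record<string, ProviderModel> }>;
  default: Record<string, string>;
  connected: string[];
}

// Return "provider/model" strings for connected providers whose models support
// tool calls and meet a minimum context size.
function usableModels(res: ProviderListResponse, minContext = 0): string[] {
  const connected = new Set(res.connected);
  const result: string[] = [];
  for (const provider of res.all) {
    if (!connected.has(provider.id)) continue; // skip providers without credentials
    for (const model of Object.values(provider.models)) {
      if (model.tool_call && model.limit.context >= minContext) {
        result.push(`${provider.id}/${model.id}`);
      }
    }
  }
  return result;
}
```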

Command Execution & Process Management

Agent Tool Execution

The agent executes commands via internal tools (not exposed in the HTTP API). The agent's tool calls are synchronous within its turn. Tool parts have states: pending, running, completed, error.

PTY System (/pty/*) - User-Facing Terminals

Separate from the agent's command execution. PTYs are server-scoped interactive terminals for the user:

  • POST /pty - Create PTY (command, args, cwd, title, env)
  • GET /pty - List all PTYs
  • GET /pty/{ptyID} - Get PTY info
  • PUT /pty/{ptyID} - Update PTY (title, resize via size: {rows, cols})
  • DELETE /pty/{ptyID} - Kill and remove PTY
  • GET /pty/{ptyID}/connect - WebSocket for bidirectional I/O

PTY events (globally broadcast via SSE): pty.created, pty.updated, pty.exited, pty.deleted.

The agent does NOT use the PTY system. PTYs are for the user's interactive terminal panel, independent of any AI session.

Session Commands (/session/{id}/command, /session/{id}/shell) - Context Injection

External clients can inject command results into an AI session's conversation context:

  • POST /session/{sessionID}/command - Executes a command and records the result as an AssistantMessage in the session. Required fields: command, arguments. The output becomes part of the AI's context for subsequent turns.
  • POST /session/{sessionID}/shell - Similar but wraps in sh -c. Required fields: command, agent.
  • GET /command - Lists available command definitions (metadata, not execution).

Session commands emit command.executed events with sessionID + messageID.

Key distinction: These endpoints execute commands directly (not via the AI), then inject the output into the session as if the AI produced it. The AI doesn't actively run the command - it just finds the output in its conversation history on the next turn.
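
The two injection endpoints differ only in path and required fields, which the request builders below make explicit. RequestSpec, commandRequest, and shellRequest are hypothetical names, and sending arguments as a single string is an assumption.

```typescript
interface RequestSpec {
  method: "POST";
  path: string;
  body: Record<string, string>;
}

// POST /session/{sessionID}/command with the required fields command and arguments.
// Assumption: `arguments` is sent as a single string.
function commandRequest(sessionId: string, command: string, args: string): RequestSpec {
  return {
    method: "POST",
    path: `/session/${encodeURIComponent(sessionId)}/command`,
    body: { command, arguments: args },
  };
}

// POST /session/{sessionID}/shell with the required fields command and agent.
function shellRequest(sessionId: string, command: string, agent: string): RequestSpec {
  return {
    method: "POST",
    path: `/session/${encodeURIComponent(sessionId)}/shell`,
    body: { command, agent },
  };
}
```

A caller would send the spec with any HTTP client, for example fetch with a JSON-encoded body against the server's base URL.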

Three Separate Execution Mechanisms

Mechanism                           Who uses it          Scoped to        AI sees output?
Agent tools (internal)              AI agent             Session turn     Yes (immediate)
PTY (/pty/*)                        User/frontend        Server (global)  No
Session commands (/session/{id}/*)  Frontend/SDK client  Session          Yes (next turn)

The agent has no tool to interact with PTYs and cannot access the session command endpoints. When the agent needs to run a background process, it uses its internal bash-equivalent tool with shell backgrounding (&).

Comparison

Capability                              Supported?                   Notes
Agent runs commands                     Yes (internal tools)         Synchronous, blocks agent turn
User runs commands → agent sees output  Yes (/session/{id}/command)  HTTP API, first-class
External API for command injection      Yes                          Session-scoped endpoints
Command source tracking                 Implicit                     Endpoint implies source (no enum)
Background process management           No                           Shell & only for agent
PTY / interactive terminal              Yes (/pty/*)                 Server-scoped, WebSocket I/O

Notes

  • OpenCode is the most feature-rich runtime (streaming, questions, permissions)
  • Server persists for the lifetime of a change (workspace+repo+change)
  • Parts are streamed incrementally with delta updates
  • V1 client is needed for question/permission APIs
  • Working directory affects credential discovery and file operations