mirror of https://github.com/harivansh-afk/sandbox-agent.git synced 2026-04-15 07:04:48 +00:00

Nathan Flurry c4153c5335 add agent schemas

2026-01-24 22:37:22 -08:00

OpenCode Research

Research notes on OpenCode's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.

Overview

Provider: Multi-provider (OpenAI, Anthropic, others)
Execution Method: Embedded server via SDK, or CLI binary
Session Persistence: Session ID (string)
SDK: @opencode-ai/sdk (server + client)
Binary Location: ~/.opencode/bin/opencode
Written in: Go (with Bubble Tea TUI)

CLI Usage (Alternative to SDK)

OpenCode can be used as a standalone binary instead of embedding the SDK:

Interactive TUI Mode

opencode                      # Start TUI in current directory
opencode /path/to/project     # Start in specific directory
opencode -c                   # Continue last session
opencode -s SESSION_ID        # Continue specific session

Non-Interactive Mode (`opencode run`)

opencode run "your prompt here"
opencode run --format json "prompt"   # Raw JSON events output
opencode run -m anthropic/claude-sonnet-4-20250514 "prompt"
opencode run --agent plan "analyze this code"
opencode run -c "follow up question"  # Continue last session
opencode run -s SESSION_ID "prompt"   # Continue specific session
opencode run -f file1.ts -f file2.ts "review these files"

Key CLI Flags

Flag	Description
`--format json`	Output raw JSON events (for parsing)
`-m, --model PROVIDER/MODEL`	Model in format `provider/model`
`--agent AGENT`	Agent to use (`build`, `plan`)
`-c, --continue`	Continue last session
`-s, --session ID`	Continue specific session
`-f, --file FILE`	Attach file(s) to message
`--attach URL`	Attach to running server
`--port PORT`	Local server port
`--variant VARIANT`	Reasoning effort (e.g., `high`, `max`)
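
A backend that shells out to `opencode run` has to turn its options into an argument list. The sketch below shows one way to do that using only the flags listed above. The `RunOptions` type and the `buildRunArgs` helper are hypothetical names introduced for illustration, and preferring `-s` over `-c` when both are set is an assumption of this sketch, not documented CLI behavior.

```typescript
// Hypothetical helper: maps options onto the `opencode run` flags listed above.
interface RunOptions {
  prompt: string;
  json?: boolean;           // --format json
  model?: string;           // -m provider/model
  agent?: "build" | "plan"; // --agent
  continueLast?: boolean;   // -c
  sessionId?: string;       // -s SESSION_ID
  files?: string[];         // -f FILE (repeatable)
}

function buildRunArgs(opts: RunOptions): string[] {
  const args: string[] = ["run"];
  if (opts.json) args.push("--format", "json");
  if (opts.model) {
    // The CLI expects `provider/model`, so require a slash with text on both sides.
    const slash = opts.model.indexOf("/");
    if (slash <= 0 || slash === opts.model.length - 1) {
      throw new Error(`model must be provider/model, got "${opts.model}"`);
    }
    args.push("-m", opts.model);
  }
  if (opts.agent) args.push("--agent", opts.agent);
  // Assumption: an explicit session ID wins over "continue last session".
  if (opts.sessionId) args.push("-s", opts.sessionId);
  else if (opts.continueLast) args.push("-c");
  for (const file of opts.files ?? []) args.push("-f", file);
  args.push(opts.prompt);
  return args;
}
```

The returned array can be passed as the argument list when spawning the `opencode` binary, which avoids shell quoting problems in the prompt text.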

Headless Server Mode

opencode serve                        # Start headless server
opencode serve --port 4096            # Specific port
opencode attach http://localhost:4096 # Attach to running server

Other Commands

opencode models                 # List available models
opencode models anthropic       # List models for provider
opencode auth                   # Manage credentials
opencode session                # Manage sessions
opencode export SESSION_ID      # Export session as JSON
opencode stats                  # Token usage statistics

Sources: OpenCode GitHub, OpenCode Docs

Architecture

OpenCode runs as an embedded HTTP server per workspace/change:

┌─────────────────────┐
│   agent-jj backend  │
│                     │
│  ┌───────────────┐  │
│  │ OpenCode      │  │
│  │ Server        │◄─┼── HTTP API
│  │ (per change)  │  │
│  └───────────────┘  │
└─────────────────────┘

One server runs per changeId (a workspace+repo+change combination)
Multiple sessions can share a server
Each server listens on a dynamically chosen port in the 4200-4300 range
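
The one-server-per-change rule can be sketched as a small registry keyed by changeId. This is a minimal illustration, not agent-jj's code: `ServerHandle`, `ServerRegistry`, and the `start` callback are hypothetical names that stand in for whatever the backend uses to launch an OpenCode server.

```typescript
// Hypothetical handle for a running server; only the URL matters here.
interface ServerHandle {
  url: string;
}

class ServerRegistry {
  private servers = new Map<string, Promise<ServerHandle>>();

  // Returns the server for a change, starting it on first use. The promise
  // itself is cached, so concurrent callers share a single startup.
  getOrStart(changeId: string, start: () => Promise<ServerHandle>): Promise<ServerHandle> {
    let pending = this.servers.get(changeId);
    if (!pending) {
      pending = start().catch((err) => {
        // Forget failed startups so a later call can retry.
        this.servers.delete(changeId);
        throw err;
      });
      this.servers.set(changeId, pending);
    }
    return pending;
  }
}
```

Sessions created for the same change then reuse the handle returned by `getOrStart`, while a different changeId triggers a separate startup.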

Credential Discovery

Priority Order

Environment variables: ANTHROPIC_API_KEY, CLAUDE_API_KEY
Environment variables: OPENAI_API_KEY, CODEX_API_KEY
Claude Code config files
Codex config files
OpenCode config files
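
The priority order above can be read as a list of sources tried in sequence. The sketch below assumes that the first source yielding a credential wins; the notes do not say whether discovery stops at the first hit or collects one credential per provider. The `Credential` type and the three reader callbacks are hypothetical placeholders for the config-file parsers.

```typescript
// Illustrative types: not taken from agent-jj.
interface Credential {
  provider: string;
  key: string;
  source: string;
}

type Env = Record<string, string | undefined>;
type FileReader = () => Credential | undefined;

function discoverCredential(
  env: Env,
  readers: { claudeCode: FileReader; codex: FileReader; opencode: FileReader },
): Credential | undefined {
  // Returns the first non-empty variable from `names`, in order.
  const fromEnv = (provider: string, names: string[]): Credential | undefined => {
    for (const name of names) {
      const key = env[name];
      if (key) return { provider, key, source: `env:${name}` };
    }
    return undefined;
  };

  // Sources in the documented priority order.
  const sources: Array<() => Credential | undefined> = [
    () => fromEnv("anthropic", ["ANTHROPIC_API_KEY", "CLAUDE_API_KEY"]),
    () => fromEnv("openai", ["OPENAI_API_KEY", "CODEX_API_KEY"]),
    readers.claudeCode,
    readers.codex,
    readers.opencode,
  ];

  // Assumption: the first source that yields a credential wins.
  for (const source of sources) {
    const found = source();
    if (found) return found;
  }
  return undefined;
}
```

In a Node.js backend, `process.env` can be passed as the `env` argument.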

Config File Location

Path	Description
`~/.local/share/opencode/auth.json`	Primary auth config

Auth File Structure

{
  "anthropic": {
    "type": "api",
    "key": "sk-ant-..."
  },
  "openai": {
    "type": "api",
    "key": "sk-..."
  },
  "custom-provider": {
    "type": "oauth",
    "access": "token...",
    "refresh": "refresh-token...",
    "expires": 1704067200000
  }
}

Provider Config Types

interface OpenCodeProviderConfig {
  type: "api" | "oauth";
  key?: string;      // For API type
  access?: string;   // For OAuth type
  refresh?: string;  // For OAuth type
  expires?: number;  // Unix timestamp (ms)
}

OAuth tokens are validated for expiry before use.
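
A minimal sketch of that expiry check, using the `OpenCodeProviderConfig` shape above. The `usableSecret` helper is a hypothetical name, and treating a missing `expires` as "never expires" is an assumption of this sketch.

```typescript
interface OpenCodeProviderConfig {
  type: "api" | "oauth";
  key?: string;
  access?: string;
  refresh?: string;
  expires?: number; // Unix timestamp (ms)
}

// Returns the secret to send with requests, or undefined if the entry
// cannot be used (missing key, missing token, or expired token).
function usableSecret(cfg: OpenCodeProviderConfig, nowMs: number = Date.now()): string | undefined {
  if (cfg.type === "api") return cfg.key || undefined;
  if (!cfg.access) return undefined;
  // Assumption: an entry without `expires` is treated as non-expiring.
  if (cfg.expires !== undefined && cfg.expires <= nowMs) return undefined;
  return cfg.access;
}
```

An expired entry still carries its `refresh` token, so a caller could attempt a refresh before falling back to the next credential source.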

Server Management

Starting a Server

import { createOpencodeServer } from "@opencode-ai/sdk/server";
import { createOpencodeClient } from "@opencode-ai/sdk";

// The server and the client below share the same bind address and port.
const hostname = "127.0.0.1";
const port = 4200;

const server = await createOpencodeServer({
  hostname,
  port,
  config: { logLevel: "DEBUG" }
});

const client = createOpencodeClient({
  baseUrl: `http://${hostname}:${port}`
});

Server Configuration

// From config.json
{
  "opencode": {
    "host": "127.0.0.1",        // Bind address
    "advertisedHost": "127.0.0.1" // External address (for tunnels)
  }
}

Port Selection

The get-port package is used to find an available port in the 4200-4300 range.

Client API

Session Management

// Create session
const response = await client.session.create({});
const sessionId = response.data.id;

// Get session info
const session = await client.session.get({ path: { id: sessionId } });

// Get session messages
const messages = await client.session.messages({ path: { id: sessionId } });

// Get session todos
const todos = await client.session.todo({ path: { id: sessionId } });

Sending Prompts

Synchronous

const response = await client.session.prompt({
  path: { id: sessionId },
  body: {
    model: { providerID: "openai", modelID: "gpt-4o" },
    agent: "build",
    parts: [{ type: "text", text: "prompt text" }]
  }
});

Asynchronous (Streaming)

// Start prompt asynchronously
await client.session.promptAsync({
  path: { id: sessionId },
  body: {
    model: { providerID: "openai", modelID: "gpt-4o" },
    agent: "build",
    parts: [{ type: "text", text: "prompt text" }]
  }
});

// Subscribe to events
const eventStream = await client.event.subscribe({});

for await (const event of eventStream.stream) {
  // Process events
}

Event Types

Event Type	Description
`message.part.updated`	Message part streamed/updated
`session.status`	Session status changed
`session.idle`	Session finished processing
`session.error`	Session error occurred
`question.asked`	AI asking user question
`permission.asked`	AI requesting permission

Event Structure

interface SDKEvent {
  type: string;
  properties: {
    part?: SDKPart & { sessionID?: string };
    delta?: string;          // Text delta for streaming
    status?: { type?: string };
    sessionID?: string;
    error?: { data?: { message?: string } };
    id?: string;
    questions?: QuestionInfo[];
    permission?: string;
    patterns?: string[];
    metadata?: Record<string, unknown>;
    always?: string[];
    tool?: { messageID?: string; callID?: string };
  };
}
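
One way to consume these events is a reducer that folds each event into the state of the current turn. The sketch below uses a reduced view of the event shape and handles three of the event types listed above. `StreamEvent`, `TurnState`, and `applyEvent` are hypothetical names, and the sketch assumes that `part` carries a `type` field matching the message part types.

```typescript
// Reduced view of SDKEvent: only the fields this sketch reads.
interface StreamEvent {
  type: string;
  properties: {
    delta?: string;
    sessionID?: string;
    part?: { sessionID?: string; type?: string };
    error?: { data?: { message?: string } };
  };
}

interface TurnState {
  text: string;
  done: boolean;
  error?: string;
}

function applyEvent(state: TurnState, event: StreamEvent, sessionId: string): TurnState {
  const props = event.properties;
  const eventSession = props.sessionID ?? props.part?.sessionID;
  // A server can host several sessions, so skip events for other sessions.
  if (eventSession !== undefined && eventSession !== sessionId) return state;

  switch (event.type) {
    case "message.part.updated":
      // Assumption: only text parts contribute to the visible reply.
      if (props.part?.type === "text" && props.delta) {
        return { ...state, text: state.text + props.delta };
      }
      return state;
    case "session.idle":
      return { ...state, done: true };
    case "session.error":
      return { ...state, done: true, error: props.error?.data?.message ?? "unknown error" };
    default:
      return state;
  }
}
```

Inside the `for await` loop over the event stream, each event would be passed through `applyEvent`, and the loop can exit once `done` is true.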

Message Parts

OpenCode has rich message part types:

Type	Description
`text`	Plain text content
`reasoning`	Model reasoning (chain-of-thought)
`tool`	Tool invocation with status
`file`	File reference
`step-start`	Step boundary start
`step-finish`	Step boundary end with reason
`subtask`	Delegated subtask

Part Structure

interface MessagePart {
  type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other";
  id: string;
  content: string;
  // Tool-specific
  toolName?: string;
  toolStatus?: "pending" | "running" | "completed" | "error";
  toolInput?: Record<string, unknown>;
  toolOutput?: string;
  // File-specific
  filename?: string;
  mimeType?: string;
  // Step-specific
  stepReason?: string;
  // Subtask-specific
  subtaskAgent?: string;
  subtaskDescription?: string;
}

Questions and Permissions

Question Request

interface QuestionRequest {
  id: string;
  sessionID: string;
  questions: Array<{
    header?: string;
    question: string;
    options: Array<{ label: string; description?: string }>;
    multiSelect?: boolean;
  }>;
  tool?: { messageID: string; callID: string };
}

Responding to Questions

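// V2 client for question/permission APIs.
// `createOpencodeClientV2` and `port` come from the surrounding code;
// the import is not shown in these notes.
const clientV2 = createOpencodeClientV2({
  baseUrl: `http://127.0.0.1:${port}`
});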

// Reply with answers
await clientV2.question.reply({
  requestID: requestId,
  answers: [["selected option"]]  // Array of selected labels per question
});

// Reject question
await clientV2.question.reject({ requestID: requestId });

Permission Request

interface PermissionRequest {
  id: string;
  sessionID: string;
  permission: string;     // Permission type (e.g., "file:write")
  patterns: string[];     // Affected paths/patterns
  metadata: Record<string, unknown>;
  always: string[];       // Options for "always allow"
  tool?: { messageID: string; callID: string };
}

Responding to Permissions

await clientV2.permission.reply({
  requestID: requestId,
  reply: "once"  // One of "once", "always", or "reject"
});

Provider/Model Discovery

// Get available providers and models
const providerResponse = await client.provider.list({});
const agentResponse = await client.app.agents({});

interface ProviderInfo {
  id: string;
  name: string;
  models: Array<{
    id: string;
    name: string;
    reasoning: boolean;
    toolCall: boolean;
  }>;
}

interface AgentInfo {
  id: string;
  name: string;
  primary: boolean;  // "build" and "plan" are primary
}
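
Given a list of `ProviderInfo` values, a caller can derive the `provider/model` identifiers that support a required capability. The sketch below is illustrative: `modelRefs` is a hypothetical helper, and it takes already-unwrapped provider data because the exact shape of the `client.provider.list` response is not described in these notes.

```typescript
interface ProviderInfo {
  id: string;
  name: string;
  models: Array<{
    id: string;
    name: string;
    reasoning: boolean;
    toolCall: boolean;
  }>;
}

// Returns "provider/model" identifiers for models that have every
// capability requested in `need`. An empty `need` matches all models.
function modelRefs(
  providers: ProviderInfo[],
  need: { reasoning?: boolean; toolCall?: boolean } = {},
): string[] {
  const refs: string[] = [];
  for (const provider of providers) {
    for (const model of provider.models) {
      if (need.reasoning && !model.reasoning) continue;
      if (need.toolCall && !model.toolCall) continue;
      refs.push(`${provider.id}/${model.id}`);
    }
  }
  return refs;
}
```

The identifiers use the same `provider/model` format as the CLI's `-m` flag, and splitting on the first slash gives the `providerID` and `modelID` used in prompt bodies.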

Internal Agents (Hidden from UI)

compaction
title
summary

Token Usage

interface TokenUsage {
  input: number;
  output: number;
  reasoning?: number;
  cache?: {
    read: number;
    write: number;
  };
}

Token usage is available in the message info field of assistant messages.
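
Per-message usage can be summed into a session total. The sketch below assumes the caller has already collected the `TokenUsage` objects from the assistant messages; `addUsage` and `totalUsage` are hypothetical helper names.

```typescript
interface TokenUsage {
  input: number;
  output: number;
  reasoning?: number;
  cache?: {
    read: number;
    write: number;
  };
}

// Adds two usage records. Optional fields appear in the result only if
// at least one operand has them.
function addUsage(a: TokenUsage, b: TokenUsage): TokenUsage {
  const total: TokenUsage = {
    input: a.input + b.input,
    output: a.output + b.output,
  };
  if (a.reasoning !== undefined || b.reasoning !== undefined) {
    total.reasoning = (a.reasoning ?? 0) + (b.reasoning ?? 0);
  }
  if (a.cache || b.cache) {
    total.cache = {
      read: (a.cache?.read ?? 0) + (b.cache?.read ?? 0),
      write: (a.cache?.write ?? 0) + (b.cache?.write ?? 0),
    };
  }
  return total;
}

function totalUsage(usages: TokenUsage[]): TokenUsage {
  return usages.reduce((acc, usage) => addUsage(acc, usage), { input: 0, output: 0 });
}
```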

Agent Modes

Mode	Agent ID
`build`	`"build"`
`plan`	`"plan"`

Modes map directly to OpenCode agent IDs.

Defaults

const DEFAULT_OPENCODE_MODEL_ID = "gpt-4o";
const DEFAULT_OPENCODE_PROVIDER_ID = "openai";

Concurrency Control

Server startup uses a lock to prevent race conditions:

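// Tail of the startup queue; each caller waits for the previous one.
let startLock: Promise<void> = Promise.resolve();

async function withStartLock<T>(fn: () => Promise<T>): Promise<T> {
  const prior = startLock;
  let release!: () => void;
  startLock = new Promise<void>((resolve) => { release = resolve; });
  await prior;
  try {
    return await fn();
  } finally {
    release(); // Let the next queued caller proceed, even if fn threw.
  }
}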

Working Directory

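The server must be started in the correct working directory. Because process.chdir changes the directory for the whole process, calls to this helper should not overlap (for example, by running them under the start lock):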

async function withWorkingDir<T>(workingDir: string, fn: () => Promise<T>): Promise<T> {
  const previous = process.cwd();
  process.chdir(workingDir);
  try {
    return await fn();
  } finally {
    process.chdir(previous);
  }
}

Polling Fallback

A polling mechanism checks session status every 2 seconds in case SSE events don't arrive:

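const pollInterval = setInterval(async () => {
  try {
    const session = await client.session.get({ path: { id: sessionId } });
    if (session.data?.status?.type === "idle") {
      clearInterval(pollInterval); // Stop polling once the session is idle.
      abortController.abort();     // Abort the pending event-stream wait.
    }
  } catch {
    // Ignore transient errors; the next tick retries.
  }
}, 2000);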

Notes

OpenCode is the most feature-rich runtime (streaming, questions, permissions)
The server persists for the lifetime of a change (workspace+repo+change)
Parts are streamed incrementally with delta updates
The V2 client is needed for the question/permission APIs
The working directory affects credential discovery and file operations
