mirror of
https://github.com/harivansh-afk/sandbox-agent.git
synced 2026-04-19 12:04:12 +00:00
docs: update agent docs with mode/permission info, add amp.md
parent
7e1b63a622
commit
e835f2b29b
4 changed files with 317 additions and 14 deletions
217
research/agents/amp.md
Normal file
|
|
@ -0,0 +1,217 @@
|
|||
# Amp Research
|
||||
|
||||
Research notes on Sourcegraph Amp's configuration, credential discovery, and runtime behavior.
|
||||
|
||||
## Overview
|
||||
|
||||
- **Provider**: Anthropic (via Sourcegraph)
|
||||
- **Execution Method**: CLI subprocess (`amp` command)
|
||||
- **Session Persistence**: Session ID (string)
|
||||
- **SDK**: `@sourcegraph/amp-sdk` (closed source)
|
||||
- **Binary Location**: `/usr/local/bin/amp`
|
||||
|
||||
## CLI Usage
|
||||
|
||||
### Interactive Mode
|
||||
```bash
|
||||
amp "your prompt here"
|
||||
amp --model claude-sonnet-4 "your prompt"
|
||||
```
|
||||
|
||||
### Non-Interactive Mode
|
||||
```bash
|
||||
amp --print --output-format stream-json "your prompt"
|
||||
amp --print --output-format stream-json --dangerously-skip-permissions "prompt"
|
||||
amp --continue SESSION_ID "follow up"
|
||||
```
|
||||
|
||||
### Key CLI Flags
|
||||
|
||||
| Flag | Description |
|------|-------------|
| `--print` | Output mode (non-interactive) |
| `--output-format stream-json` | JSONL streaming output |
| `--dangerously-skip-permissions` | Skip permission prompts |
| `--continue SESSION_ID` | Resume existing session |
| `--model MODEL` | Specify model |
| `--toolbox TOOLBOX` | Toolbox configuration |
|
||||
|
||||
## Credential Discovery
|
||||
|
||||
### Priority Order
|
||||
|
||||
1. Environment variable: `ANTHROPIC_API_KEY`
|
||||
2. Sourcegraph authentication
|
||||
3. Claude Code credentials (shared)
|
||||
|
||||
### Config File Locations
|
||||
|
||||
| Path | Description |
|------|-------------|
| `~/.amp/config.json` | Primary config |
| `~/.claude/.credentials.json` | Shared with Claude Code |
|
||||
|
||||
Amp can use Claude Code's OAuth credentials as a fallback.
|
||||
|
||||
## Streaming Response Format
|
||||
|
||||
Amp outputs newline-delimited JSON events:
|
||||
|
||||
```json
|
||||
{"type": "system", "subtype": "init", "session_id": "...", "tools": [...]}
|
||||
{"type": "assistant", "message": {...}, "session_id": "..."}
|
||||
{"type": "user", "message": {...}, "session_id": "..."}
|
||||
{"type": "result", "subtype": "success", "result": "...", "session_id": "..."}
|
||||
```
|
||||
|
||||
### Event Types
|
||||
|
||||
| Type | Description |
|------|-------------|
| `system` | System initialization with tools list |
| `assistant` | Assistant message with content blocks |
| `user` | User message (tool results) |
| `result` | Final result with session ID |
|
||||
|
||||
### Content Block Types
|
||||
|
||||
```typescript
|
||||
type ContentBlock =
|
||||
| { type: "text"; text: string }
|
||||
| { type: "tool_use"; id: string; name: string; input: Record<string, unknown> }
|
||||
| { type: "thinking"; thinking: string }
|
||||
| { type: "redacted_thinking"; data: string };
|
||||
```
|
||||
|
||||
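The event stream above can be folded into a session ID, the assistant text, and the final result. The following is a minimal sketch, not Amp's own parser: `summarizeAmpStream` and `AmpStreamSummary` are hypothetical names, and it assumes that assistant events carry their blocks in `message.content`, which the notes above do not spell out.

```typescript
// Event fields taken from the examples above; everything else is ignored.
interface AmpStreamEvent {
  type?: string;
  session_id?: string;
  result?: string;
  // Assumption: assistant messages expose a `content` array of blocks.
  message?: { content?: Array<{ type?: string; text?: string }> };
}

interface AmpStreamSummary {
  sessionId: string | null;
  text: string;
  result: string | null;
}

// Fold a stream-json transcript (one JSON object per line) into a summary.
function summarizeAmpStream(jsonl: string): AmpStreamSummary {
  const summary: AmpStreamSummary = { sessionId: null, text: "", result: null };
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // skip non-JSON noise instead of aborting the whole stream
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const event = parsed as AmpStreamEvent;
    if (typeof event.session_id === "string") summary.sessionId = event.session_id;
    if (event.type === "assistant") {
      const content = event.message?.content;
      for (const block of Array.isArray(content) ? content : []) {
        if (block?.type === "text" && typeof block.text === "string") {
          summary.text += block.text;
        }
      }
    } else if (event.type === "result" && typeof event.result === "string") {
      summary.result = event.result;
    }
  }
  return summary;
}
```

Blocks of type `tool_use`, `thinking`, and `redacted_thinking` are skipped here; a fuller consumer would likely surface them separately.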
## Response Schema
|
||||
|
||||
```typescript
|
||||
interface AmpResultMessage {
|
||||
type: "result";
|
||||
subtype: "success";
|
||||
duration_ms: number;
|
||||
is_error: boolean;
|
||||
num_turns: number;
|
||||
result: string;
|
||||
session_id: string;
|
||||
}
|
||||
```
|
||||
|
||||
## Session Management
|
||||
|
||||
- Session ID captured from streaming events
|
||||
- Use `--continue SESSION_ID` to resume
|
||||
- Sessions stored internally by Amp CLI
|
||||
|
||||
## Agent Modes vs Permission Modes
|
||||
|
||||
### Permission Modes (Declarative Rules)
|
||||
|
||||
Amp uses declarative permission rules configured before execution:
|
||||
|
||||
```typescript
|
||||
interface PermissionRule {
|
||||
tool: string; // Glob pattern: "Bash", "mcp__playwright__*"
|
||||
matches?: { [argumentName: string]: string | string[] | boolean };
|
||||
action: "allow" | "reject" | "ask" | "delegate";
|
||||
context?: "thread" | "subagent";
|
||||
}
|
||||
```
|
||||
|
||||
| Action | Behavior |
|--------|----------|
| `allow` | Automatically permit |
| `reject` | Automatically deny |
| `ask` | Prompt user (CLI handles internally) |
| `delegate` | Delegate to subagent context |
|
||||
|
||||
### Example Rules
|
||||
|
||||
```typescript
|
||||
const permissions: PermissionRule[] = [
|
||||
{ tool: "Read", action: "allow" },
|
||||
{ tool: "Bash", matches: { command: "git *" }, action: "allow" },
|
||||
{ tool: "Write", action: "ask" },
|
||||
{ tool: "mcp__*", action: "reject" }
|
||||
];
|
||||
```
|
||||
|
||||
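To make the rule semantics concrete, here is one way such rules could be evaluated. This is a sketch under stated assumptions, not Amp's actual matcher: it assumes the first matching rule wins, that `*` matches any run of characters, and that an unmatched tool call falls back to `"ask"`. `decidePermission`, `globToRegExp`, and `argumentMatches` are hypothetical helpers.

```typescript
type PermissionAction = "allow" | "reject" | "ask" | "delegate";

interface PermissionRule {
  tool: string;
  matches?: { [argumentName: string]: string | string[] | boolean };
  action: PermissionAction;
  context?: "thread" | "subagent";
}

// Translate a "*" glob into an anchored regular expression.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Booleans compare by equality; strings and string arrays are treated as globs.
function argumentMatches(pattern: string | string[] | boolean, value: unknown): boolean {
  if (typeof pattern === "boolean") return value === pattern;
  const patterns = Array.isArray(pattern) ? pattern : [pattern];
  return typeof value === "string" && patterns.some((p) => globToRegExp(p).test(value));
}

// Assumption: rules are checked in order and the first match decides.
function decidePermission(
  rules: PermissionRule[],
  tool: string,
  args: Record<string, unknown>,
): PermissionAction {
  for (const rule of rules) {
    if (!globToRegExp(rule.tool).test(tool)) continue;
    const conditions = Object.entries(rule.matches ?? {});
    if (conditions.every(([name, pattern]) => argumentMatches(pattern, args[name]))) {
      return rule.action;
    }
  }
  return "ask"; // assumed default when no rule matches
}
```

With the example rules above, a `Bash` call running `git status` would resolve to `allow`, while any `mcp__*` tool would resolve to `reject`.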
### Agent Modes
|
||||
|
||||
Amp has no documented agent mode concept. Behavior is controlled via:
|
||||
- `--toolbox` flag for different tool configurations
|
||||
- Permission rules for capability restrictions
|
||||
|
||||
### Bypass All Permissions
|
||||
|
||||
```bash
|
||||
amp --dangerously-skip-permissions "prompt"
|
||||
```
|
||||
|
||||
Or via SDK:
|
||||
```typescript
|
||||
execute(prompt, { dangerouslyAllowAll: true });
|
||||
```
|
||||
|
||||
## Human-in-the-Loop
|
||||
|
||||
### No Interactive HITL API
|
||||
|
||||
While permission rules support the `"ask"` action, Amp does not expose an SDK-level API for programmatically responding to permission requests. The CLI handles user interaction internally.
|
||||
|
||||
For universal API integration, Amp should be run with:
|
||||
- Pre-configured permission rules, or
|
||||
- `dangerouslyAllowAll: true` to bypass
|
||||
|
||||
## SDK Usage
|
||||
|
||||
```typescript
import { execute } from '@sourcegraph/amp-sdk';

// Option shape, exported by the SDK under the name `AmpOptions`
interface AmpOptions {
  cwd?: string;
  dangerouslyAllowAll?: boolean;
  toolbox?: string;
  mcpConfig?: MCPConfig;
  permissions?: PermissionRule[];
  continue?: boolean | string;
}

const result = await execute(prompt, options);
```
|
||||
|
||||
## Installation
|
||||
|
||||
```bash
|
||||
# Get latest version
|
||||
VERSION=$(curl -s https://storage.googleapis.com/amp-public-assets-prod-0/cli/cli-version.txt)
|
||||
|
||||
# Linux x64
|
||||
curl -fsSL "https://storage.googleapis.com/amp-public-assets-prod-0/cli/${VERSION}/amp-linux-x64" \
|
||||
-o /usr/local/bin/amp && chmod +x /usr/local/bin/amp
|
||||
|
||||
# Linux ARM64
|
||||
curl -fsSL "https://storage.googleapis.com/amp-public-assets-prod-0/cli/${VERSION}/amp-linux-arm64" \
|
||||
-o /usr/local/bin/amp && chmod +x /usr/local/bin/amp
|
||||
|
||||
# macOS ARM64
|
||||
curl -fsSL "https://storage.googleapis.com/amp-public-assets-prod-0/cli/${VERSION}/amp-darwin-arm64" \
|
||||
-o /usr/local/bin/amp && chmod +x /usr/local/bin/amp
|
||||
|
||||
# macOS x64
|
||||
curl -fsSL "https://storage.googleapis.com/amp-public-assets-prod-0/cli/${VERSION}/amp-darwin-x64" \
|
||||
-o /usr/local/bin/amp && chmod +x /usr/local/bin/amp
|
||||
```
|
||||
|
||||
## Timeout
|
||||
|
||||
- Default timeout: 5 minutes (300,000 ms)
|
||||
- Process killed with `SIGTERM` on timeout
|
||||
|
||||
## Notes
|
||||
|
||||
- Amp is similar to Claude Code (same streaming format)
|
||||
- Can share credentials with Claude Code
|
||||
- No interactive HITL - must use pre-configured permissions
|
||||
- SDK is closed source but types are documented
|
||||
- MCP server integration supported via `mcpConfig`
|
||||
197
research/agents/claude.md
Normal file
|
|
@ -0,0 +1,197 @@
|
|||
# Claude Code Research
|
||||
|
||||
Research notes on Claude Code's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.
|
||||
|
||||
## Overview
|
||||
|
||||
- **Provider**: Anthropic
|
||||
- **Execution Method**: CLI subprocess (`claude` command)
|
||||
- **Session Persistence**: Session ID (string)
|
||||
- **SDK**: None (spawns CLI directly)
|
||||
|
||||
## Credential Discovery
|
||||
|
||||
### Priority Order
|
||||
|
||||
1. User-configured credentials (passed as `ANTHROPIC_API_KEY` env var)
|
||||
2. Environment variables: `ANTHROPIC_API_KEY` or `CLAUDE_API_KEY`
|
||||
3. Bootstrap extraction from config files
|
||||
4. OAuth fallback (Claude CLI handles internally)
|
||||
|
||||
### Config File Locations
|
||||
|
||||
| Path | Description |
|------|-------------|
| `~/.claude.json.api` | API key config (highest priority) |
| `~/.claude.json` | General config |
| `~/.claude.json.nathan` | User-specific backup (custom) |
| `~/.claude/.credentials.json` | OAuth credentials |
| `~/.claude-oauth-credentials.json` | Docker mount alternative for OAuth |
|
||||
|
||||
### API Key Field Names (checked in order)
|
||||
|
||||
```json
|
||||
{
|
||||
"primaryApiKey": "sk-ant-...",
|
||||
"apiKey": "sk-ant-...",
|
||||
"anthropicApiKey": "sk-ant-...",
|
||||
"customApiKey": "sk-ant-..."
|
||||
}
|
||||
```
|
||||
|
||||
Keys must start with `sk-ant-` prefix to be valid.
|
||||
|
||||
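The field lookup described above can be sketched as a small pure function. This is an illustration rather than the agent-jj code: `extractClaudeApiKey` is a hypothetical name, and it assumes that a field holding an invalid value is skipped so that later fields still get a chance.

```typescript
// Field names in the order listed above.
const API_KEY_FIELDS = ["primaryApiKey", "apiKey", "anthropicApiKey", "customApiKey"] as const;

// Return the first field value that looks like an Anthropic API key, or null.
function extractClaudeApiKey(config: unknown): string | null {
  if (typeof config !== "object" || config === null) return null;
  const record = config as Record<string, unknown>;
  for (const field of API_KEY_FIELDS) {
    const value = record[field];
    // Assumption: a value without the sk-ant- prefix is ignored, not fatal.
    if (typeof value === "string" && value.startsWith("sk-ant-")) return value;
  }
  return null;
}
```

The input would be the parsed contents of one of the config files listed above, checked file by file in priority order.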
### OAuth Structure
|
||||
|
||||
```json
|
||||
// ~/.claude/.credentials.json
|
||||
{
|
||||
"claudeAiOauth": {
|
||||
"accessToken": "...",
|
||||
"expiresAt": "2024-01-01T00:00:00Z"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
OAuth tokens are validated for expiry before use.
|
||||
|
||||
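A minimal sketch of that expiry check, assuming the file shape shown above. `usableClaudeOAuthToken` is a hypothetical helper; accepting a numeric millisecond timestamp in addition to the ISO string is an assumption added for robustness, not something the notes confirm.

```typescript
interface ClaudeCredentialsFile {
  claudeAiOauth?: {
    accessToken?: string;
    // ISO-8601 string as in the example above; a number is treated as epoch ms (assumption).
    expiresAt?: string | number;
  };
}

// Return the access token only if it is present and not yet expired.
function usableClaudeOAuthToken(
  file: ClaudeCredentialsFile,
  nowMs: number = Date.now(),
): string | null {
  const oauth = file.claudeAiOauth;
  if (!oauth || !oauth.accessToken || oauth.expiresAt === undefined) return null;
  const expiresAtMs =
    typeof oauth.expiresAt === "number" ? oauth.expiresAt : Date.parse(oauth.expiresAt);
  if (Number.isNaN(expiresAtMs)) return null; // unparseable expiry is treated as invalid
  return expiresAtMs > nowMs ? oauth.accessToken : null;
}
```

Passing `nowMs` explicitly keeps the check deterministic in tests; callers would normally rely on the default.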
## CLI Invocation
|
||||
|
||||
### Command Structure
|
||||
|
||||
```bash
|
||||
claude \
|
||||
--print \
|
||||
--output-format stream-json \
|
||||
--verbose \
|
||||
--dangerously-skip-permissions \
|
||||
[--resume SESSION_ID] \
|
||||
[--model MODEL_ID] \
|
||||
[--permission-mode plan] \
|
||||
"PROMPT"
|
||||
```
|
||||
|
||||
### Arguments
|
||||
|
||||
| Flag | Description |
|------|-------------|
| `--print` | Output mode |
| `--output-format stream-json` | Newline-delimited JSON streaming |
| `--verbose` | Verbose output |
| `--dangerously-skip-permissions` | Skip permission prompts |
| `--resume SESSION_ID` | Resume existing session |
| `--model MODEL_ID` | Specify model (e.g., `claude-sonnet-4-20250514`) |
| `--permission-mode plan` | Plan mode (read-only exploration) |
|
||||
|
||||
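Assembling the argument vector from the flags above might look like the following sketch. `buildClaudeArgs` and `ClaudeRunOptions` are hypothetical names; unlike the command structure shown earlier, this version makes `--dangerously-skip-permissions` opt-in, which is a design choice of the sketch rather than documented behavior.

```typescript
interface ClaudeRunOptions {
  prompt: string;
  sessionId?: string;
  model?: string;
  planMode?: boolean;
  skipPermissions?: boolean;
}

// Build the argv passed to the `claude` binary (without the binary name itself).
function buildClaudeArgs(options: ClaudeRunOptions): string[] {
  const args = ["--print", "--output-format", "stream-json", "--verbose"];
  if (options.skipPermissions) args.push("--dangerously-skip-permissions");
  if (options.sessionId) args.push("--resume", options.sessionId);
  if (options.model) args.push("--model", options.model);
  if (options.planMode) args.push("--permission-mode", "plan");
  args.push(options.prompt); // the prompt is always the final positional argument
  return args;
}
```

Passing the result as an array to a process spawner avoids shell quoting issues with prompts that contain spaces or quotes.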
### Environment Variables
|
||||
|
||||
Only `ANTHROPIC_API_KEY` is passed if an API key is found. If no key is found, Claude CLI uses its built-in OAuth flow from `~/.claude/.credentials.json`.
|
||||
|
||||
## Streaming Response Format
|
||||
|
||||
Claude CLI outputs newline-delimited JSON events:
|
||||
|
||||
```json
|
||||
{"type": "assistant", "message": {"content": [{"type": "text", "text": "..."}]}}
|
||||
{"type": "tool_use", "tool_use": {"name": "Read", "input": {...}}}
|
||||
{"type": "result", "result": "Final response text", "session_id": "abc123"}
|
||||
```
|
||||
|
||||
### Event Types
|
||||
|
||||
| Type | Description |
|------|-------------|
| `assistant` | Assistant message with content blocks |
| `tool_use` | Tool invocation |
| `tool_result` | Tool result (may include `is_error`) |
| `result` | Final result with session ID |
|
||||
|
||||
### Content Block Types
|
||||
|
||||
```typescript
|
||||
{
|
||||
type: "text" | "tool_use";
|
||||
text?: string;
|
||||
name?: string; // tool name
|
||||
input?: object; // tool input
|
||||
}
|
||||
```
|
||||
|
||||
## Response Schema
|
||||
|
||||
```typescript
|
||||
// ClaudeCliResponseSchema
|
||||
{
|
||||
result?: string; // Final response text
|
||||
session_id?: string; // Session ID for resumption
|
||||
structured_output?: unknown; // Optional structured output
|
||||
error?: unknown; // Error information
|
||||
}
|
||||
```
|
||||
|
||||
## Session Management
|
||||
|
||||
- Session ID is captured from streaming events (`event.session_id`)
|
||||
- Use `--resume SESSION_ID` to continue a session
|
||||
- Sessions are stored internally by Claude CLI
|
||||
|
||||
## Timeout
|
||||
|
||||
- Default timeout: 5 minutes (300,000 ms)
|
||||
- Process is killed with `SIGTERM` on timeout
|
||||
|
||||
## Agent Modes vs Permission Modes
|
||||
|
||||
Claude conflates agent mode and permission mode - `plan` is a permission restriction that forces planning behavior.
|
||||
|
||||
### Permission Modes
|
||||
|
||||
| Mode | CLI Flag | Behavior |
|------|----------|----------|
| `default` | (none) | Normal permission prompts |
| `acceptEdits` | `--permission-mode acceptEdits` | Auto-accept file edits |
| `plan` | `--permission-mode plan` | Read-only, must ExitPlanMode to execute |
| `bypassPermissions` | `--dangerously-skip-permissions` | Skip all permission checks |
|
||||
|
||||
### Subagent Types
|
||||
|
||||
Claude supports spawning subagents via the `Task` tool with `subagent_type`:
|
||||
- Custom agents defined in config
|
||||
- Built-in agents like "Explore", "Plan"
|
||||
|
||||
### ExitPlanMode (Plan Approval)
|
||||
|
||||
When in `plan` permission mode, the agent invokes the `ExitPlanMode` tool to request execution:
|
||||
|
||||
```typescript
|
||||
interface ExitPlanModeInput {
|
||||
allowedPrompts?: Array<{
|
||||
tool: "Bash";
|
||||
prompt: string; // e.g., "run tests"
|
||||
}>;
|
||||
}
|
||||
```
|
||||
|
||||
This triggers a user approval event. In the universal API, this is converted to a question event with approve/reject options.
|
||||
|
||||
## Error Handling
|
||||
|
||||
- Non-zero exit codes result in errors
|
||||
- stderr is captured and included in error messages
|
||||
- Spawn errors are caught separately
|
||||
|
||||
## Conversion to Universal Format
|
||||
|
||||
Claude output is converted via `convertClaudeOutput()`:
|
||||
|
||||
1. If response is a string, wrap as assistant message
|
||||
2. If response is object with `result` field, extract content
|
||||
3. Parse with `ClaudeCliResponseSchema` as fallback
|
||||
4. Extract `structured_output` as metadata if present
|
||||
|
||||
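The conversion steps above could be sketched roughly as follows. This is not the real `convertClaudeOutput()`: the `UniversalMessage` shape is a hypothetical stand-in for the universal format, and the schema-parsing fallback (step 3) is reduced to a plain type check.

```typescript
// Hypothetical stand-in for an entry in the universal format.
interface UniversalMessage {
  role: "assistant";
  content: string;
  metadata?: { structuredOutput: unknown };
}

function convertClaudeOutputSketch(output: unknown): UniversalMessage | null {
  // Step 1: a bare string is wrapped as an assistant message.
  if (typeof output === "string") return { role: "assistant", content: output };
  if (typeof output !== "object" || output === null) return null;

  // Steps 2 and 3: read `result` from the response object (schema validation simplified).
  const response = output as { result?: unknown; structured_output?: unknown };
  if (typeof response.result !== "string") return null;
  const message: UniversalMessage = { role: "assistant", content: response.result };

  // Step 4: carry structured output along as metadata when present.
  if (response.structured_output !== undefined) {
    message.metadata = { structuredOutput: response.structured_output };
  }
  return message;
}
```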
## Notes
|
||||
|
||||
- Claude CLI manages its own OAuth refresh internally
|
||||
- No SDK dependency - direct CLI subprocess
|
||||
- stdin is closed immediately after spawn
|
||||
- Working directory is set via `cwd` option on spawn
|
||||
270
research/agents/codex.md
Normal file
|
|
@ -0,0 +1,270 @@
|
|||
# Codex Research
|
||||
|
||||
Research notes on OpenAI Codex's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.
|
||||
|
||||
## Overview
|
||||
|
||||
- **Provider**: OpenAI
|
||||
- **Execution Method**: SDK (`@openai/codex-sdk`) or CLI binary
|
||||
- **Session Persistence**: Thread ID (string)
|
||||
- **Import**: Dynamic import to avoid bundling issues
|
||||
- **Binary Location**: `~/.nvm/versions/node/v24.3.0/bin/codex` (npm global install under nvm; the exact path depends on the installed Node version)
|
||||
|
||||
## SDK Architecture
|
||||
|
||||
**The SDK wraps a bundled binary** - it does NOT make direct API calls.
|
||||
|
||||
- The TypeScript SDK includes a pre-compiled Codex binary
|
||||
- When you use the SDK, it spawns this binary as a child process
|
||||
- Communication happens via stdin/stdout using JSONL (JSON Lines) format
|
||||
- The binary itself handles the actual communication with OpenAI's backend services
|
||||
|
||||
Sources: [Codex SDK docs](https://developers.openai.com/codex/sdk/), [GitHub](https://github.com/openai/codex)
|
||||
|
||||
## CLI Usage (Alternative to SDK)
|
||||
|
||||
You can use the `codex` binary directly instead of the SDK:
|
||||
|
||||
### Interactive Mode
|
||||
```bash
|
||||
codex "your prompt here"
|
||||
codex --model o3 "your prompt"
|
||||
```
|
||||
|
||||
### Non-Interactive Mode (`codex exec`)
|
||||
```bash
|
||||
codex exec "your prompt here"
|
||||
codex exec --json "your prompt" # JSONL output
|
||||
codex exec -m o3 "your prompt"
|
||||
codex exec --dangerously-bypass-approvals-and-sandbox "prompt"
|
||||
codex exec resume --last # Resume previous session
|
||||
```
|
||||
|
||||
### Key CLI Flags
|
||||
| Flag | Description |
|------|-------------|
| `--json` | Print events to stdout as JSONL |
| `-m, --model MODEL` | Model to use |
| `-s, --sandbox MODE` | `read-only`, `workspace-write`, `danger-full-access` |
| `--full-auto` | Auto-approve with workspace-write sandbox |
| `--dangerously-bypass-approvals-and-sandbox` | Skip all prompts (dangerous) |
| `-C, --cd DIR` | Working directory |
| `-o, --output-last-message FILE` | Write final response to file |
| `--output-schema FILE` | JSON Schema for structured output |
|
||||
|
||||
### Session Management
|
||||
```bash
|
||||
codex resume # Pick from previous sessions
|
||||
codex resume --last # Resume most recent
|
||||
codex fork --last # Fork most recent session
|
||||
```
|
||||
|
||||
## Credential Discovery
|
||||
|
||||
### Priority Order
|
||||
|
||||
1. User-configured credentials (from `credentials` array)
|
||||
2. Environment variable: `CODEX_API_KEY`
|
||||
3. Environment variable: `OPENAI_API_KEY`
|
||||
4. Bootstrap extraction from config files
|
||||
|
||||
### Config File Location
|
||||
|
||||
| Path | Description |
|------|-------------|
| `~/.codex/auth.json` | Primary auth config |
|
||||
|
||||
### Auth File Structure
|
||||
|
||||
```json
|
||||
// API Key authentication
|
||||
{
|
||||
"OPENAI_API_KEY": "sk-..."
|
||||
}
|
||||
|
||||
// OAuth authentication
|
||||
{
|
||||
"tokens": {
|
||||
"access_token": "..."
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
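The priority order and the two `auth.json` shapes can be combined into one resolver. The sketch below is illustrative only: `resolveCodexCredential` is a hypothetical helper, the user-configured `credentials` array is simplified to a single optional string, and preferring the file's API key over its OAuth token is an assumption.

```typescript
type CodexCredentialSource = "user" | "CODEX_API_KEY" | "OPENAI_API_KEY" | "auth.json";

interface CodexCredential {
  source: CodexCredentialSource;
  kind: "api_key" | "oauth";
  value: string;
}

function resolveCodexCredential(
  userKey: string | undefined,
  env: Record<string, string | undefined>,
  authFile: unknown, // parsed contents of ~/.codex/auth.json, if any
): CodexCredential | null {
  // 1. User-configured credential (simplified to one key for this sketch).
  if (userKey) return { source: "user", kind: "api_key", value: userKey };

  // 2 and 3. Environment variables, CODEX_API_KEY first.
  const codexKey = env["CODEX_API_KEY"];
  if (codexKey) return { source: "CODEX_API_KEY", kind: "api_key", value: codexKey };
  const openaiKey = env["OPENAI_API_KEY"];
  if (openaiKey) return { source: "OPENAI_API_KEY", kind: "api_key", value: openaiKey };

  // 4. Bootstrap extraction from the auth file.
  if (typeof authFile !== "object" || authFile === null) return null;
  const auth = authFile as { OPENAI_API_KEY?: unknown; tokens?: { access_token?: unknown } | null };
  if (typeof auth.OPENAI_API_KEY === "string" && auth.OPENAI_API_KEY !== "") {
    return { source: "auth.json", kind: "api_key", value: auth.OPENAI_API_KEY };
  }
  const token = auth.tokens?.access_token;
  if (typeof token === "string" && token !== "") {
    return { source: "auth.json", kind: "oauth", value: token };
  }
  return null;
}
```

Returning the `source` alongside the value makes it easier to log which credential was picked without logging the secret itself.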
## SDK Usage
|
||||
|
||||
### Client Initialization
|
||||
|
||||
```typescript
import { Codex } from "@openai/codex-sdk";

// With an explicit API key
const codex = new Codex({ apiKey: "sk-..." });

// Without an API key (uses default auth)
const codexWithDefaultAuth = new Codex();
```
|
||||
|
||||
Dynamic import is used to avoid bundling the SDK:
|
||||
```typescript
|
||||
const { Codex } = await import("@openai/codex-sdk");
|
||||
```
|
||||
|
||||
### Thread Management
|
||||
|
||||
```typescript
// Start a new thread
const thread = codex.startThread();

// Resume an existing thread
const resumedThread = codex.resumeThread(threadId);
```
|
||||
|
||||
### Running Prompts
|
||||
|
||||
```typescript
|
||||
const { events } = await thread.runStreamed(prompt);
|
||||
|
||||
for await (const event of events) {
|
||||
// Process events
|
||||
}
|
||||
```
|
||||
|
||||
## Event Types
|
||||
|
||||
| Event Type | Description |
|------------|-------------|
| `thread.started` | Thread initialized, contains `thread_id` |
| `item.completed` | Item finished, check for `agent_message` type |
| `turn.failed` | Turn failed with error message |
|
||||
|
||||
### Event Structure
|
||||
|
||||
```typescript
|
||||
// thread.started
|
||||
{
|
||||
type: "thread.started",
|
||||
thread_id: "thread_abc123"
|
||||
}
|
||||
|
||||
// item.completed (agent message)
|
||||
{
|
||||
type: "item.completed",
|
||||
item: {
|
||||
type: "agent_message",
|
||||
text: "Response text"
|
||||
}
|
||||
}
|
||||
|
||||
// turn.failed
|
||||
{
|
||||
type: "turn.failed",
|
||||
error: {
|
||||
message: "Error description"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Response Schema
|
||||
|
||||
```typescript
|
||||
// CodexRunResultSchema
|
||||
type CodexRunResult = string | {
|
||||
result?: string;
|
||||
output?: string;
|
||||
message?: string;
|
||||
// ...additional fields via passthrough
|
||||
};
|
||||
```
|
||||
|
||||
Content is extracted in priority order: `result` > `output` > `message`
|
||||
|
||||
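As a small illustration of that priority order, assuming the `CodexRunResult` shape above (`extractCodexContent` is a hypothetical name):

```typescript
type CodexRunResult =
  | string
  | { result?: string; output?: string; message?: string; [key: string]: unknown };

// Pick the first defined field in the order result > output > message.
function extractCodexContent(run: CodexRunResult): string | null {
  if (typeof run === "string") return run;
  return run.result ?? run.output ?? run.message ?? null;
}
```

Because `??` only skips `null` and `undefined`, an empty `result` string still takes precedence over a non-empty `output`; whether the original implementation behaves the same way is not stated in these notes.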
## Thread ID Retrieval
|
||||
|
||||
Thread ID can be obtained from multiple sources:
|
||||
|
||||
1. `thread.started` event's `thread_id` property
|
||||
2. Thread object's `id` getter (after first turn)
|
||||
3. Thread object's `threadId` or `_id` properties (fallbacks)
|
||||
|
||||
```typescript
|
||||
function getThreadId(thread: unknown): string | null {
|
||||
const value = thread as { id?: string; threadId?: string; _id?: string };
|
||||
return value.id ?? value.threadId ?? value._id ?? null;
|
||||
}
|
||||
```
|
||||
|
||||
## Agent Modes vs Permission Modes
|
||||
|
||||
Codex separates sandbox levels (permissions) from behavioral modes (prompt prefixes).
|
||||
|
||||
### Permission Modes (Sandbox Levels)
|
||||
|
||||
| Mode | CLI Flag | Behavior |
|------|----------|----------|
| `read-only` | `-s read-only` | No file modifications |
| `workspace-write` | `-s workspace-write` | Can modify workspace files |
| `danger-full-access` | `-s danger-full-access` | Full system access |
| `bypass` | `--dangerously-bypass-approvals-and-sandbox` | Skip all checks |
|
||||
|
||||
### Agent Modes (Prompt Prefixes)
|
||||
|
||||
Codex doesn't have true agent modes - behavior is controlled via prompt prefixing:
|
||||
|
||||
| Mode | Prompt Prefix |
|------|---------------|
| `build` | No prefix (default) |
| `plan` | `"Make a plan before acting.\n\n"` |
| `chat` | `"Answer conversationally.\n\n"` |
|
||||
|
||||
```typescript
|
||||
function withModePrefix(prompt: string, mode: AgentMode): string {
|
||||
if (mode === "plan") {
|
||||
return `Make a plan before acting.\n\n${prompt}`;
|
||||
}
|
||||
if (mode === "chat") {
|
||||
return `Answer conversationally.\n\n${prompt}`;
|
||||
}
|
||||
return prompt;
|
||||
}
|
||||
```
|
||||
|
||||
### Human-in-the-Loop
|
||||
|
||||
Codex has no interactive HITL in SDK mode. All permissions must be configured upfront via sandbox level.
|
||||
|
||||
## Error Handling
|
||||
|
||||
- `turn.failed` events are captured but don't throw
|
||||
- Thread ID is still returned on error for potential resumption
|
||||
- Events iterator may throw after errors - caught and logged
|
||||
|
||||
```typescript
|
||||
interface CodexPromptResult {
|
||||
result: unknown;
|
||||
threadId?: string | null;
|
||||
error?: string; // Set if turn failed
|
||||
}
|
||||
```
|
||||
|
||||
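One way to get the "capture but don't throw" behavior is to fold each event into a `CodexPromptResult` with a pure reducer. This is a sketch based on the event shapes documented earlier, not the agent-jj code; `applyCodexEvent` is a hypothetical name.

```typescript
type CodexEvent =
  | { type: "thread.started"; thread_id: string }
  | { type: "item.completed"; item: { type: string; text?: string } }
  | { type: "turn.failed"; error: { message: string } };

interface CodexPromptResult {
  result: unknown;
  threadId?: string | null;
  error?: string; // Set if turn failed
}

// Fold one streamed event into the running result without throwing.
function applyCodexEvent(state: CodexPromptResult, event: CodexEvent): CodexPromptResult {
  switch (event.type) {
    case "thread.started":
      return { ...state, threadId: event.thread_id };
    case "item.completed":
      // Only agent messages contribute to the final result.
      return event.item.type === "agent_message"
        ? { ...state, result: event.item.text ?? "" }
        : state;
    case "turn.failed":
      // Record the failure but keep the thread ID for potential resumption.
      return { ...state, error: event.error.message };
    default:
      return state;
  }
}
```

Inside a `for await` loop over the SDK's event stream, the caller would reassign `state = applyCodexEvent(state, event)` and wrap the loop in `try`/`catch` so that an iterator error is logged rather than propagated.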
## Conversion to Universal Format
|
||||
|
||||
Codex output is converted via `convertCodexOutput()`:
|
||||
|
||||
1. Parse with `CodexRunResultSchema`
|
||||
2. If result is string, use directly
|
||||
3. Otherwise extract from `result`, `output`, or `message` fields
|
||||
4. Wrap as assistant message entry
|
||||
|
||||
## Session Continuity
|
||||
|
||||
- Thread ID persists across prompts
|
||||
- Use `resumeThread(threadId)` to continue conversation
|
||||
- Thread ID is captured from `thread.started` event or thread object
|
||||
|
||||
## Notes
|
||||
|
||||
- SDK is dynamically imported to reduce bundle size
|
||||
- No explicit timeout (relies on SDK defaults)
|
||||
- Thread ID may not be available until first event
|
||||
- Error messages are preserved for debugging
|
||||
- Working directory is not explicitly set (SDK handles internally)
|
||||
518
research/agents/opencode.md
Normal file
|
|
@ -0,0 +1,518 @@
|
|||
# OpenCode Research
|
||||
|
||||
Research notes on OpenCode's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.
|
||||
|
||||
## Overview
|
||||
|
||||
- **Provider**: Multi-provider (OpenAI, Anthropic, others)
|
||||
- **Execution Method**: Embedded server via SDK, or CLI binary
|
||||
- **Session Persistence**: Session ID (string)
|
||||
- **SDK**: `@opencode-ai/sdk` (server + client)
|
||||
- **Binary Location**: `~/.opencode/bin/opencode`
|
||||
- **Written in**: Go (with Bubble Tea TUI)
|
||||
|
||||
## CLI Usage (Alternative to SDK)
|
||||
|
||||
OpenCode can be used as a standalone binary instead of embedding the SDK:
|
||||
|
||||
### Interactive TUI Mode
|
||||
```bash
|
||||
opencode # Start TUI in current directory
|
||||
opencode /path/to/project # Start in specific directory
|
||||
opencode -c # Continue last session
|
||||
opencode -s SESSION_ID # Continue specific session
|
||||
```
|
||||
|
||||
### Non-Interactive Mode (`opencode run`)
|
||||
```bash
|
||||
opencode run "your prompt here"
|
||||
opencode run --format json "prompt" # Raw JSON events output
|
||||
opencode run -m anthropic/claude-sonnet-4-20250514 "prompt"
|
||||
opencode run --agent plan "analyze this code"
|
||||
opencode run -c "follow up question" # Continue last session
|
||||
opencode run -s SESSION_ID "prompt" # Continue specific session
|
||||
opencode run -f file1.ts -f file2.ts "review these files"
|
||||
```
|
||||
|
||||
### Key CLI Flags
|
||||
| Flag | Description |
|------|-------------|
| `--format json` | Output raw JSON events (for parsing) |
| `-m, --model PROVIDER/MODEL` | Model in format `provider/model` |
| `--agent AGENT` | Agent to use (`build`, `plan`) |
| `-c, --continue` | Continue last session |
| `-s, --session ID` | Continue specific session |
| `-f, --file FILE` | Attach file(s) to message |
| `--attach URL` | Attach to running server |
| `--port PORT` | Local server port |
| `--variant VARIANT` | Reasoning effort (e.g., `high`, `max`) |
|
||||
|
||||
### Headless Server Mode
|
||||
```bash
|
||||
opencode serve # Start headless server
|
||||
opencode serve --port 4096 # Specific port
|
||||
opencode attach http://localhost:4096 # Attach to running server
|
||||
```
|
||||
|
||||
### Other Commands
|
||||
```bash
|
||||
opencode models # List available models
|
||||
opencode models anthropic # List models for provider
|
||||
opencode auth # Manage credentials
|
||||
opencode session # Manage sessions
|
||||
opencode export SESSION_ID # Export session as JSON
|
||||
opencode stats # Token usage statistics
|
||||
```
|
||||
|
||||
Sources: [OpenCode GitHub](https://github.com/opencode-ai/opencode), [OpenCode Docs](https://opencode.ai/docs/cli/)
|
||||
|
||||
## Architecture
|
||||
|
||||
OpenCode runs as an embedded HTTP server per workspace/change:
|
||||
|
||||
```
|
||||
┌─────────────────────┐
|
||||
│ agent-jj backend │
|
||||
│ │
|
||||
│ ┌───────────────┐ │
|
||||
│ │ OpenCode │ │
|
||||
│ │ Server │◄─┼── HTTP API
|
||||
│ │ (per change) │ │
|
||||
│ └───────────────┘ │
|
||||
└─────────────────────┘
|
||||
```
|
||||
|
||||
- One server per `changeId` (workspace+repo+change combination)
|
||||
- Multiple sessions can share a server
|
||||
- Server runs on dynamic port (4200-4300 range)
|
||||
|
||||
## Credential Discovery
|
||||
|
||||
### Priority Order
|
||||
|
||||
1. Environment variables: `ANTHROPIC_API_KEY`, `CLAUDE_API_KEY`
|
||||
2. Environment variables: `OPENAI_API_KEY`, `CODEX_API_KEY`
|
||||
3. Claude Code config files
|
||||
4. Codex config files
|
||||
5. OpenCode config files
|
||||
|
||||
### Config File Location
|
||||
|
||||
| Path | Description |
|------|-------------|
| `~/.local/share/opencode/auth.json` | Primary auth config |
|
||||
|
||||
### Auth File Structure
|
||||
|
||||
```json
|
||||
{
|
||||
"anthropic": {
|
||||
"type": "api",
|
||||
"key": "sk-ant-..."
|
||||
},
|
||||
"openai": {
|
||||
"type": "api",
|
||||
"key": "sk-..."
|
||||
},
|
||||
"custom-provider": {
|
||||
"type": "oauth",
|
||||
"access": "token...",
|
||||
"refresh": "refresh-token...",
|
||||
"expires": 1704067200000
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Provider Config Types
|
||||
|
||||
```typescript
|
||||
interface OpenCodeProviderConfig {
|
||||
type: "api" | "oauth";
|
||||
key?: string; // For API type
|
||||
access?: string; // For OAuth type
|
||||
refresh?: string; // For OAuth type
|
||||
expires?: number; // Unix timestamp (ms)
|
||||
}
|
||||
```
|
||||
|
||||
OAuth tokens are validated for expiry before use.
|
||||
|
||||
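A minimal sketch of picking a usable secret from one provider entry, using the `OpenCodeProviderConfig` shape above. `usableOpenCodeSecret` is a hypothetical helper, and treating an OAuth entry without `expires` as non-expiring is an assumption.

```typescript
interface OpenCodeProviderConfig {
  type: "api" | "oauth";
  key?: string;
  access?: string;
  refresh?: string;
  expires?: number; // Unix timestamp (ms)
}

// Return the API key or a still-valid OAuth access token, or null.
function usableOpenCodeSecret(
  config: OpenCodeProviderConfig | undefined,
  nowMs: number = Date.now(),
): string | null {
  if (!config) return null;
  if (config.type === "api") return config.key ?? null;
  if (!config.access) return null;
  // Assumption: a missing `expires` means the token does not expire.
  if (config.expires !== undefined && config.expires <= nowMs) return null;
  return config.access;
}
```

This sketch does not attempt a refresh when the token has expired; it only reports that no usable secret is available.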
## Server Management
|
||||
|
||||
### Starting a Server
|
||||
|
||||
```typescript
import { createOpencodeServer } from "@opencode-ai/sdk/server";
import { createOpencodeClient } from "@opencode-ai/sdk";

// Port shared by the server and the client below
const port = 4200;

const server = await createOpencodeServer({
  hostname: "127.0.0.1",
  port,
  config: { logLevel: "DEBUG" }
});

const client = createOpencodeClient({
  baseUrl: `http://127.0.0.1:${port}`
});
```
|
||||
|
||||
### Server Configuration
|
||||
|
||||
```typescript
|
||||
// From config.json
|
||||
{
|
||||
"opencode": {
|
||||
"host": "127.0.0.1", // Bind address
|
||||
"advertisedHost": "127.0.0.1" // External address (for tunnels)
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Port Selection
|
||||
|
||||
The `get-port` package is used to find an available port in the range 4200-4300.
|
||||
|
||||
## Client API
|
||||
|
||||
### Session Management
|
||||
|
||||
```typescript
|
||||
// Create session
|
||||
const response = await client.session.create({});
|
||||
const sessionId = response.data.id;
|
||||
|
||||
// Get session info
|
||||
const session = await client.session.get({ path: { id: sessionId } });
|
||||
|
||||
// Get session messages
|
||||
const messages = await client.session.messages({ path: { id: sessionId } });
|
||||
|
||||
// Get session todos
|
||||
const todos = await client.session.todo({ path: { id: sessionId } });
|
||||
```
|
||||
|
||||
### Sending Prompts
|
||||
|
||||
#### Synchronous
|
||||
|
||||
```typescript
|
||||
const response = await client.session.prompt({
|
||||
path: { id: sessionId },
|
||||
body: {
|
||||
model: { providerID: "openai", modelID: "gpt-4o" },
|
||||
agent: "build",
|
||||
parts: [{ type: "text", text: "prompt text" }]
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
#### Asynchronous (Streaming)
|
||||
|
||||
```typescript
|
||||
// Start prompt asynchronously
|
||||
await client.session.promptAsync({
|
||||
path: { id: sessionId },
|
||||
body: {
|
||||
model: { providerID: "openai", modelID: "gpt-4o" },
|
||||
agent: "build",
|
||||
parts: [{ type: "text", text: "prompt text" }]
|
||||
}
|
||||
});
|
||||
|
||||
// Subscribe to events
|
||||
const eventStream = await client.event.subscribe({});
|
||||
|
||||
for await (const event of eventStream.stream) {
|
||||
// Process events
|
||||
}
|
||||
```
|
||||
|
||||
## Event Types
|
||||
|
||||
| Event Type | Description |
|------------|-------------|
| `message.part.updated` | Message part streamed/updated |
| `session.status` | Session status changed |
| `session.idle` | Session finished processing |
| `session.error` | Session error occurred |
| `question.asked` | AI asking user question |
| `permission.asked` | AI requesting permission |
|
||||
|
||||
### Event Structure
|
||||
|
||||
```typescript
|
||||
interface SDKEvent {
|
||||
type: string;
|
||||
properties: {
|
||||
part?: SDKPart & { sessionID?: string };
|
||||
delta?: string; // Text delta for streaming
|
||||
status?: { type?: string };
|
||||
sessionID?: string;
|
||||
error?: { data?: { message?: string } };
|
||||
id?: string;
|
||||
questions?: QuestionInfo[];
|
||||
permission?: string;
|
||||
patterns?: string[];
|
||||
metadata?: Record<string, unknown>;
|
||||
always?: string[];
|
||||
tool?: { messageID?: string; callID?: string };
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
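Because several sessions can share one server, an event consumer has to filter by session before accumulating text. The reducer below is a sketch under assumptions: it treats `delta` as incremental text for `text` parts, and it treats `session.idle` and `session.error` as the end of a turn. `applyOpenCodeEvent`, `SDKEventLike`, and `TurnState` are hypothetical names.

```typescript
// Subset of the SDKEvent fields above that this sketch reads.
interface SDKEventLike {
  type: string;
  properties: {
    part?: { type?: string; sessionID?: string };
    delta?: string;
    sessionID?: string;
    error?: { data?: { message?: string } };
  };
}

interface TurnState {
  text: string;
  done: boolean;
  error: string | null;
}

// Fold one event into the state of a single session's turn.
function applyOpenCodeEvent(state: TurnState, event: SDKEventLike, sessionId: string): TurnState {
  const eventSession = event.properties.sessionID ?? event.properties.part?.sessionID;
  if (eventSession !== sessionId) return state; // event belongs to another session

  switch (event.type) {
    case "message.part.updated":
      // Assumption: `delta` carries only the newly streamed text.
      if (event.properties.part?.type === "text" && event.properties.delta) {
        return { ...state, text: state.text + event.properties.delta };
      }
      return state;
    case "session.idle":
      return { ...state, done: true };
    case "session.error":
      return {
        ...state,
        done: true,
        error: event.properties.error?.data?.message ?? "unknown error",
      };
    default:
      return state;
  }
}
```

A caller iterating the subscribed event stream would apply this reducer per event and stop once `done` is true.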
## Message Parts
|
||||
|
||||
OpenCode has rich message part types:
|
||||
|
||||
| Type | Description |
|------|-------------|
| `text` | Plain text content |
| `reasoning` | Model reasoning (chain-of-thought) |
| `tool` | Tool invocation with status |
| `file` | File reference |
| `step-start` | Step boundary start |
| `step-finish` | Step boundary end with reason |
| `subtask` | Delegated subtask |
|
||||
|
||||
### Part Structure
|
||||
|
||||
```typescript
|
||||
interface MessagePart {
|
||||
type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other";
|
||||
id: string;
|
||||
content: string;
|
||||
// Tool-specific
|
||||
toolName?: string;
|
||||
toolStatus?: "pending" | "running" | "completed" | "error";
|
||||
toolInput?: Record<string, unknown>;
|
||||
toolOutput?: string;
|
||||
// File-specific
|
||||
filename?: string;
|
||||
mimeType?: string;
|
||||
// Step-specific
|
||||
stepReason?: string;
|
||||
// Subtask-specific
|
||||
subtaskAgent?: string;
|
||||
subtaskDescription?: string;
|
||||
}
|
||||
```
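
Because `type` acts as a discriminator for the optional fields, a consumer can switch on it to render each part. This is an illustrative sketch only: `describePart` and its output format are hypothetical, and the interface is a trimmed local copy of the one above.

```typescript
// Trimmed local copy of the MessagePart shape documented above.
interface MessagePart {
  type: "text" | "reasoning" | "tool" | "file" | "step-start" | "step-finish" | "subtask" | "other";
  id: string;
  content: string;
  toolName?: string;
  toolStatus?: "pending" | "running" | "completed" | "error";
  filename?: string;
  stepReason?: string;
  subtaskAgent?: string;
  subtaskDescription?: string;
}

// Hypothetical helper: one-line, human-readable summary of a part.
function describePart(part: MessagePart): string {
  switch (part.type) {
    case "text":
      return part.content;
    case "reasoning":
      return `[reasoning] ${part.content}`;
    case "tool":
      return `[tool ${part.toolName ?? "?"}: ${part.toolStatus ?? "pending"}]`;
    case "file":
      return `[file ${part.filename ?? part.content}]`;
    case "step-start":
      return "[step start]";
    case "step-finish":
      return `[step finish: ${part.stepReason ?? "unspecified"}]`;
    case "subtask":
      return `[subtask ${part.subtaskAgent ?? "?"}] ${part.subtaskDescription ?? ""}`.trim();
    default:
      // "other" parts carry nothing this sketch knows how to render.
      return "";
  }
}
```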
|
||||
|
||||
## Questions and Permissions
|
||||
|
||||
### Question Request
|
||||
|
||||
```typescript
|
||||
interface QuestionRequest {
|
||||
id: string;
|
||||
sessionID: string;
|
||||
questions: Array<{
|
||||
header?: string;
|
||||
question: string;
|
||||
options: Array<{ label: string; description?: string }>;
|
||||
multiSelect?: boolean;
|
||||
}>;
|
||||
tool?: { messageID: string; callID: string };
|
||||
}
|
||||
```
|
||||
|
||||
### Responding to Questions
|
||||
|
||||
```typescript
|
||||
// V2 client for question/permission APIs
|
||||
const clientV2 = createOpencodeClientV2({
|
||||
baseUrl: `http://127.0.0.1:${port}`
|
||||
});
|
||||
|
||||
// Reply with answers
|
||||
await clientV2.question.reply({
|
||||
requestID: requestId,
|
||||
answers: [["selected option"]] // One inner array of selected labels per question, in question order
|
||||
});
|
||||
|
||||
// Reject question
|
||||
await clientV2.question.reject({ requestID: requestId });
|
||||
```
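
Since `answers` is an array of label arrays, it can help to validate a user's picks against the request before replying. The sketch below assumes that answers must match the offered option labels and that only `multiSelect` questions accept more than one label; neither rule is confirmed by these notes, and `buildAnswers` is a hypothetical helper.

```typescript
// Local copy of the per-question shape from QuestionRequest above.
interface QuestionInfo {
  header?: string;
  question: string;
  options: Array<{ label: string; description?: string }>;
  multiSelect?: boolean;
}

// Hypothetical helper: checks picks and returns them in the `answers` shape.
function buildAnswers(questions: QuestionInfo[], picks: string[][]): string[][] {
  if (picks.length !== questions.length) {
    throw new Error(`expected ${questions.length} answers, got ${picks.length}`);
  }
  return questions.map((q, i) => {
    const chosen = picks[i];
    for (const label of chosen) {
      // Assumption: only labels offered in `options` are valid answers.
      if (!q.options.some((o) => o.label === label)) {
        throw new Error(`"${label}" is not an option for: ${q.question}`);
      }
    }
    // Assumption: single-select questions take at most one label.
    if (!q.multiSelect && chosen.length > 1) {
      throw new Error(`question is single-select: ${q.question}`);
    }
    return chosen.slice();
  });
}
```

The result could then be passed as `answers` to `clientV2.question.reply`.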
|
||||
|
||||
### Permission Request
|
||||
|
||||
```typescript
|
||||
interface PermissionRequest {
|
||||
id: string;
|
||||
sessionID: string;
|
||||
permission: string; // Permission type (e.g., "file:write")
|
||||
patterns: string[]; // Affected paths/patterns
|
||||
metadata: Record<string, unknown>;
|
||||
always: string[]; // Options for "always allow"
|
||||
tool?: { messageID: string; callID: string };
|
||||
}
|
||||
```
|
||||
|
||||
### Responding to Permissions
|
||||
|
||||
```typescript
|
||||
await clientV2.permission.reply({
|
||||
requestID: requestId,
|
||||
reply: "once" // one of "once" | "always" | "reject"
|
||||
});
|
||||
```
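
When no human is available, the reply value has to come from a policy. The sketch below is one possible policy under stated assumptions: `decidePermission`, the allow-list, and the workspace-root check are all hypothetical, and it treats `patterns` as plain paths (they may in fact be globs).

```typescript
type PermissionReply = "once" | "always" | "reject";

// Local copy of the PermissionRequest shape documented above.
interface PermissionRequest {
  id: string;
  sessionID: string;
  permission: string;
  patterns: string[];
  metadata: Record<string, unknown>;
  always: string[];
}

// Hypothetical policy: allow listed permission types, once, inside the workspace only.
function decidePermission(
  req: PermissionRequest,
  allowedPermissions: string[],
  workspaceRoot: string,
): PermissionReply {
  if (!allowedPermissions.some((p) => p === req.permission)) {
    return "reject";
  }
  // Normalize the root so "/work" does not match "/workspace2".
  const root = workspaceRoot.endsWith("/") ? workspaceRoot : workspaceRoot + "/";
  const insideWorkspace = req.patterns.every((p) => p.startsWith(root));
  // Never answer "always" automatically; that choice is left to a human.
  return insideWorkspace ? "once" : "reject";
}
```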
|
||||
|
||||
## Provider/Model Discovery
|
||||
|
||||
```typescript
|
||||
// Get available providers and models
|
||||
const providerResponse = await client.provider.list({});
|
||||
const agentResponse = await client.app.agents({});
|
||||
|
||||
interface ProviderInfo {
|
||||
id: string;
|
||||
name: string;
|
||||
models: Array<{
|
||||
id: string;
|
||||
name: string;
|
||||
reasoning: boolean;
|
||||
toolCall: boolean;
|
||||
}>;
|
||||
}
|
||||
|
||||
interface AgentInfo {
|
||||
id: string;
|
||||
name: string;
|
||||
primary: boolean; // "build" and "plan" are primary
|
||||
}
|
||||
```
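
The `reasoning` and `toolCall` flags make it possible to filter the provider list down to models an agent can actually use. A minimal sketch, assuming the list has already been unwrapped into `ProviderInfo[]`; `findToolCapableModels` is a hypothetical helper.

```typescript
// Local copy of the ProviderInfo shape documented above.
interface ProviderInfo {
  id: string;
  name: string;
  models: Array<{ id: string; name: string; reasoning: boolean; toolCall: boolean }>;
}

interface ModelRef {
  providerID: string;
  modelID: string;
  reasoning: boolean;
}

// Hypothetical helper: flatten providers into the models that support tool calls.
function findToolCapableModels(providers: ProviderInfo[]): ModelRef[] {
  const result: ModelRef[] = [];
  for (const provider of providers) {
    for (const model of provider.models) {
      if (model.toolCall) {
        result.push({ providerID: provider.id, modelID: model.id, reasoning: model.reasoning });
      }
    }
  }
  return result;
}
```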
|
||||
|
||||
### Internal Agents (Hidden from UI)
|
||||
|
||||
- `compaction`
|
||||
- `title`
|
||||
- `summary`
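
If the agent list returned by the server includes these internal agents (an assumption; the notes do not say whether they are filtered server-side), a UI could hide them and show primary agents first. `selectableAgents` is a hypothetical helper.

```typescript
// Local copy of the AgentInfo shape documented above.
interface AgentInfo {
  id: string;
  name: string;
  primary: boolean;
}

// Internal agent IDs as listed in these notes.
const INTERNAL_AGENT_IDS = ["compaction", "title", "summary"];

// Hypothetical helper: drop internal agents and list primary agents first.
function selectableAgents(agents: AgentInfo[]): AgentInfo[] {
  return agents
    .filter((a) => INTERNAL_AGENT_IDS.indexOf(a.id) === -1) // filter() copies, so the input is not mutated
    .sort((a, b) => Number(b.primary) - Number(a.primary));
}
```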
|
||||
|
||||
## Token Usage
|
||||
|
||||
```typescript
|
||||
interface TokenUsage {
|
||||
input: number;
|
||||
output: number;
|
||||
reasoning?: number;
|
||||
cache?: {
|
||||
read: number;
|
||||
write: number;
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
Token usage is available in the `info` field of assistant messages.
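
To report a session total, the per-message usage objects can be summed, treating the optional fields as zero. A minimal sketch; `totalUsage` is a hypothetical helper, and how the usage objects are pulled out of each message is left to the caller.

```typescript
// Local copy of the TokenUsage shape documented above.
interface TokenUsage {
  input: number;
  output: number;
  reasoning?: number;
  cache?: { read: number; write: number };
}

// Hypothetical helper: sum usage across messages; missing optional fields count as 0.
function totalUsage(usages: TokenUsage[]): Required<TokenUsage> {
  const total: Required<TokenUsage> = {
    input: 0,
    output: 0,
    reasoning: 0,
    cache: { read: 0, write: 0 },
  };
  for (const u of usages) {
    total.input += u.input;
    total.output += u.output;
    total.reasoning += u.reasoning ?? 0;
    total.cache.read += u.cache?.read ?? 0;
    total.cache.write += u.cache?.write ?? 0;
  }
  return total;
}
```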
|
||||
|
||||
## Agent Modes vs Permission Modes
|
||||
|
||||
OpenCode treats agent mode and permission mode as separate concepts:
|
||||
|
||||
### Agent Modes
|
||||
|
||||
Agents are first-class concepts with their own system prompts and behavior:
|
||||
|
||||
| Agent ID | Description |
|
||||
|----------|-------------|
|
||||
| `build` | Default execution agent |
|
||||
| `plan` | Planning/analysis agent |
|
||||
| Custom | User-defined agents in config |
|
||||
|
||||
```typescript
|
||||
// Sending a prompt with specific agent
|
||||
await client.session.promptAsync({
|
||||
body: {
|
||||
agent: "plan", // or "build", or custom agent ID
|
||||
parts: [{ type: "text", text: "..." }]
|
||||
}
|
||||
});
|
||||
```
|
||||
|
||||
### Listing Available Agents
|
||||
|
||||
```typescript
|
||||
const agents = await client.app.agents({});
|
||||
// Returns: [{ id: "build", name: "Build", primary: true }, ...]
|
||||
```
|
||||
|
||||
### Permission Modes
|
||||
|
||||
Permissions are configured via rulesets on the session, separate from agent selection:
|
||||
|
||||
```typescript
|
||||
interface PermissionRuleset {
|
||||
// Tool-specific permission rules
|
||||
}
|
||||
```
|
||||
|
||||
### Human-in-the-Loop
|
||||
|
||||
OpenCode supports interactive human-in-the-loop (HITL) flows. Each request arrives as an SSE event and is answered through the matching endpoint:
|
||||
|
||||
| Event | Endpoint |
|
||||
|-------|----------|
|
||||
| `question.asked` | `POST /question/{id}/reply` |
|
||||
| `permission.asked` | `POST /permission/{id}/reply` |
|
||||
|
||||
See `research/human-in-the-loop.md` for full API details.
|
||||
|
||||
## Defaults
|
||||
|
||||
```typescript
|
||||
const DEFAULT_OPENCODE_MODEL_ID = "gpt-4o";
|
||||
const DEFAULT_OPENCODE_PROVIDER_ID = "openai";
|
||||
```
|
||||
|
||||
## Concurrency Control
|
||||
|
||||
Server startup is serialized with a promise-based lock so that concurrent callers cannot start the server at the same time:
|
||||
|
||||
```typescript
|
||||
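// Module-level lock (declaration assumed; not shown in the original notes).
let startLock: Promise<void> = Promise.resolve();

async function withStartLock<T>(fn: () => Promise<T>): Promise<T> {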
|
||||
const prior = startLock;
|
||||
let release!: () => void; // assigned synchronously by the Promise executor below
|
||||
startLock = new Promise<void>((resolve) => { release = resolve; });
|
||||
await prior;
|
||||
try {
|
||||
return await fn();
|
||||
} finally {
|
||||
release();
|
||||
}
|
||||
}
|
||||
```
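
The effect of the lock is easiest to see with two concurrent callers. The sketch below repeats the helper so that it is self-contained, with a module-level `startLock` initialised to a resolved promise (an assumption about how the original declares it), and uses a hypothetical `fakeStart` in place of real server startup.

```typescript
// Assumed declaration: the lock starts out released.
let startLock: Promise<void> = Promise.resolve();

async function withStartLock<T>(fn: () => Promise<T>): Promise<T> {
  const prior = startLock;
  let release!: () => void;
  // Queue behind the previous caller, and let the next caller queue behind us.
  startLock = new Promise<void>((resolve) => { release = resolve; });
  await prior;
  try {
    return await fn();
  } finally {
    release();
  }
}

// Hypothetical stand-in for server startup that records when it runs.
const order: string[] = [];
async function fakeStart(name: string): Promise<string> {
  order.push(`${name}:begin`);
  await Promise.resolve(); // yield, giving another caller a chance to interleave
  order.push(`${name}:end`);
  return name;
}

// Both calls are issued at once, but the second waits for the first to finish.
const bothStarted = Promise.all([
  withStartLock(() => fakeStart("a")),
  withStartLock(() => fakeStart("b")),
]);
```

Without the lock, `order` would interleave the two `begin` entries before either `end`; with it, each startup runs to completion before the next begins.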
|
||||
|
||||
## Working Directory
|
||||
|
||||
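The server must be started in the correct working directory. Because `process.chdir` changes the directory for the whole Node.js process, the helper below restores the previous directory once `fn` settles: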
|
||||
|
||||
```typescript
|
||||
async function withWorkingDir<T>(workingDir: string, fn: () => Promise<T>): Promise<T> {
|
||||
const previous = process.cwd();
|
||||
process.chdir(workingDir);
|
||||
try {
|
||||
return await fn();
|
||||
} finally {
|
||||
process.chdir(previous);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Polling Fallback
|
||||
|
||||
A polling mechanism checks session status every 2 seconds in case SSE events don't arrive:
|
||||
|
||||
```typescript
|
||||
const pollInterval = setInterval(async () => {
|
||||
const session = await client.session.get({ path: { id: sessionId } });
|
||||
if (session.data?.status?.type === "idle") {
|
||||
clearInterval(pollInterval); // stop polling once the session is idle
abortController.abort();
|
||||
}
|
||||
}, 2000);
|
||||
```
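
A polling fallback of this kind needs to stop its own timer and avoid overlapping requests when a status call is slow. The following is a minimal sketch of that pattern, not the original implementation: `pollUntilIdle` is a hypothetical helper, `getStatus` stands in for the `client.session.get` call, and `signal` is any object with an `aborted` flag (an `AbortSignal` fits).

```typescript
// Hypothetical helper: resolves when the status becomes "idle" or the signal aborts.
function pollUntilIdle(
  getStatus: () => Promise<string | undefined>,
  intervalMs: number,
  signal: { aborted: boolean },
): Promise<"idle" | "aborted"> {
  return new Promise((resolve) => {
    let inFlight = false;
    const timer = setInterval(async () => {
      if (signal.aborted) {
        // Something else (for example the SSE stream) finished first.
        clearInterval(timer);
        resolve("aborted");
        return;
      }
      if (inFlight) return; // previous status request has not returned yet
      inFlight = true;
      try {
        if ((await getStatus()) === "idle") {
          clearInterval(timer);
          resolve("idle");
        }
      } catch {
        // Treat failures as transient and try again on the next tick.
      } finally {
        inFlight = false;
      }
    }, intervalMs);
  });
}
```

With the values from these notes, this would be called with an `intervalMs` of 2000 and a `getStatus` that returns `session.data?.status?.type`.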
|
||||
|
||||
## Notes
|
||||
|
||||
- OpenCode is the most feature-rich of the runtimes covered in these notes (streaming, questions, permissions)
|
||||
- Server persists for the lifetime of a change (workspace+repo+change)
|
||||
- Parts are streamed incrementally with delta updates
|
||||
- V2 client is needed for question/permission APIs
|
||||
- Working directory affects credential discovery and file operations
|
||||