mirror of https://github.com/harivansh-afk/sandbox-agent.git synced 2026-04-15 17:01:02 +00:00

Nathan Flurry 4b5b390b7f feat: migrate codex app server

2026-01-26 21:50:37 -08:00

7 KiB


Codex Research

Research notes on OpenAI Codex's configuration, credential discovery, and runtime behavior, based on the agent-jj implementation.

Overview

Provider: OpenAI
Execution Method (this repo): Codex App Server (JSON-RPC over stdio)
Execution Method (alternatives): SDK (@openai/codex-sdk) or CLI binary
Session Persistence: Thread ID (string)
Import: Dynamic import to avoid bundling issues
Binary Location: ~/.nvm/versions/node/v24.3.0/bin/codex (npm global install; example path from an nvm-managed Node install, so it varies by machine)

SDK Architecture

The SDK wraps a bundled binary - it does NOT make direct API calls.

The TypeScript SDK includes a pre-compiled Codex binary
When you use the SDK, it spawns this binary as a child process
Communication happens via stdin/stdout using JSONL (JSON Lines) format
The binary itself handles the actual communication with OpenAI's backend services

Sources: Codex SDK docs, GitHub
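
Because the binary's stdout arrives in arbitrary chunks rather than whole lines, a consumer has to buffer until a newline closes each record. The following is a minimal sketch of that buffering, not the SDK's actual implementation; `JsonlDecoder` is a hypothetical helper name introduced here for illustration.

```typescript
// Hypothetical helper (not part of @openai/codex-sdk): turns raw stdout
// chunks into parsed JSONL records, one JSON value per line.
class JsonlDecoder {
  private buffer = "";

  // Feed one chunk of text; returns every record completed by this chunk.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on "\n").
    this.buffer = lines.pop() ?? "";

    const records: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length === 0) continue; // skip blank lines
      records.push(JSON.parse(trimmed));
    }
    return records;
  }
}
```

A record split across two chunks is only emitted once its closing newline arrives, so callers can pass each stdout `data` chunk straight to `push`.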

CLI Usage (Alternative to App Server / SDK)

You can use the codex binary directly instead of the SDK:

Interactive Mode

codex "your prompt here"
codex --model o3 "your prompt"

Non-Interactive Mode (`codex exec`)

codex exec "your prompt here"
codex exec --json "your prompt"  # JSONL output
codex exec -m o3 "your prompt"
codex exec --dangerously-bypass-approvals-and-sandbox "prompt"
codex exec resume --last  # Resume previous session

Key CLI Flags

Flag	Description
`--json`	Print events to stdout as JSONL
`-m, --model MODEL`	Model to use
`-s, --sandbox MODE`	`read-only`, `workspace-write`, `danger-full-access`
`--full-auto`	Auto-approve with workspace-write sandbox
`--dangerously-bypass-approvals-and-sandbox`	Skip all prompts (dangerous)
`-C, --cd DIR`	Working directory
`-o, --output-last-message FILE`	Write final response to file
`--output-schema FILE`	JSON Schema for structured output
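
When the CLI is driven from code, the flags above translate into an argument array for a child process. Below is a small sketch that assembles such an array using only flags listed in the table; `ExecOptions` and `buildExecArgs` are illustrative names, not part of any Codex package.

```typescript
// Illustrative option bag; field names are assumptions made for this sketch.
interface ExecOptions {
  prompt: string;
  json?: boolean;      // --json
  model?: string;      // -m MODEL
  cwd?: string;        // -C DIR
  outputFile?: string; // -o FILE
}

// Builds the argument list for `codex exec`, with the prompt placed last.
function buildExecArgs(opts: ExecOptions): string[] {
  const args = ["exec"];
  if (opts.json) args.push("--json");
  if (opts.model) args.push("-m", opts.model);
  if (opts.cwd) args.push("-C", opts.cwd);
  if (opts.outputFile) args.push("-o", opts.outputFile);
  args.push(opts.prompt);
  return args;
}
```

For example, `buildExecArgs({ prompt: "fix the tests", json: true, model: "o3" })` returns `["exec", "--json", "-m", "o3", "fix the tests"]`, which can be passed as the argument array when spawning `codex`.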

Session Management

codex resume          # Pick from previous sessions
codex resume --last   # Resume most recent
codex fork --last     # Fork most recent session

Credential Discovery

Priority Order

User-configured credentials (from credentials array)
Environment variable: CODEX_API_KEY
Environment variable: OPENAI_API_KEY
Bootstrap extraction from config files

Config File Location

Path	Description
`~/.codex/auth.json`	Primary auth config

Auth File Structure

// API Key authentication
{
  "OPENAI_API_KEY": "sk-..."
}

// OAuth authentication
{
  "tokens": {
    "access_token": "..."
  }
}
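
The priority order and the two auth file shapes can be combined into a single lookup. This is a sketch under assumptions, not the repository's code: the configured credentials are modeled as plain strings, the auth file is passed in as already-read text, and preferring `OPENAI_API_KEY` over `tokens.access_token` inside the file is a choice made here because the notes do not specify it.

```typescript
interface CredentialSources {
  configured?: string[];                     // assumed shape of the credentials array
  env?: Record<string, string | undefined>;  // e.g. process.env
  authFileText?: string;                     // raw contents of ~/.codex/auth.json
}

interface ResolvedCredential {
  kind: "apiKey" | "oauth";
  value: string;
  source: string;
}

function resolveCredential(src: CredentialSources): ResolvedCredential | null {
  // 1. User-configured credentials win.
  const configured = src.configured?.find((c) => c.length > 0);
  if (configured) return { kind: "apiKey", value: configured, source: "configured" };

  // 2-3. Environment variables, CODEX_API_KEY before OPENAI_API_KEY.
  for (const name of ["CODEX_API_KEY", "OPENAI_API_KEY"]) {
    const value = src.env?.[name];
    if (value) return { kind: "apiKey", value, source: name };
  }

  // 4. Bootstrap extraction from the auth file.
  if (!src.authFileText) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(src.authFileText);
  } catch {
    return null; // unreadable file: treat as "no credential"
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const auth = parsed as { OPENAI_API_KEY?: unknown; tokens?: { access_token?: unknown } | null };
  if (typeof auth.OPENAI_API_KEY === "string" && auth.OPENAI_API_KEY) {
    return { kind: "apiKey", value: auth.OPENAI_API_KEY, source: "auth.json" };
  }
  const token = auth.tokens?.access_token;
  if (typeof token === "string" && token) {
    return { kind: "oauth", value: token, source: "auth.json" };
  }
  return null;
}
```

Returning the `source` alongside the value makes it easier to log which tier supplied the credential without logging the secret itself.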

SDK Usage

Client Initialization

import { Codex } from "@openai/codex-sdk";

// With an explicit API key
const codex = new Codex({ apiKey: "sk-..." });

// Without an API key (falls back to default auth)
const codexWithDefaultAuth = new Codex();

Dynamic import is used to avoid bundling the SDK:

const { Codex } = await import("@openai/codex-sdk");

Thread Management

// Start a new thread
const thread = codex.startThread();

// Resume an existing thread by its stored ID
const resumedThread = codex.resumeThread(threadId);

Running Prompts

const { events } = await thread.runStreamed(prompt);

for await (const event of events) {
  // Process events
}

App Server Protocol (JSON-RPC)

Codex App Server speaks JSON-RPC 2.0 over stdin/stdout, one JSON message per line (JSONL), so no network port is required.

Key Requests

initialize → returns server info
thread/start → starts a new thread
turn/start → sends user input for a thread
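
Each of these requests needs a unique `id` so that the later response can be matched to it. The sketch below frames requests as newline-terminated JSON and tracks pending ids. `RequestFramer` is a hypothetical name, the parameter shapes for each method are not documented in these notes and are left opaque, and the message layout mirrors the examples in these notes, which carry no `jsonrpc` version field (add one if the server expects it).

```typescript
interface RpcRequest {
  id: number;
  method: string;
  params?: unknown;
}

// Hypothetical helper: assigns ids and remembers which method each id belongs to.
class RequestFramer {
  private nextId = 1;
  private pending = new Map<number, string>();

  // Returns one newline-terminated JSON line ready to write to the server's stdin.
  frame(method: string, params?: unknown): string {
    const request: RpcRequest = { id: this.nextId++, method, params };
    this.pending.set(request.id, method);
    return JSON.stringify(request) + "\n";
  }

  // Matches a response to its request by id and returns the method it answers,
  // or undefined if the id is unknown or already settled.
  settle(response: { id: number }): string | undefined {
    const method = this.pending.get(response.id);
    this.pending.delete(response.id);
    return method;
  }
}
```

A typical sequence would call `frame("initialize", ...)`, then `frame("thread/start", ...)`, then `frame("turn/start", ...)`, writing each returned line to the child process's stdin.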

Event Notifications (examples)

{ "method": "thread/started", "params": { "thread": { "id": "thread_abc123" } } }
{ "method": "item/completed", "params": { "item": { "type": "agentMessage", "text": "..." } } }
{ "method": "turn/completed", "params": { "threadId": "thread_abc123", "turn": { "items": [] } } }
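
One way to consume these notifications is to fold them into a small state object keyed on `method`. The sketch below handles only the three example shapes shown above; `TurnState` and `applyNotification` are illustrative names, and any other notification is ignored rather than treated as an error.

```typescript
interface ServerNotification {
  method: string;
  params?: unknown;
}

interface TurnState {
  threadId: string | null;
  messages: string[];
  completed: boolean;
}

// Only the fields used by the three example notifications are modeled here.
type KnownParams = {
  thread?: { id?: string };
  item?: { type?: string; text?: string };
};

// Returns a new state reflecting one notification; never mutates its input.
function applyNotification(state: TurnState, note: ServerNotification): TurnState {
  const params = (note.params ?? {}) as KnownParams;
  switch (note.method) {
    case "thread/started":
      return { ...state, threadId: params.thread?.id ?? state.threadId };
    case "item/completed": {
      const item = params.item;
      if (item?.type === "agentMessage" && typeof item.text === "string") {
        return { ...state, messages: [...state.messages, item.text] };
      }
      return state;
    }
    case "turn/completed":
      return { ...state, completed: true };
    default:
      return state; // unknown notifications are ignored
  }
}
```

Starting from `{ threadId: null, messages: [], completed: false }` and applying each parsed line in order yields the thread ID, the collected agent messages, and a flag indicating that the turn has finished.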

Approval Requests (server → client)

The server can send JSON-RPC requests (with id) for approvals:

item/commandExecution/requestApproval
item/fileChange/requestApproval

These require JSON-RPC responses with a decision payload.
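
Since approvals arrive on the same stream as notifications and responses, the client first has to tell the three apart: a message with both `method` and `id` is a request, one with `method` only is a notification, and anything else is a response. The sketch below shows that classification and a reply builder. The decision payload is passed through untouched because its fields are not specified in these notes; all function names are illustrative.

```typescript
interface IncomingMessage {
  id?: number | string;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: unknown;
}

type MessageKind = "request" | "notification" | "response";

function classify(msg: IncomingMessage): MessageKind {
  if (msg.method !== undefined) {
    return msg.id !== undefined ? "request" : "notification";
  }
  return "response";
}

const APPROVAL_METHODS = new Set([
  "item/commandExecution/requestApproval",
  "item/fileChange/requestApproval",
]);

function isApprovalRequest(msg: IncomingMessage): boolean {
  return (
    classify(msg) === "request" &&
    msg.method !== undefined &&
    APPROVAL_METHODS.has(msg.method)
  );
}

// Builds the newline-terminated response line, echoing the request's id.
// `decision` is opaque here: its schema is an assumption left to the caller.
function buildApprovalResponse(msg: IncomingMessage, decision: unknown): string {
  if (!isApprovalRequest(msg)) {
    throw new Error("message is not an approval request");
  }
  return JSON.stringify({ id: msg.id, result: decision }) + "\n";
}
```

Echoing the original `id` is what lets the server pair the decision with the command or file change that is waiting on it.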

Response Schema

// CodexRunResultSchema
type CodexRunResult = string | {
  result?: string;
  output?: string;
  message?: string;
  // ...additional fields via passthrough
};

Content is extracted in priority order: result > output > message
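
The priority rule can be written as a short lookup over the three fields. This is a sketch rather than the repository's implementation: `extractContent` is an illustrative name, and treating an empty string as a valid value (instead of falling through to the next field) is an assumption.

```typescript
type CodexRunResult =
  | string
  | { result?: string; output?: string; message?: string; [key: string]: unknown };

// Returns the first string found in priority order, or null if none is present.
function extractContent(run: CodexRunResult): string | null {
  if (typeof run === "string") return run;
  for (const key of ["result", "output", "message"] as const) {
    const value = run[key];
    if (typeof value === "string") return value;
  }
  return null;
}
```

For example, `extractContent({ output: "o", message: "m" })` returns `"o"` because `result` is absent and `output` outranks `message`.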

Thread ID Retrieval

Thread ID can be obtained from multiple sources:

thread.started event's thread_id property
Thread object's id getter (after first turn)
Thread object's threadId or _id properties (fallbacks)

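function getThreadId(thread: unknown): string | null {
  // Guard against null or non-object input before reading properties
  if (typeof thread !== "object" || thread === null) return null;
  const value = thread as { id?: string; threadId?: string; _id?: string };
  return value.id ?? value.threadId ?? value._id ?? null;
}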

Agent Modes vs Permission Modes

Codex separates sandbox levels (permissions) from behavioral modes (prompt prefixes).

Permission Modes (Sandbox Levels)

Mode	CLI Flag	Behavior
`read-only`	`-s read-only`	No file modifications
`workspace-write`	`-s workspace-write`	Can modify workspace files
`danger-full-access`	`-s danger-full-access`	Full system access
`bypass`	`--dangerously-bypass-approvals-and-sandbox`	Skip all checks
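
The table implies that `bypass` is not a value of `-s` but a separate flag. A small mapping makes that distinction explicit; `PermissionMode` and `permissionArgs` are illustrative names chosen for this sketch.

```typescript
type PermissionMode =
  | "read-only"
  | "workspace-write"
  | "danger-full-access"
  | "bypass";

// Maps a permission mode to the CLI arguments from the table above.
function permissionArgs(mode: PermissionMode): string[] {
  if (mode === "bypass") {
    // Not a sandbox level: a standalone flag that skips approvals and sandboxing.
    return ["--dangerously-bypass-approvals-and-sandbox"];
  }
  return ["-s", mode];
}
```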

Agent Modes (Prompt Prefixes)

Codex doesn't have true agent modes - behavior is controlled via prompt prefixing:

Mode	Prompt Prefix
`build`	No prefix (default)
`plan`	`"Make a plan before acting.\n\n"`
`chat`	`"Answer conversationally.\n\n"`

type AgentMode = "build" | "plan" | "chat";

function withModePrefix(prompt: string, mode: AgentMode): string {
  if (mode === "plan") {
    return `Make a plan before acting.\n\n${prompt}`;
  }
  if (mode === "chat") {
    return `Answer conversationally.\n\n${prompt}`;
  }
  return prompt;
}

Human-in-the-Loop

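Codex has no interactive human-in-the-loop (HITL) flow in SDK mode, so all permissions must be configured upfront via the sandbox level. The App Server protocol differs: it can send approval requests to the client (see Approval Requests above).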

Error Handling

turn.failed events are captured but don't throw
Thread ID is still returned on error for potential resumption
Events iterator may throw after errors - caught and logged

interface CodexPromptResult {
  result: unknown;
  threadId?: string | null;
  error?: string;  // Set if turn failed
}
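
The three behaviors above can be sketched as a loop over any async event stream. This is not the repository's code: `collectRun` is a hypothetical name, the event fields (`thread_id` on `thread.started`, `error.message` on `turn.failed`) are assumptions about the event shape, and `result` simply holds the events seen because the notes do not say what it contains.

```typescript
interface CodexPromptResult {
  result: unknown;
  threadId?: string | null;
  error?: string;
}

// Assumed minimal event shape; real events may carry more fields.
interface CodexEvent {
  type: string;
  thread_id?: string;
  error?: { message?: string };
}

async function collectRun(
  events: AsyncIterable<CodexEvent>,
  initialThreadId: string | null = null,
): Promise<CodexPromptResult> {
  const seen: CodexEvent[] = [];
  let threadId = initialThreadId;
  let error: string | undefined;

  try {
    for await (const event of events) {
      seen.push(event);
      if (event.type === "thread.started" && event.thread_id) {
        threadId = event.thread_id;
      }
      if (event.type === "turn.failed") {
        // Record the failure but keep iterating instead of throwing.
        error = event.error?.message ?? "turn failed";
      }
    }
  } catch (err) {
    // The iterator itself may throw after a failure: log it and keep
    // the first error that was recorded.
    const message = err instanceof Error ? err.message : String(err);
    console.error("codex event stream threw:", message);
    if (error === undefined) error = message;
  }

  // threadId is returned even on failure so the caller can resume later.
  return { result: seen, threadId, error };
}
```

Because the function never rejects, callers check `error` on the returned object rather than wrapping the call in their own `try`/`catch`.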

Conversion to Universal Format

Codex output is converted via convertCodexOutput():

Parse with CodexRunResultSchema
If result is string, use directly
Otherwise extract from result, output, or message fields
Wrap as assistant message entry

Session Continuity

Thread ID persists across prompts
Use resumeThread(threadId) to continue conversation
Thread ID is captured from thread.started event or thread object
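
In practice this means keeping a map from the caller's own session key to the Codex thread ID, and choosing between `startThread()` and `resumeThread(threadId)` on each prompt. The sketch below assumes a minimal `CodexLike` interface covering only the two methods named in these notes; `ThreadRegistry` and the session key are hypothetical.

```typescript
// Minimal structural types so the sketch does not depend on the SDK package.
interface ThreadLike {
  id: string | null;
}

interface CodexLike {
  startThread(): ThreadLike;
  resumeThread(threadId: string): ThreadLike;
}

class ThreadRegistry {
  private ids = new Map<string, string>();
  private codex: CodexLike;

  constructor(codex: CodexLike) {
    this.codex = codex;
  }

  // Resumes the stored thread for this session, or starts a new one.
  open(sessionKey: string): ThreadLike {
    const known = this.ids.get(sessionKey);
    return known ? this.codex.resumeThread(known) : this.codex.startThread();
  }

  // Call once the thread ID is known (it may be unavailable before the first event).
  remember(sessionKey: string, threadId: string | null | undefined): void {
    if (threadId) this.ids.set(sessionKey, threadId);
  }
}
```

Keeping `remember` separate from `open` reflects the note that the thread ID may only become available after the first event arrives.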

Notes

SDK is dynamically imported to reduce bundle size
No explicit timeout (relies on SDK defaults)
Thread ID may not be available until first event
Error messages are preserved for debugging
Working directory is not explicitly set (SDK handles internally)
