Committing manually like the monkey I am

2026-04-16 17:01:02 +00:00 · 2025-10-12 02:59:46 +02:00 · 2025-10-12 02:59:46 +02:00 · f93e72a805
commit f93e72a805
parent 26b774bb04
10 changed files with 1819 additions and 11 deletions
--- a/docs/agent.md
+++ b/docs/agent.md
@@ -0,0 +1,988 @@
+# Coding Agent Architecture
+
+## Executive Summary
+
+This document proposes extracting the agent infrastructure from `@mariozechner/pi-web-ui` and `@mariozechner/pi-agent` into a new headless coding agent package that can be reused across multiple UI implementations (TUI, VS Code extension, web interface).
+
+The new architecture will provide:
+- **Headless agent core** with file manipulation tools (read, bash, edit, write)
+- **Session management** for conversation persistence and resume capability
+- **Full abort support** throughout the execution pipeline
+- **Event-driven API** for flexible UI integration
+- **Clean separation** between agent logic and presentation layer
+
+## Current Architecture Analysis
+
+### Package Overview
+
+```
+pi-mono/
+├── packages/ai/              # Core AI streaming (GOOD - keep as-is)
+├── packages/web-ui/          # Web UI with agent (GOOD - keep separate)
+├── packages/agent/           # OLD - needs to be replaced
+├── packages/tui/             # Terminal UI lib (GOOD - low-level primitives)
+├── packages/proxy/           # CORS proxy (unrelated)
+└── packages/pods/            # GPU deployment tool (unrelated)
+```
+
+### packages/ai - Core Streaming Library
+
+**Status:** ✅ Solid foundation, keep as-is
+
+**Architecture:**
+```typescript
+agentLoop(
+  prompt: UserMessage,
+  context: AgentContext,
+  config: AgentLoopConfig,
+  signal?: AbortSignal
+): EventStream<AgentEvent>
+```
+
+**Key Features:**
+- Event-driven streaming (turn_start, message_*, tool_execution_*, turn_end, agent_end)
+- Tool execution with validation
+- Signal-based cancellation
+- Message queue for injecting out-of-band messages
+- Preprocessor support for message transformation
+
+**Events:**
+```typescript
+type AgentEvent =
+  | { type: "agent_start" }
+  | { type: "turn_start" }
+  | { type: "message_start"; message: Message }
+  | { type: "message_update"; assistantMessageEvent: AssistantMessageEvent; message: AssistantMessage }
+  | { type: "message_end"; message: Message }
+  | { type: "tool_execution_start"; toolCallId: string; toolName: string; args: any }
+  | { type: "tool_execution_end"; toolCallId: string; toolName: string; result: AgentToolResult<any> | string; isError: boolean }
+  | { type: "turn_end"; message: AssistantMessage; toolResults: ToolResultMessage[] }
+  | { type: "agent_end"; messages: Message[] }
+```
+
+**Tool Interface:**
+```typescript
+interface AgentTool<TParameters extends TSchema = TSchema, TDetails = any> extends Tool<TParameters> {
+  label: string;  // Human-readable name for UI
+  execute: (
+    toolCallId: string,
+    params: Static<TParameters>,
+    signal?: AbortSignal
+  ) => Promise<AgentToolResult<TDetails>>;
+}
+
+interface AgentToolResult<T> {
+  output: string;   // Text sent to LLM
+  details: T;       // Structured data for UI rendering
+}
+```
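+
+To make the tool contract concrete, here is a minimal sketch of a tool that follows the same `execute(toolCallId, params, signal)` shape. The `SimpleTool` interface and the `word_count` tool are hypothetical stand-ins so the snippet runs on its own; the real interface extends pi-ai's `Tool` and uses a TypeBox schema for `parameters`.
+
+```typescript
+// Local stand-ins for the pi-ai types, simplified for illustration.
+interface AgentToolResult<T> {
+  output: string; // Text sent to LLM
+  details: T;     // Structured data for UI rendering
+}
+
+interface SimpleTool<TParams, TDetails> {
+  name: string;
+  label: string;
+  execute(toolCallId: string, params: TParams, signal?: AbortSignal): Promise<AgentToolResult<TDetails>>;
+}
+
+// Hypothetical tool: counts whitespace-separated words in a string.
+const wordCountTool: SimpleTool<{ text: string }, { words: number }> = {
+  name: "word_count",
+  label: "Count Words",
+  async execute(_toolCallId, params, signal) {
+    // Honor cancellation before doing any work.
+    if (signal?.aborted) throw new Error("aborted");
+    const words = params.text.split(/\s+/).filter(Boolean).length;
+    return { output: `${words} words`, details: { words } };
+  },
+};
+```
+
+The split between `output` and `details` is the point of the design: the LLM only ever sees the string, while a UI can render the structured part however it likes.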
+
+### packages/web-ui/agent - Web Agent
+
+**Status:** ✅ Good for web use cases, keep separate
+
+**Architecture:**
+```typescript
+class Agent {
+  constructor(opts: {
+    initialState?: Partial<AgentState>;
+    debugListener?: (entry: DebugLogEntry) => void;
+    transport: AgentTransport;
+    messageTransformer?: (messages: AppMessage[]) => Message[];
+  })
+
+  async prompt(input: string, attachments?: Attachment[]): Promise<void>
+  abort(): void
+  subscribe(fn: (e: AgentEvent) => void): () => void
+}
+```
+
+**Key Features:**
+- **Transport abstraction** (ProviderTransport for direct API, AppTransport for server-side)
+- **Attachment handling** (images, documents with text extraction)
+- **Message transformation** (app messages → LLM messages)
+- **Reactive state** (subscribe pattern for UI updates)
+- **Message queue** for injecting tool results/errors asynchronously
+
+**Why it differs from the coding agent:**
+- Browser-specific concerns (CORS, attachments)
+- Transport layer for flexible API routing
+- Tied to web UI state management
+- Supports rich media attachments
+
+### packages/agent - OLD Implementation
+
+**Status:** ⚠️ MUST BE REPLACED
+
+**Architecture:**
+```typescript
+class Agent {
+  constructor(
+    config: AgentConfig,
+    renderer?: AgentEventReceiver,
+    sessionManager?: SessionManager
+  )
+
+  async ask(userMessage: string): Promise<void>
+  interrupt(): void
+  setEvents(events: AgentEvent[]): void
+}
+```
+
+**Problems:**
+1. **Tightly coupled to OpenAI SDK** (not provider-agnostic)
+2. **Hardcoded tools** (read, list, bash, glob, rg)
+3. **Mixed concerns** (agent logic + tool implementations in same package)
+4. **No separation** between core loop and UI rendering
+5. **Two API paths** (completions vs responses) with branching logic
+
+**Good parts to preserve:**
+1. **SessionManager** - JSONL-based session persistence
+2. **Event receiver pattern** - Clean UI integration
+3. **Abort support** - Proper signal handling
+4. **Renderer abstraction** (ConsoleRenderer, TuiRenderer, JsonRenderer)
+
+**Tools implemented:**
+- `read`: Read file contents (1MB limit with truncation)
+- `list`: List directory contents
+- `bash`: Execute shell command with abort support
+- `glob`: Find files matching glob pattern
+- `rg`: Run ripgrep search
+
+## Proposed Architecture
+
+### Package Structure
+
+```
+pi-mono/
+├── packages/ai/                          # [unchanged] Core streaming
+├── packages/coding-agent/                # [NEW] Headless coding agent
+│   ├── src/
+│   │   ├── agent.ts                      # Main agent class
+│   │   ├── session-manager.ts            # Session persistence
+│   │   ├── tools/
+│   │   │   ├── read-tool.ts              # Read files (with pagination)
+│   │   │   ├── bash-tool.ts              # Shell execution
+│   │   │   ├── edit-tool.ts              # File editing (old_string → new_string)
+│   │   │   ├── write-tool.ts             # File creation/replacement
+│   │   │   └── index.ts                  # Tool exports
+│   │   └── types.ts                      # Public types
+│   └── package.json
+│
+├── packages/coding-agent-tui/            # [NEW] Terminal interface
+│   ├── src/
+│   │   ├── cli.ts                        # CLI entry point
+│   │   ├── renderers/
+│   │   │   ├── tui-renderer.ts           # Rich terminal UI
+│   │   │   ├── console-renderer.ts       # Simple console output
+│   │   │   └── json-renderer.ts          # JSONL output for piping
+│   │   └── main.ts                       # App logic
+│   └── package.json
+│
+├── packages/web-ui/                      # [unchanged] Web UI keeps its own agent
+└── packages/tui/                         # [unchanged] Low-level terminal primitives
+```
+
+### Dependency Graph
+
+```
+┌─────────────────────┐
+│   @mariozechner/    │
+│      pi-ai          │  ← Core streaming, tool interface
+└──────────┬──────────┘
+           │ depends on
+           ↓
+┌─────────────────────┐
+│  @mariozechner/     │
+│   coding-agent      │  ← Headless agent + file tools
+└──────────┬──────────┘
+           │ depends on
+           ↓
+    ┌──────────┬──────────┐
+    ↓          ↓          ↓
+┌────────┐ ┌───────┐ ┌────────┐
+│ TUI    │ │ VSCode│ │ Web UI │
+│ Client │ │  Ext  │ │ (own)  │
+└────────┘ └───────┘ └────────┘
+```
+
+## Package: @mariozechner/coding-agent
+
+### Core Agent Class
+
+```typescript
+export interface CodingAgentConfig {
+  systemPrompt: string;
+  model: Model<any>;
+  reasoning?: "low" | "medium" | "high";
+  apiKey: string;
+}
+
+export interface CodingAgentOptions {
+  config: CodingAgentConfig;
+  sessionManager?: SessionManager;
+  workingDirectory?: string;
+}
+
+export class CodingAgent {
+  constructor(options: CodingAgentOptions);
+
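+  // Send a message to the agent and stream the resulting events.
+  // Not declared `async`: the method returns an AsyncIterable, not a Promise.
+  prompt(message: string, signal?: AbortSignal): AsyncIterable<AgentEvent>;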
+
+  // Restore message history from a saved session (for --continue mode)
+  setMessages(messages: Message[]): void;
+
+  // Get current message history
+  getMessages(): Message[];
+}
+```
+
+**Key design decisions:**
+1. **AsyncIterable instead of callbacks** - More flexible for consumers
+2. **Signal per prompt** - Each prompt() call accepts its own AbortSignal
+3. **No internal state management** - Consumers handle UI state
+4. **Simple message management** - Get/set for session restoration
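+
+A minimal sketch of the first two decisions, assuming `prompt()` is implemented as an async generator. `demoPrompt`, `DemoEvent`, and `collect` are hypothetical names used only for illustration; the real method would forward events from `agentLoop`.
+
+```typescript
+type DemoEvent = { type: "tick"; n: number } | { type: "aborted" };
+
+// Hypothetical stand-in for prompt(): yields events until done or aborted.
+async function* demoPrompt(steps: number, signal?: AbortSignal): AsyncIterable<DemoEvent> {
+  for (let n = 1; n <= steps; n++) {
+    // The signal belongs to this call only, so each prompt can be cancelled independently.
+    if (signal?.aborted) {
+      yield { type: "aborted" };
+      return;
+    }
+    yield { type: "tick", n };
+  }
+}
+
+// Consumer side: a plain for-await loop, aborting after a chosen tick.
+async function collect(steps: number, abortAfter: number): Promise<string[]> {
+  const controller = new AbortController();
+  const seen: string[] = [];
+  for await (const event of demoPrompt(steps, controller.signal)) {
+    seen.push(event.type === "tick" ? `tick:${event.n}` : "aborted");
+    if (event.type === "tick" && event.n === abortAfter) controller.abort();
+  }
+  return seen;
+}
+```
+
+Because the consumer pulls events, backpressure comes for free: the generator does not advance until the loop body has finished handling the previous event.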
+
+### Usage Example (TUI)
+
+```typescript
+import { CodingAgent, SessionManager } from "@mariozechner/coding-agent";
+import { getModel } from "@mariozechner/pi-ai"; // assumed export location
+
+const session = new SessionManager({ continue: true });
+const agent = new CodingAgent({
+  config: {
+    systemPrompt: "You are a coding assistant...",
+    model: getModel("openai", "gpt-4"),
+    apiKey: process.env.OPENAI_API_KEY!,
+  },
+  sessionManager: session,
+  workingDirectory: process.cwd(),
+});
+
+// Restore previous session
+if (session.hasData()) {
+  agent.setMessages(session.getMessages());
+}
+
+// Send prompt with abort support
+const controller = new AbortController();
+for await (const event of agent.prompt("Fix the bug in server.ts", controller.signal)) {
+  switch (event.type) {
+    case "message_update":
+      renderer.updateAssistant(event.message);
+      break;
+    case "tool_execution_start":
+      renderer.showTool(event.toolName, event.args);
+      break;
+    case "tool_execution_end":
+      renderer.showToolResult(event.toolName, event.result);
+      break;
+  }
+}
+```
+
+### Session Manager
+
+```typescript
+export interface SessionManagerOptions {
+  continue?: boolean;           // Resume most recent session
+  directory?: string;            // Custom session directory
+}
+
+export interface SessionMetadata {
+  id: string;
+  timestamp: string;
+  cwd: string;
+  config: CodingAgentConfig;
+}
+
+export interface SessionData {
+  metadata: SessionMetadata;
+  messages: Message[];          // Conversation history
+  totalUsage: TokenUsage;       // Aggregated token usage
+}
+
+export class SessionManager {
+  constructor(options?: SessionManagerOptions);
+
+  // Start a new session (writes metadata)
+  startSession(config: CodingAgentConfig): void;
+
+  // Log an event (appends to JSONL)
+  appendEvent(event: AgentEvent): void;
+
+  // Check if session has existing data
+  hasData(): boolean;
+
+  // Get full session data
+  getData(): SessionData | null;
+
+  // Get just the messages for agent restoration
+  getMessages(): Message[];
+
+  // Get session file path
+  getFilePath(): string;
+
+  // Get session ID
+  getId(): string;
+}
+```
+
+**Session Storage Format (JSONL):**
+```jsonl
+{"type":"session","id":"uuid","timestamp":"2025-10-12T10:00:00Z","cwd":"/path","config":{...}}
+{"type":"event","timestamp":"2025-10-12T10:00:01Z","event":{"type":"turn_start"}}
+{"type":"event","timestamp":"2025-10-12T10:00:02Z","event":{"type":"message_start",...}}
+{"type":"event","timestamp":"2025-10-12T10:00:03Z","event":{"type":"message_end",...}}
+```
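+
+One way `getMessages()` could rebuild history from this log is sketched below. It assumes that every `message_end` event carries a complete message and that keeping those payloads in order is sufficient; the proposal does not spell out the restore rule, so treat `messagesFromJsonl` and `SessionLine` as hypothetical.
+
+```typescript
+interface SessionLine {
+  type: "session" | "event";
+  timestamp: string;
+  event?: { type: string; message?: unknown };
+}
+
+// Rebuild message history by keeping the payload of every message_end event.
+function messagesFromJsonl(jsonl: string): unknown[] {
+  const messages: unknown[] = [];
+  for (const raw of jsonl.split("\n")) {
+    const line = raw.trim();
+    if (line === "") continue; // tolerate the trailing newline of an append-only log
+    const entry = JSON.parse(line) as SessionLine;
+    if (entry.type === "event" && entry.event?.type === "message_end" && entry.event.message !== undefined) {
+      messages.push(entry.event.message);
+    }
+  }
+  return messages;
+}
+```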
+
+**Session File Naming:**
+```
+~/.pi/sessions/--path-to-project--/
+  2025-10-12T10-00-00-000Z_uuid.jsonl
+  2025-10-12T11-30-00-000Z_uuid.jsonl
+```
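+
+The naming scheme can be sketched as follows. The directory rule (path separators become `-`, wrapped in `--`) is inferred from the example above and is an assumption, as are the helper names.
+
+```typescript
+import * as path from "node:path";
+
+// Assumed convention: strip outer separators, turn the rest into "-", wrap in "--".
+function sessionDirName(cwd: string): string {
+  const trimmed = cwd.replace(/^[\/\\]+|[\/\\]+$/g, "");
+  return `--${trimmed.replace(/[\/\\:]+/g, "-")}--`;
+}
+
+function sessionFileName(startedAt: Date, id: string): string {
+  // ISO timestamps contain ":" and ".", which are awkward in file names.
+  const stamp = startedAt.toISOString().replace(/[:.]/g, "-");
+  return `${stamp}_${id}.jsonl`;
+}
+
+function sessionFilePath(root: string, cwd: string, startedAt: Date, id: string): string {
+  return path.join(root, sessionDirName(cwd), sessionFileName(startedAt, id));
+}
+```
+
+Putting the timestamp first means a plain lexicographic sort of the directory yields sessions in chronological order, which keeps "resume most recent" simple.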
+
+### Tool: BashTool
+
+```typescript
+export interface BashToolDetails {
+  command: string;
+  exitCode: number;
+  duration: number;  // milliseconds
+}
+
+export const bashToolSchema = Type.Object({
+  command: Type.String({ description: "Shell command to execute" }),
+});
+
+export class BashTool implements AgentTool<typeof bashToolSchema, BashToolDetails> {
+  name = "bash";
+  label = "Execute Shell Command";
+  description = "Execute a bash command in the working directory";
+  parameters = bashToolSchema;
+
+  constructor(private workingDirectory: string) {}
+
+  async execute(
+    toolCallId: string,
+    params: { command: string },
+    signal?: AbortSignal
+  ): Promise<AgentToolResult<BashToolDetails>> {
+    // Spawn child process with signal support
+    // Capture stdout/stderr
+    // Handle 1MB output limit with truncation
+    // Return structured result
+  }
+}
+```
+
+**Key Features:**
+- Abort support via signal → child process kill
+- 1MB output limit (prevents memory exhaustion)
+- Exit code tracking
+- Working directory context
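+
+A rough sketch of how these features might fit together, using Node's `child_process.spawn`. `runBash`, `BashRun`, and `formatBashOutput` are hypothetical names, and the byte cap is applied across stdout and stderr combined, which is one possible reading of the 1MB limit.
+
+```typescript
+import { spawn } from "node:child_process";
+
+const MAX_OUTPUT_BYTES = 1024 * 1024; // 1MB cap from the proposal
+
+interface BashRun {
+  stdout: string;
+  stderr: string;
+  exitCode: number | null; // null when the process was killed by a signal
+  duration: number;        // milliseconds
+  truncated: boolean;
+}
+
+// Hypothetical helper: run a command, stop capturing at the cap, kill on abort.
+function runBash(command: string, cwd: string, signal?: AbortSignal): Promise<BashRun> {
+  return new Promise<BashRun>((resolve, reject) => {
+    if (signal?.aborted) {
+      reject(new Error("aborted"));
+      return;
+    }
+    const started = Date.now();
+    const child = spawn("bash", ["-c", command], { cwd });
+    let stdout = "";
+    let stderr = "";
+    let bytes = 0;
+    let truncated = false;
+
+    // Keep at most MAX_OUTPUT_BYTES in total (a cut may split a UTF-8 character).
+    const keep = (chunk: Buffer): string => {
+      const room = MAX_OUTPUT_BYTES - bytes;
+      const kept = chunk.subarray(0, Math.max(room, 0));
+      if (kept.length < chunk.length) truncated = true;
+      bytes += kept.length;
+      return kept.toString("utf8");
+    };
+    child.stdout.on("data", (chunk: Buffer) => { stdout += keep(chunk); });
+    child.stderr.on("data", (chunk: Buffer) => { stderr += keep(chunk); });
+
+    const onAbort = () => { child.kill("SIGTERM"); };
+    signal?.addEventListener("abort", onAbort, { once: true });
+
+    child.on("error", (err) => {
+      signal?.removeEventListener("abort", onAbort);
+      reject(err);
+    });
+    child.on("close", (exitCode) => {
+      signal?.removeEventListener("abort", onAbort);
+      if (signal?.aborted) reject(new Error("aborted"));
+      else resolve({ stdout, stderr, exitCode, duration: Date.now() - started, truncated });
+    });
+  });
+}
+
+// Text handed to the LLM, in the stdout / stderr / exit code layout this section specifies.
+function formatBashOutput(run: BashRun): string {
+  return `stdout:\n${run.stdout}\nstderr:\n${run.stderr}\nexit code: ${run.exitCode}`;
+}
+```
+
+Note that `SIGTERM` only reaches the `bash` process itself; grandchildren started by the command may survive, so a real implementation would likely need process-group handling.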
+
+**Output Format:**
+```typescript
+{
+  output: "stdout:\n<content>\nstderr:\n<content>\nexit code: 0",
+  details: {
+    command: "npm test",
+    exitCode: 0,
+    duration: 1234
+  }
+}
+```
+
+### Tool: ReadTool
+
+```typescript
+export interface ReadToolDetails {
+  filePath: string;
+  totalLines: number;
+  linesRead: number;
+  offset: number;
+  truncated: boolean;
+}
+
+export const readToolSchema = Type.Object({
+  file_path: Type.String({ description: "Path to file to read (relative or absolute)" }),
+  offset: Type.Optional(Type.Number({
+    description: "Line number to start reading from (1-indexed). Omit to read from beginning.",
+    minimum: 1
+  })),
+  limit: Type.Optional(Type.Number({
+    description: "Maximum number of lines to read. Omit to read entire file (max 5000 lines).",
+    minimum: 1,
+    maximum: 5000
+  })),
+});
+
+export class ReadTool implements AgentTool<typeof readToolSchema, ReadToolDetails> {
+  name = "read";
+  label = "Read File";
+  description = "Read file contents. For files >5000 lines, use offset and limit to read in chunks.";
+  parameters = readToolSchema;
+
+  constructor(private workingDirectory: string) {}
+
+  async execute(
+    toolCallId: string,
+    params: { file_path: string; offset?: number; limit?: number },
+    signal?: AbortSignal
+  ): Promise<AgentToolResult<ReadToolDetails>> {
+    // Resolve file path (relative to workingDirectory)
+    // Count total lines in file
+    // If no offset/limit: read up to 5000 lines, warn if truncated
+    // If offset/limit: read specified range
+    // Format with line numbers (using cat -n style)
+    // Return content + metadata
+  }
+}
+```
+
+**Key Features:**
+- **Full file read**: Up to 5000 lines (warns LLM if truncated)
+- **Ranged read**: Specify offset + limit for large files
+- **Line numbers**: Output formatted like `cat -n` (1-indexed)
+- **Abort support**: Can cancel during large file reads
+- **Metadata**: Total line count, lines read, truncation status
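+
+The core of the read logic is a pure function over the file contents, sketched here. `readSlice` is a hypothetical helper; it assumes `truncated` is only set when the caller gave no explicit `limit` and lines remain past the cap.
+
+```typescript
+const MAX_LINES = 5000;
+
+interface ReadSlice {
+  output: string;
+  totalLines: number;
+  linesRead: number;
+  offset: number; // 1-indexed, matching the schema
+  truncated: boolean;
+}
+
+function readSlice(content: string, offset = 1, limit?: number): ReadSlice {
+  const lines = content.split("\n");
+  // A trailing newline produces an empty last element; drop it so counts match editors.
+  if (lines.length > 0 && lines[lines.length - 1] === "") lines.pop();
+  const totalLines = lines.length;
+  if (offset > Math.max(totalLines, 1)) {
+    throw new Error(`offset ${offset} is past the end of the file (${totalLines} lines)`);
+  }
+  const count = Math.min(limit ?? MAX_LINES, MAX_LINES);
+  const selected = lines.slice(offset - 1, offset - 1 + count);
+  // Right-align the line number in six columns, like `cat -n`.
+  const output = selected
+    .map((text, i) => `${String(offset + i).padStart(6)}  ${text}`)
+    .join("\n");
+  const end = offset - 1 + selected.length;
+  return {
+    output,
+    totalLines,
+    linesRead: selected.length,
+    offset,
+    truncated: limit === undefined && end < totalLines,
+  };
+}
+```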
+
+**Output Format (full file):**
+```typescript
+{
+  output: `     1  import { foo } from './foo';
+     2  import { bar } from './bar';
+     3
+     4  export function main() {
+     5    console.log('hello');
+     6  }`,
+  details: {
+    filePath: "src/main.ts",
+    totalLines: 6,
+    linesRead: 6,
+    offset: 1,
+    truncated: false
+  }
+}
+```
+
+**Output Format (large file, truncated):**
+```typescript
+{
+  output: `WARNING: File has 10000 lines, showing first 5000. Use offset and limit parameters to read more.
+
+     1  import { foo } from './foo';
+     2  import { bar } from './bar';
+     ...
+  5000  const x = 42;`,
+  details: {
+    filePath: "src/large.ts",
+    totalLines: 10000,
+    linesRead: 5000,
+    offset: 1,
+    truncated: true
+  }
+}
+```
+
+**Output Format (ranged read):**
+```typescript
+{
+  output: `  1000  function middleware() {
+  1001    return (req, res, next) => {
+  1002      console.log('middleware');
+  1003      next();
+  1004    };
+  1005  }`,
+  details: {
+    filePath: "src/server.ts",
+    totalLines: 10000,
+    linesRead: 6,
+    offset: 1000,
+    truncated: false
+  }
+}
+```
+
+**Error Cases:**
+- File not found → error
+- Offset > total lines → error
+- Binary file detected → error (suggest using bash tool)
+
+**Usage Examples in System Prompt:**
+```
+To read a large file:
+1. read(file_path="src/large.ts") // Gets first 5000 lines + total count
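+2. If truncated, read the remaining chunks, stopping once the next offset would exceed the reported total:
+   read(file_path="src/large.ts", offset=5001, limit=5000)
+   read(file_path="src/large.ts", offset=10001, limit=5000)  // only if the file has more than 10000 lines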
+```
+
+### Tool: EditTool
+
+```typescript
+export interface EditToolDetails {
+  filePath: string;
+  oldString: string;
+  newString: string;
+  matchCount: number;
+  linesChanged: number;
+}
+
+export const editToolSchema = Type.Object({
+  file_path: Type.String({ description: "Path to file to edit (relative or absolute)" }),
+  old_string: Type.String({ description: "Exact string to find and replace" }),
+  new_string: Type.String({ description: "String to replace with" }),
+});
+
+export class EditTool implements AgentTool<typeof editToolSchema, EditToolDetails> {
+  name = "edit";
+  label = "Edit File";
+  description = "Find and replace exact string in a file";
+  parameters = editToolSchema;
+
+  constructor(private workingDirectory: string) {}
+
+  async execute(
+    toolCallId: string,
+    params: { file_path: string; old_string: string; new_string: string },
+    signal?: AbortSignal
+  ): Promise<AgentToolResult<EditToolDetails>> {
+    // Resolve file path (relative to workingDirectory)
+    // Read file contents
+    // Find old_string (must be exact match)
+    // Replace with new_string
+    // Write file back
+    // Return stats
+  }
+}
+```
+
+**Key Features:**
+- Exact string matching (no regex)
+- Safe atomic writes (write temp → rename)
+- Abort support (cancel before write)
+- Match validation (error if old_string not found)
+- Line-based change tracking
+
+**Output Format:**
+```typescript
+{
+  output: "Replaced 1 occurrence in src/server.ts (1 line changed)",
+  details: {
+    filePath: "src/server.ts",
+    oldString: "const port = 3000;",
+    newString: "const port = process.env.PORT || 3000;",
+    matchCount: 1,
+    linesChanged: 1
+  }
+}
+```
+
+**Error Cases:**
+- File not found → error
+- old_string not found → error
+- Multiple matches for old_string → error (ambiguous)
+- File changed during operation → error (race condition)
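+
+The matching rules above reduce to a small pure function, sketched below. `applyExactEdit` is a hypothetical helper, and the `linesChanged` metric (lines spanned by the larger of the two strings) is an assumption, since the proposal does not define it.
+
+```typescript
+interface EditOutcome {
+  content: string;
+  matchCount: number;
+  linesChanged: number;
+}
+
+// Count non-overlapping occurrences without involving regular expressions.
+function countOccurrences(haystack: string, needle: string): number {
+  if (needle === "") return 0;
+  let count = 0;
+  let from = 0;
+  while (true) {
+    const at = haystack.indexOf(needle, from);
+    if (at === -1) return count;
+    count++;
+    from = at + needle.length;
+  }
+}
+
+function applyExactEdit(content: string, oldString: string, newString: string): EditOutcome {
+  const matchCount = countOccurrences(content, oldString);
+  if (matchCount === 0) throw new Error("old_string not found");
+  if (matchCount > 1) throw new Error(`old_string is ambiguous: ${matchCount} matches`);
+  const at = content.indexOf(oldString);
+  // Splice manually: String.prototype.replace would interpret "$&" and friends in newString.
+  const updated = content.slice(0, at) + newString + content.slice(at + oldString.length);
+  const linesChanged = Math.max(oldString.split("\n").length, newString.split("\n").length);
+  return { content: updated, matchCount, linesChanged };
+}
+```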
+
+### Tool: WriteTool
+
+```typescript
+export interface WriteToolDetails {
+  filePath: string;
+  size: number;
+  isNew: boolean;
+}
+
+export const writeToolSchema = Type.Object({
+  file_path: Type.String({ description: "Path to file to create/overwrite" }),
+  content: Type.String({ description: "Full file contents to write" }),
+});
+
+export class WriteTool implements AgentTool<typeof writeToolSchema, WriteToolDetails> {
+  name = "write";
+  label = "Write File";
+  description = "Create a new file or completely replace existing file contents";
+  parameters = writeToolSchema;
+
+  constructor(private workingDirectory: string) {}
+
+  async execute(
+    toolCallId: string,
+    params: { file_path: string; content: string },
+    signal?: AbortSignal
+  ): Promise<AgentToolResult<WriteToolDetails>> {
+    // Resolve file path
+    // Check if file exists (track isNew)
+    // Create parent directories if needed
+    // Write content atomically
+    // Return stats
+  }
+}
+```
+
+**Key Features:**
+- Creates parent directories automatically
+- Safe atomic writes
+- Abort support
+- No size limits (trust LLM context limits)
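+
+A minimal sketch of the atomic write, using the write temp → rename pattern that EditTool also describes. It uses synchronous `fs` calls for brevity; `writeFileAtomic` and the temp-file naming are hypothetical, and an abortable implementation would use the promise-based API instead.
+
+```typescript
+import * as fs from "node:fs";
+import * as os from "node:os";
+import * as path from "node:path";
+
+interface WriteOutcome {
+  filePath: string;
+  size: number; // bytes
+  isNew: boolean;
+}
+
+function writeFileAtomic(filePath: string, content: string): WriteOutcome {
+  const isNew = !fs.existsSync(filePath);
+  const dir = path.dirname(filePath);
+  fs.mkdirSync(dir, { recursive: true }); // create parent directories if needed
+  // Temp file lives in the same directory so the rename stays on one filesystem.
+  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
+  try {
+    fs.writeFileSync(tmpPath, content, "utf8");
+    fs.renameSync(tmpPath, filePath);
+  } catch (err) {
+    fs.rmSync(tmpPath, { force: true }); // do not leave a stray temp file behind
+    throw err;
+  }
+  return { filePath, size: Buffer.byteLength(content, "utf8"), isNew };
+}
+
+// Usage: write into a throwaway directory under the OS temp dir.
+const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "write-sketch-"));
+const firstWrite = writeFileAtomic(path.join(demoDir, "src", "utils", "helper.ts"), "export {};\n");
+```
+
+Readers of the target path see either the old contents or the new contents, never a half-written file, because the rename replaces the directory entry in one step on POSIX filesystems.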
+
+**Output Format:**
+```typescript
+{
+  output: "Created new file src/utils/helper.ts (142 bytes)",
+  details: {
+    filePath: "src/utils/helper.ts",
+    size: 142,
+    isNew: true
+  }
+}
+```
+
+## Package: @mariozechner/coding-agent-tui
+
+### CLI Interface
+
+```bash
+# Interactive mode (default)
+coding-agent
+
+# Continue previous session
+coding-agent --continue
+
+# Single-shot mode
+coding-agent "Fix the TypeScript errors"
+
+# Multiple prompts
+coding-agent "Add validation" "Write tests"
+
+# Custom model
+coding-agent --model openai/gpt-4 --api-key $KEY
+
+# JSON output (for piping)
+coding-agent --json < prompts.jsonl > results.jsonl
+```
+
+### Arguments
+
+```typescript
+{
+  "base-url": string;        // API endpoint
+  "api-key": string;         // API key (or env var)
+  "model": string;           // Model identifier
+  "system-prompt": string;   // System prompt
+  "continue": boolean;       // Resume session
+  "json": boolean;           // JSONL I/O mode
+  "help": boolean;           // Show help
+}
+```
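+
+These flags could be parsed with Node's built-in `util.parseArgs`, as in the sketch below. Treating positionals as prompts matches the CLI examples above; `parseCli` itself is a hypothetical helper, and the old agent package may well parse arguments differently.
+
+```typescript
+import { parseArgs } from "node:util";
+
+function parseCli(argv: string[]) {
+  const { values, positionals } = parseArgs({
+    args: argv,
+    allowPositionals: true, // bare arguments are treated as prompts
+    options: {
+      "base-url": { type: "string" },
+      "api-key": { type: "string" },
+      model: { type: "string" },
+      "system-prompt": { type: "string" },
+      continue: { type: "boolean", default: false },
+      json: { type: "boolean", default: false },
+      help: { type: "boolean", default: false },
+    },
+  });
+  return { options: values, prompts: positionals };
+}
+```
+
+In the default strict mode, `parseArgs` throws on unknown flags, which is a reasonable behavior for a CLI that would otherwise silently treat a typo as a prompt.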
+
+### Renderers
+
+**TuiRenderer** - Rich terminal UI
+- Real-time streaming output
+- Syntax highlighting for code
+- Tool execution indicators
+- Progress spinners
+- Token usage stats
+- Keyboard shortcuts (Ctrl+C to abort)
+
+**ConsoleRenderer** - Simple console output
+- Plain text output
+- No ANSI codes
+- Good for logging/CI
+
+**JsonRenderer** - JSONL output
+- One JSON object per line
+- Each line is a complete event
+- For piping/processing
+
+### JSON Mode Example
+
+Input (stdin):
+```jsonl
+{"type":"message","content":"List all TypeScript files"}
+{"type":"interrupt"}
+{"type":"message","content":"Count the files"}
+```
+
+Output (stdout):
+```jsonl
+{"type":"turn_start","timestamp":"..."}
+{"type":"message_start","message":{...}}
+{"type":"tool_execution_start","toolCallId":"...","toolName":"bash","args":{...}}
+{"type":"tool_execution_end","toolCallId":"...","result":"..."}
+{"type":"message_end","message":{...}}
+{"type":"turn_end"}
+{"type":"interrupted"}
+{"type":"message_start","message":{...}}
+...
+```
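+
+A small sketch of both directions of JSON mode: decoding one stdin line into a command and encoding events as JSONL. Only the two command types shown in the example are handled; `parseInputLine`, `toJsonl`, and `InputCommand` are hypothetical names.
+
+```typescript
+type InputCommand = { type: "message"; content: string } | { type: "interrupt" };
+
+// Decode one stdin line, rejecting anything that is not a known command.
+function parseInputLine(line: string): InputCommand {
+  const value: unknown = JSON.parse(line);
+  if (typeof value !== "object" || value === null) throw new Error("expected a JSON object");
+  const record = value as Record<string, unknown>;
+  if (record["type"] === "interrupt") return { type: "interrupt" };
+  const content = record["content"];
+  if (record["type"] === "message" && typeof content === "string") {
+    return { type: "message", content };
+  }
+  throw new Error(`unsupported command: ${String(record["type"])}`);
+}
+
+// Encode events one per line; every line is a complete JSON object.
+function toJsonl(events: object[]): string {
+  return events.map((event) => JSON.stringify(event) + "\n").join("");
+}
+```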
+
+## Integration Patterns
+
+### VS Code Extension
+
+```typescript
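+import * as path from "path";
+import { CodingAgent, SessionManager } from "@mariozechner/coding-agent";
+import { getModel } from "@mariozechner/pi-ai"; // assumed export location
+import * as vscode from "vscode";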
+
+class CodingAgentProvider {
+  private agent: CodingAgent;
+  private outputChannel: vscode.OutputChannel;
+
+  constructor() {
+    const workspaceRoot = vscode.workspace.workspaceFolders?.[0]?.uri.fsPath || process.cwd();
+    const session = new SessionManager({
+      directory: path.join(workspaceRoot, ".vscode", "agent-sessions")
+    });
+
+    this.agent = new CodingAgent({
+      config: {
+        systemPrompt: "You are a coding assistant...",
+        model: getModel("openai", "gpt-4"),
+        apiKey: vscode.workspace.getConfiguration("codingAgent").get("apiKey")!,
+      },
+      sessionManager: session,
+      workingDirectory: workspaceRoot,
+    });
+
+    this.outputChannel = vscode.window.createOutputChannel("Coding Agent");
+  }
+
+  async executePrompt(prompt: string) {
+    const cancellation = new vscode.CancellationTokenSource();
+
+    // Convert VS Code cancellation to AbortSignal
+    const controller = new AbortController();
+    cancellation.token.onCancellationRequested(() => controller.abort());
+
+    for await (const event of this.agent.prompt(prompt, controller.signal)) {
+      switch (event.type) {
+        case "message_update":
+          this.outputChannel.appendLine(event.message.content[0].text);
+          break;
+        case "tool_execution_start":
+          vscode.window.showInformationMessage(`Running: ${event.toolName}`);
+          break;
+        case "tool_execution_end":
+          if (event.isError) {
+            vscode.window.showErrorMessage(`Tool failed: ${event.result}`);
+          }
+          break;
+      }
+    }
+  }
+}
+```
+
+### Headless Server/API
+
+```typescript
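+import { CodingAgent } from "@mariozechner/coding-agent";
+import { getModel } from "@mariozechner/pi-ai"; // assumed export location
+import express from "express";
+
+const app = express();
+app.use(express.json()); // required so req.body is populated for JSON requests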
+
+app.post("/api/prompt", async (req, res) => {
+  const { prompt, sessionId } = req.body;
+
+  const agent = new CodingAgent({
+    config: {
+      systemPrompt: "...",
+      model: getModel("openai", "gpt-4"),
+      apiKey: process.env.OPENAI_API_KEY!,
+    },
+    workingDirectory: `/tmp/workspaces/${sessionId}`,
+  });
+
+  // Stream SSE
+  res.setHeader("Content-Type", "text/event-stream");
+
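+  const controller = new AbortController();
+  // Abort when the client disconnects. Listen on res: on recent Node versions,
+  // req "close" can fire as soon as the request body has been read.
+  res.on("close", () => controller.abort());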
+
+  for await (const event of agent.prompt(prompt, controller.signal)) {
+    res.write(`data: ${JSON.stringify(event)}\n\n`);
+  }
+
+  res.end();
+});
+
+app.listen(3000);
+```
+
+## Migration Plan
+
+### Phase 1: Extract Core Package
+1. Create `packages/coding-agent/` structure
+2. Port SessionManager from old agent package
+3. Implement ReadTool, BashTool, EditTool, WriteTool
+4. Implement CodingAgent class using pi-ai/agentLoop
+5. Write tests for each tool
+6. Write integration tests
+
+### Phase 2: Build TUI
+1. Create `packages/coding-agent-tui/`
+2. Port TuiRenderer from old agent package
+3. Port ConsoleRenderer, JsonRenderer
+4. Implement CLI argument parsing
+5. Implement interactive and single-shot modes
+6. Test session resume functionality
+
+### Phase 3: Update Dependencies
+1. Update web-ui if needed (should be unaffected)
+2. Deprecate old agent package
+3. Update documentation
+4. Update examples
+
+### Phase 4: Future Enhancements
+1. Build VS Code extension
+2. Add more tools (grep, find, etc.) as optional
+3. Plugin system for custom tools
+4. Parallel tool execution
+
+## Open Questions & Decisions
+
+### 1. Should EditTool support multiple replacements?
+
+**Option A:** Error on multiple matches (current proposal)
+- Forces explicit, unambiguous edits
+- LLM must be precise with context
+- Safer (no accidental mass replacements)
+
+**Option B:** Replace all matches
+- More convenient for bulk changes
+- Risk of unintended replacements
+- Need `replace_all: boolean` flag
+
+**Decision:** Start with Option A, add replace_all flag if needed.
+
+### 2. ReadTool line limit and pagination strategy?
+
+**Decision:** 5000 line default limit with offset/limit pagination
+
+**Rationale:**
+- **5000 lines** balances context vs token usage (typical file fits in one read)
+- **Line-based pagination** is intuitive for LLM (matches how humans think about code)
+- **cat -n format** with line numbers helps LLM reference specific lines in edits
+- **Automatic truncation warning** teaches LLM to paginate when needed
+
+**Alternative considered:** Byte-based limits (rejected - harder for LLM to reason about)
+
+**System prompt guidance:**
+```
+When reading large files:
+1. First read without offset/limit to get total line count
+2. If truncated, calculate chunks: ceil(totalLines / 5000)
+3. Read each chunk with appropriate offset
+```
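+
+The chunk arithmetic in this guidance can be written out as a short helper. `chunkRequests` is a hypothetical name; offsets are 1-indexed to match the `read` schema.
+
+```typescript
+const CHUNK_LINES = 5000;
+
+// List the (offset, limit) pairs needed to cover a file of totalLines lines.
+function chunkRequests(totalLines: number, chunk = CHUNK_LINES): Array<{ offset: number; limit: number }> {
+  const chunks = Math.ceil(totalLines / chunk);
+  return Array.from({ length: chunks }, (_, i) => ({ offset: i * chunk + 1, limit: chunk }));
+}
+```
+
+For a 10000-line file this yields exactly two reads, at offsets 1 and 5001; a third read at offset 10001 is only needed once the file exceeds 10000 lines.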
+
+### 3. Should ReadTool handle binary files?
+
+**Decision:** Error on binary files with helpful message
+
+**Error message:**
+```
+Error: Cannot read binary file 'dist/app.js'. Use bash tool if you need to inspect: bash(command="file dist/app.js") or bash(command="xxd dist/app.js | head")
+```
+
+**Rationale:**
+- Binary files are rarely useful to LLM
+- Clear error message teaches LLM to use appropriate tools
+- Prevents token waste on unreadable content
+
+**Binary detection:** Check for null bytes in first 8KB (same strategy as `git diff`)
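+
+The null-byte check is only a few lines. In this sketch, `looksBinary` and the `SNIFF_BYTES` constant are hypothetical names, and "8KB" is read as 8192 bytes.
+
+```typescript
+const SNIFF_BYTES = 8 * 1024;
+
+// Treat data as binary if a NUL byte appears within the first SNIFF_BYTES bytes.
+function looksBinary(data: Uint8Array): boolean {
+  const end = Math.min(data.length, SNIFF_BYTES);
+  for (let i = 0; i < end; i++) {
+    if (data[i] === 0) return true;
+  }
+  return false;
+}
+```
+
+One known limitation of this heuristic: UTF-16 text contains NUL bytes for ASCII characters, so such files would be reported as binary.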
+
+### 4. Should EditTool support regex?
+
+**Current proposal:** No regex, exact string match only
+
+**Pros of exact match:**
+- Simple implementation
+- No regex escaping issues
+- Clear error messages
+- Safer (no accidental broad matches)
+
+**Cons:**
+- Less powerful
+- Multiple edits needed for patterns
+
+**Decision:** Exact match only. LLM can use bash/sed for complex patterns.
+
+### 5. Working directory enforcement?
+
+**Question:** Should tools be sandboxed to workingDirectory?
+
+**Option A:** Enforce sandbox (only access files under workingDirectory)
+- Safer
+- Prevents accidental system file edits
+- Clear boundaries
+
+**Option B:** Allow any path
+- More flexible
+- LLM can edit config files, etc.
+- User's responsibility to review
+
+**Decision:** Start with Option B (no sandbox). Add `--sandbox` flag later if needed.
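+
+If the sandbox from Option A is added later, the path check could look roughly like this. `resolveInside` is a hypothetical helper; it compares resolved paths only and does not follow symlinks, so a symlink inside the working directory could still point outside it.
+
+```typescript
+import * as path from "node:path";
+
+// Resolve a requested path and reject it if it lands outside workingDirectory.
+function resolveInside(workingDirectory: string, requested: string): string {
+  const root = path.resolve(workingDirectory);
+  const target = path.resolve(root, requested);
+  const rel = path.relative(root, target);
+  // ".." or "../x" means the target is above root; an absolute rel means another drive (Windows).
+  if (rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel)) {
+    throw new Error(`Path escapes working directory: ${requested}`);
+  }
+  return target;
+}
+```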
+
+### 6. Tool output size limits?
+
+**Current proposal:**
+- ReadTool: 5000 line limit per read (paginate for more)
+- BashTool: 1MB truncation
+- EditTool: No limit (reasonable file sizes expected)
+- WriteTool: No limit (LLM context limited)
+
+**Alternative:** Enforce global 1MB limit on all tool outputs
+
+**Decision:** Per-tool limits. ReadTool and BashTool need it most.
+
+### 7. How to handle long-running bash commands?
+
+**Question:** Should BashTool stream output or wait for completion?
+
+**Option A:** Wait for completion (current proposal)
+- Simpler implementation
+- Full output available for LLM
+- Blocks until done
+
+**Option B:** Stream output
+- Better UX (show progress)
+- More complex (need to handle partial output)
+- LLM sees final output only
+
+**Decision:** Wait for completion initially. Add streaming later if needed.
+
+### 8. Package naming alternatives?
+
+**Current proposal:**
+- `@mariozechner/coding-agent` (core)
+- `@mariozechner/coding-agent-tui` (TUI)
+
+**Alternatives:**
+- `@mariozechner/file-agent` / `@mariozechner/file-agent-tui`
+- `@mariozechner/dev-agent` / `@mariozechner/dev-agent-tui`
+- `@mariozechner/pi-code` / `@mariozechner/pi-code-tui`
+
+**Decision:** `coding-agent` is clear and specific to the use case.
+
+## Summary
+
+This architecture provides:
+
+✅ **Headless core** - Clean separation between agent logic and UI
+✅ **Reusable** - Same agent for TUI, VS Code, web, APIs
+✅ **Composable** - Build on pi-ai primitives
+✅ **Abortable** - First-class cancellation support
+✅ **Session persistence** - Resume conversations seamlessly
+✅ **Focused tools** - read, bash, edit, write (4 tools, no more)
+✅ **Smart pagination** - 5000-line chunks with offset/limit
+✅ **Type-safe** - Full TypeScript with schema validation
+✅ **Testable** - Pure functions, mockable dependencies
+
+The key insight is to **keep web-ui's agent separate** (it has different concerns) while creating a **new focused coding agent** for file manipulation workflows that can be shared across non-web interfaces.