Agent package + coding agent WIP, refactored web-ui prompts

2026-04-17 01:04:36 +00:00 · 2025-10-17 11:47:01 +02:00 · 2025-10-17 11:47:01 +02:00 · ffc9be8867
commit ffc9be8867
parent 4e7a340460
58 changed files with 5138 additions and 2206 deletions
--- a/docs/agent.md
+++ b/docs/agent.md
@ -1,15 +1,23 @@
-# Coding Agent Architecture
+# Agent Architecture

 ## Executive Summary

-This document proposes extracting the agent infrastructure from `@mariozechner/pi-web-ui` and `@mariozechner/pi-agent` into a new headless coding agent package that can be reused across multiple UI implementations (TUI, VS Code extension, web interface).
+This document proposes extracting the agent infrastructure from `@mariozechner/pi-web-ui` into two new packages:
+
+1. **`@mariozechner/agent`** - General-purpose agent package with transport abstraction, state management, and attachment support
+2. **`@mariozechner/coding-agent`** - Specialized coding agent built on the general agent, with file manipulation tools and session management

 The new architecture will provide:
- **Headless agent core** with file manipulation tools (read, bash, edit, write)
- **Session management** for conversation persistence and resume capability
+- **General agent core** with transport abstraction (ProviderTransport, AppTransport)
+- **Reactive state management** with subscribe/emit pattern
+- **Attachment support** (type definitions only - processing stays in consumers)
+- **Message transformation** pipeline for filtering and adapting messages
+- **Message queueing** for out-of-band message injection
 - **Full abort support** throughout the execution pipeline
 - **Event-driven API** for flexible UI integration
 - **Clean separation** between agent logic and presentation layer
+- **Coding-specific tools** (read, bash, edit, write) in a specialized package
+- **Session management** for conversation persistence and resume capability

 ## Current Architecture Analysis

@ -18,7 +26,7 @@ The new architecture will provide:
 ```
 pi-mono/
 ├── packages/ai/              # Core AI streaming (GOOD - keep as-is)
-├── packages/web-ui/          # Web UI with agent (GOOD - keep separate)
+├── packages/web-ui/          # Web UI with embedded agent (EXTRACT core agent logic)
 ├── packages/agent/           # OLD - needs to be replaced
 ├── packages/tui/             # Terminal UI lib (GOOD - low-level primitives)
 ├── packages/proxy/           # CORS proxy (unrelated)
@ -77,9 +85,9 @@ interface AgentToolResult<T> {
 }
 ```

-### packages/web-ui/agent - Web Agent
+### packages/web-ui/src/agent - Web Agent

-**Status:** ✅ Good for web use cases, keep separate
+**Status:** ✅ KEEP AS-IS for now, will be replaced later after new packages are proven

 **Architecture:**
 ```typescript
@ -94,60 +102,49 @@ class Agent {
  async prompt(input: string, attachments?: Attachment[]): Promise<void>
  abort(): void
  subscribe(fn: (e: AgentEvent) => void): () => void
+  setSystemPrompt(v: string): void
+  setModel(m: Model<any>): void
+  setThinkingLevel(l: ThinkingLevel): void
+  setTools(t: AgentTool<any>[]): void
+  replaceMessages(ms: AppMessage[]): void
+  appendMessage(m: AppMessage): void
+  async queueMessage(m: AppMessage): Promise<void>
+  clearMessages(): void
 }
 ```

-**Key Features:**
- **Transport abstraction** (ProviderTransport for direct API, AppTransport for server-side)
- **Attachment handling** (images, documents with text extraction)
- **Message transformation** (app messages → LLM messages)
- **Reactive state** (subscribe pattern for UI updates)
- **Message queue** for injecting tool results/errors asynchronously
+**Key Features (will be basis for new `@mariozechner/agent` package):**
+- ✅ **Transport abstraction** (ProviderTransport for direct API, AppTransport for server-side proxy)
+- ✅ **Attachment type definition** (id, type, fileName, mimeType, size, content, extractedText, preview)
+- ✅ **Message transformation** pipeline (app messages → LLM messages, with filtering)
+- ✅ **Reactive state** (subscribe/emit pattern for UI updates)
+- ✅ **Message queueing** for injecting messages out-of-band during agent loop
+- ✅ **Abort support** (AbortController per prompt)
+- ✅ **State management** (systemPrompt, model, thinkingLevel, tools, messages, isStreaming, etc.)

-**Why it's different from coding agent:**
- Browser-specific concerns (CORS, attachments)
- Transport layer for flexible API routing
- Tied to web UI state management
- Supports rich media attachments
+**Strategy:**
+1. Use this implementation as the **reference design** for `@mariozechner/agent`
+2. Create new `@mariozechner/agent` package by copying/adapting this code
+3. Keep web-ui using its own embedded agent until new package is proven stable
+4. Eventually migrate web-ui to use `@mariozechner/agent` (Phase 2 of migration)
+5. Document processing (PDF/DOCX/PPTX/Excel) stays in web-ui permanently

 ### packages/agent - OLD Implementation

-**Status:** ⚠️ MUST BE REPLACED
+**Status:** ⚠️ REMOVE COMPLETELY

-**Architecture:**
-```typescript
-class Agent {
-  constructor(
-    config: AgentConfig,
-    renderer?: AgentEventReceiver,
-    sessionManager?: SessionManager
-  )
+**Why it should be removed:**
+1. **Tightly coupled to OpenAI SDK** - Not provider-agnostic, hardcoded to OpenAI's API
+2. **Outdated architecture** - Superseded by web-ui's better agent design
+3. **Mixed concerns** - Agent logic + tool implementations + rendering all in one package
+4. **Limited scope** - Cannot be reused across different UI implementations

-  async ask(userMessage: string): Promise<void>
-  interrupt(): void
-  setEvents(events: AgentEvent[]): void
-}
-```
+**What to salvage before removal:**
+1. **SessionManager** - Port to `@mariozechner/coding-agent` (JSONL-based session persistence)
+2. **Tool implementations** - Adapt read, bash, edit, write tools for coding-agent
+3. **Renderer abstractions** - Port TuiRenderer/ConsoleRenderer/JsonRenderer concepts to coding-agent-tui

-**Problems:**
-1. **Tightly coupled to OpenAI SDK** (not provider-agnostic)
-2. **Hardcoded tools** (read, list, bash, glob, rg)
-3. **Mixed concerns** (agent logic + tool implementations in same package)
-4. **No separation** between core loop and UI rendering
-5. **Two API paths** (completions vs responses) with branching logic
-
-**Good parts to preserve:**
-1. **SessionManager** - JSONL-based session persistence
-2. **Event receiver pattern** - Clean UI integration
-3. **Abort support** - Proper signal handling
-4. **Renderer abstraction** (ConsoleRenderer, TuiRenderer, JsonRenderer)
-
-**Tools implemented:**
- `read`: Read file contents (1MB limit with truncation)
- `list`: List directory contents
- `bash`: Execute shell command with abort support
- `glob`: Find files matching glob pattern
- `rg`: Run ripgrep search
+**Action:** Delete this package entirely after extracting useful components to the new packages.

 ## Proposed Architecture

@ -155,132 +152,440 @@ class Agent {

 ```
 pi-mono/
-├── packages/ai/                          # [unchanged] Core streaming
-├── packages/coding-agent/                # [NEW] Headless coding agent
+├── packages/ai/                          # [unchanged] Core streaming library
+│
+├── packages/agent/                       # [NEW] General-purpose agent
 │   ├── src/
-│   │   ├── agent.ts                      # Main agent class
-│   │   ├── session-manager.ts            # Session persistence
+│   │   ├── agent.ts                      # Main Agent class
+│   │   ├── types.ts                      # AgentState, AgentEvent, Attachment, etc.
+│   │   ├── transports/
+│   │   │   ├── types.ts                  # AgentTransport interface
+│   │   │   ├── ProviderTransport.ts      # Direct API calls
+│   │   │   ├── AppTransport.ts           # Server-side proxy
+│   │   │   ├── proxy-types.ts            # Proxy event types
+│   │   │   └── index.ts                  # Transport exports
+│   │   └── index.ts                      # Public API
+│   └── package.json
+│
+├── packages/coding-agent/                # [NEW] Coding-specific agent + CLI
+│   ├── src/
+│   │   ├── coding-agent.ts               # CodingAgent wrapper (uses @mariozechner/agent)
+│   │   ├── session-manager.ts            # Session persistence (JSONL)
 │   │   ├── tools/
 │   │   │   ├── read-tool.ts              # Read files (with pagination)
 │   │   │   ├── bash-tool.ts              # Shell execution
 │   │   │   ├── edit-tool.ts              # File editing (old_string → new_string)
 │   │   │   ├── write-tool.ts             # File creation/replacement
 │   │   │   └── index.ts                  # Tool exports
-│   │   └── types.ts                      # Public types
-│   └── package.json
+│   │   ├── cli/
+│   │   │   ├── index.ts                  # CLI entry point
+│   │   │   ├── renderers/
+│   │   │   │   ├── tui-renderer.ts       # Rich terminal UI
+│   │   │   │   ├── console-renderer.ts   # Simple console output
+│   │   │   │   └── json-renderer.ts      # JSONL output for piping
+│   │   │   └── main.ts                   # CLI app logic
+│   │   ├── types.ts                      # Public types
+│   │   └── index.ts                      # Public API (agent + tools)
+│   └── package.json                      # Exports both library + CLI binary
 │
-├── packages/coding-agent-tui/            # [NEW] Terminal interface
+├── packages/web-ui/                      # [updated] Uses @mariozechner/agent
 │   ├── src/
-│   │   ├── cli.ts                        # CLI entry point
-│   │   ├── renderers/
-│   │   │   ├── tui-renderer.ts           # Rich terminal UI
-│   │   │   ├── console-renderer.ts       # Simple console output
-│   │   │   └── json-renderer.ts          # JSONL output for piping
-│   │   └── main.ts                       # App logic
-│   └── package.json
+│   │   ├── utils/
+│   │   │   └── attachment-utils.ts       # Document processing (keep here)
+│   │   └── ...                           # Other web UI code
+│   └── package.json                      # Now depends on @mariozechner/agent
 │
-├── packages/web-ui/                      # [unchanged] Web UI keeps its own agent
 └── packages/tui/                         # [unchanged] Low-level terminal primitives
 ```

 ### Dependency Graph

 ```
-┌─────────────────────┐
-│   @mariozechner/    │
-│      pi-ai          │  ← Core streaming, tool interface
-└──────────┬──────────┘
-           │ depends on
-           ↓
-┌─────────────────────┐
-│  @mariozechner/     │
-│   coding-agent      │  ← Headless agent + file tools
-└──────────┬──────────┘
-           │ depends on
-           ↓
-    ┌──────────┬──────────┐
-    ↓          ↓          ↓
-┌────────┐ ┌───────┐ ┌────────┐
-│ TUI    │ │ VSCode│ │ Web UI │
-│ Client │ │  Ext  │ │ (own)  │
-└────────┘ └───────┘ └────────┘
+                    ┌─────────────────────┐
+                    │   @mariozechner/    │
+                    │      pi-ai          │  ← Core streaming, tool interface
+                    └──────────┬──────────┘
+                               │ depends on
+                               ↓
+                    ┌─────────────────────┐
+                    │  @mariozechner/     │
+                    │      agent          │  ← General agent (transports, state, attachments)
+                    └──────────┬──────────┘
+                               │ depends on
+                               ↓
+               ┌───────────────┴───────────────┐
+               ↓                               ↓
+    ┌─────────────────────┐         ┌─────────────────────┐
+    │  @mariozechner/     │         │  @mariozechner/     │
+    │   coding-agent      │         │     pi-web-ui       │
+    │ (lib + CLI + tools) │         │ (+ doc processing)  │
+    └─────────────────────┘         └─────────────────────┘
+```
+
+## Package: @mariozechner/agent
+
+### Core Types
+
+```typescript
+export interface Attachment {
+  id: string;
+  type: "image" | "document";
+  fileName: string;
+  mimeType: string;
+  size: number;
+  content: string;        // base64 encoded (without data URL prefix)
+  extractedText?: string; // For documents
+  preview?: string;       // base64 image preview
+}
+
+export type ThinkingLevel = "off" | "minimal" | "low" | "medium" | "high";
+
+// AppMessage abstraction - extends base LLM messages with app-specific features
+export type UserMessageWithAttachments = UserMessage & { attachments?: Attachment[] };
+
+// Extensible interface for custom app messages (via declaration merging)
+// Apps can add their own message types:
+// declare module "@mariozechner/agent" {
+//   interface CustomMessages {
+//     artifact: ArtifactMessage;
+//     notification: NotificationMessage;
+//   }
+// }
+export interface CustomMessages {
+  // Empty by default - apps extend via declaration merging
+}
+
+// AppMessage: Union of LLM messages + attachments + custom messages
+export type AppMessage =
+  | AssistantMessage
+  | UserMessageWithAttachments
+  | ToolResultMessage
+  | CustomMessages[keyof CustomMessages];
+
+export interface AgentState {
+  systemPrompt: string;
+  model: Model<any>;
+  thinkingLevel: ThinkingLevel;
+  tools: AgentTool<any>[];
+  messages: AppMessage[];      // Can include attachments + custom message types
+  isStreaming: boolean;
+  streamMessage: Message | null;
+  pendingToolCalls: Set<string>;
+  error?: string;
+}
+
+export type AgentEvent =
+  | { type: "state-update"; state: AgentState }
+  | { type: "started" }
+  | { type: "completed" };
+```
+
+### AppMessage Abstraction
+
+The `AppMessage` type is a key abstraction that extends base LLM messages with app-specific features while maintaining type safety and extensibility.
+
+**Key Benefits:**
+1. **Extends base messages** - Adds `attachments` field to `UserMessage` for file uploads
+2. **Type-safe extensibility** - Apps can add custom message types via declaration merging
+3. **Backward compatible** - Works seamlessly with base LLM messages from `@mariozechner/pi-ai`
+4. **Message transformation** - Filters app-specific fields before sending to LLM
+
+**Usage Example (Web UI):**
+```typescript
+import type { AppMessage } from "@mariozechner/agent";
+
+// Extend with custom message type for artifacts
+declare module "@mariozechner/agent" {
+  interface CustomMessages {
+    artifact: ArtifactMessage;
+  }
+}
+
+interface ArtifactMessage {
+  role: "artifact";
+  action: "create" | "update" | "delete";
+  filename: string;
+  content?: string;
+  title?: string;
+  timestamp: string;
+}
+
+// Now AppMessage includes: AssistantMessage | UserMessageWithAttachments | ToolResultMessage | ArtifactMessage
+const messages: AppMessage[] = [
+  { role: "user", content: "Hello", attachments: [attachment] },
+  { role: "assistant", content: [{ type: "text", text: "Hi!" }], /* ... */ },
+  { role: "artifact", action: "create", filename: "test.ts", content: "...", timestamp: "..." }
+];
+```
+
+**Usage Example (Coding Agent):**
+```typescript
+import type { AppMessage } from "@mariozechner/agent";
+
+// Coding agent can extend with session metadata
+declare module "@mariozechner/agent" {
+  interface CustomMessages {
+    session_metadata: SessionMetadataMessage;
+  }
+}
+
+interface SessionMetadataMessage {
+  role: "session_metadata";
+  sessionId: string;
+  timestamp: string;
+  workingDirectory: string;
+}
+```
+
+**Message Transformation:**
+
+The `messageTransformer` function converts app messages to LLM-compatible messages, including handling attachments:
+
+```typescript
+function defaultMessageTransformer(messages: AppMessage[]): Message[] {
+  return messages
+    .filter((m) => {
+      // Only keep standard LLM message roles
+      return m.role === "user" || m.role === "assistant" || m.role === "toolResult";
+    })
+    .map((m) => {
+      if (m.role === "user") {
+        const { attachments, ...baseMessage } = m as any;
+
+        // If no attachments, return as-is
+        if (!attachments || attachments.length === 0) {
+          return baseMessage as Message;
+        }
+
+        // Convert attachments to content blocks
+        const content = Array.isArray(baseMessage.content)
+          ? [...baseMessage.content]
+          : [{ type: "text", text: baseMessage.content }];
+
+        for (const attachment of attachments) {
+          // Add image blocks for image attachments
+          if (attachment.type === "image") {
+            content.push({
+              type: "image",
+              data: attachment.content,
+              mimeType: attachment.mimeType
+            });
+          }
+          // Add text blocks for documents with extracted text
+          else if (attachment.type === "document" && attachment.extractedText) {
+            content.push({
+              type: "text",
+              text: attachment.extractedText
+            });
+          }
+        }
+
+        return { ...baseMessage, content } as Message;
+      }
+      return m as Message;
+    });
+}
+```
+
+This ensures that:
+- Custom message types (like `artifact`, `session_metadata`) are filtered out
+- Image attachments are converted to `ImageContent` blocks
+- Document attachments with extracted text are converted to `TextContent` blocks
+- The `attachments` field itself is stripped (replaced by proper content blocks)
+- LLM receives only standard `Message` types from `@mariozechner/pi-ai`
+
+### Agent Class
+
+```typescript
+export interface AgentOptions {
+  initialState?: Partial<AgentState>;
+  transport: AgentTransport;
+  // Transform app messages to LLM-compatible messages before sending
+  messageTransformer?: (messages: AppMessage[]) => Message[] | Promise<Message[]>;
+}
+
+export class Agent {
+  constructor(opts: AgentOptions);
+
+  get state(): AgentState;
+  subscribe(fn: (e: AgentEvent) => void): () => void;
+
+  // State mutators
+  setSystemPrompt(v: string): void;
+  setModel(m: Model<any>): void;
+  setThinkingLevel(l: ThinkingLevel): void;
+  setTools(t: AgentTool<any>[]): void;
+  replaceMessages(ms: AppMessage[]): void;
+  appendMessage(m: AppMessage): void;
+  async queueMessage(m: AppMessage): Promise<void>;
+  clearMessages(): void;
+
+  // Main prompt method
+  async prompt(input: string, attachments?: Attachment[]): Promise<void>;
+
+  // Abort current operation
+  abort(): void;
+}
+```
+
+**Key Features:**
+1. **Reactive state** - Subscribe to state updates for UI binding
+2. **Transport abstraction** - Pluggable backends (direct API, proxy server, etc.)
+3. **Message transformation** - Convert app-specific messages to LLM format
+4. **Message queueing** - Inject messages during agent loop (for tool results, errors)
+5. **Attachment support** - Type-safe attachment handling (processing is external)
+6. **Abort support** - Cancel in-progress operations
+
+### Transport Interface
+
+```typescript
+export interface AgentRunConfig {
+  systemPrompt: string;
+  tools: AgentTool<any>[];
+  model: Model<any>;
+  reasoning?: "low" | "medium" | "high";
+  getQueuedMessages?: <T>() => Promise<QueuedMessage<T>[]>;
+}
+
+export interface AgentTransport {
+  run(
+    messages: Message[],
+    userMessage: Message,
+    config: AgentRunConfig,
+    signal?: AbortSignal,
+  ): AsyncIterable<AgentEvent>;
+}
+```
+
+### ProviderTransport
+
+```typescript
+export class ProviderTransport implements AgentTransport {
+  async *run(messages: Message[], userMessage: Message, cfg: AgentRunConfig, signal?: AbortSignal) {
+    // Calls LLM providers directly using agentLoop from @mariozechner/pi-ai
+    // Optionally routes through CORS proxy if configured
+  }
+}
+```
+
+### AppTransport
+
+```typescript
+export class AppTransport implements AgentTransport {
+  constructor(proxyUrl: string);
+
+  async *run(messages: Message[], userMessage: Message, cfg: AgentRunConfig, signal?: AbortSignal) {
+    // Routes requests through app server with user authentication
+    // Server manages API keys and usage tracking
+  }
+}
 ```

 ## Package: @mariozechner/coding-agent

-### Core Agent Class
+### CodingAgent Class

 ```typescript
-export interface CodingAgentConfig {
+export interface CodingAgentOptions {
  systemPrompt: string;
  model: Model<any>;
  reasoning?: "low" | "medium" | "high";
  apiKey: string;
-}
-
-export interface CodingAgentOptions {
-  config: CodingAgentConfig;
-  sessionManager?: SessionManager;
  workingDirectory?: string;
+  sessionManager?: SessionManager;
 }

 export class CodingAgent {
  constructor(options: CodingAgentOptions);

+  // Access underlying agent
+  get agent(): Agent;
+
+  // State accessors
+  get state(): AgentState;
+  subscribe(fn: (e: AgentEvent) => void): () => void;
+
  // Send a message to the agent
-  async prompt(message: string, signal?: AbortSignal): AsyncIterable<AgentEvent>;
+  async prompt(message: string, attachments?: Attachment[]): Promise<void>;

-  // Restore from session events (for --continue mode)
-  setMessages(messages: Message[]): void;
+  // Abort current operation
+  abort(): void;

-  // Get current message history
-  getMessages(): Message[];
+  // Message management for session restoration
+  replaceMessages(messages: AppMessage[]): void;
+  getMessages(): AppMessage[];
 }
 ```

 **Key design decisions:**
-1. **AsyncIterable instead of callbacks** - More flexible for consumers
-2. **Signal per prompt** - Each prompt() call accepts its own AbortSignal
-3. **No internal state management** - Consumers handle UI state
-4. **Simple message management** - Get/set for session restoration
+1. **Wraps @mariozechner/agent** - Builds on the general agent package
+2. **Pre-configured tools** - Includes read, bash, edit, write tools
+3. **Session management** - Optional JSONL-based session persistence
+4. **Working directory context** - All file operations relative to this directory
+5. **Simple API** - Hides transport complexity, uses ProviderTransport by default

 ### Usage Example (TUI)

 ```typescript
 import { CodingAgent } from "@mariozechner/coding-agent";
 import { SessionManager } from "@mariozechner/coding-agent";
+import { getModel } from "@mariozechner/pi-ai";

 const session = new SessionManager({ continue: true });
 const agent = new CodingAgent({
-  config: {
-    systemPrompt: "You are a coding assistant...",
-    model: getModel("openai", "gpt-4"),
-    apiKey: process.env.OPENAI_API_KEY!,
-  },
-  sessionManager: session,
+  systemPrompt: "You are a coding assistant...",
+  model: getModel("openai", "gpt-4"),
+  apiKey: process.env.OPENAI_API_KEY!,
  workingDirectory: process.cwd(),
+  sessionManager: session,
 });

 // Restore previous session
 if (session.hasData()) {
-  agent.setMessages(session.getMessages());
+  agent.replaceMessages(session.getMessages());
 }

-// Send prompt with abort support
-const controller = new AbortController();
-for await (const event of agent.prompt("Fix the bug in server.ts", controller.signal)) {
-  switch (event.type) {
-    case "message_update":
-      renderer.updateAssistant(event.message);
-      break;
-    case "tool_execution_start":
-      renderer.showTool(event.toolName, event.args);
-      break;
-    case "tool_execution_end":
-      renderer.showToolResult(event.toolName, event.result);
-      break;
+// Subscribe to state changes
+agent.subscribe((event) => {
+  if (event.type === "state-update") {
+    renderer.render(event.state);
+  } else if (event.type === "completed") {
+    session.save(agent.getMessages());
  }
-}
+});
+
+// Send prompt
+await agent.prompt("Fix the bug in server.ts");
+```
+
+### Usage Example (Web UI)
+
+```typescript
+import { Agent, ProviderTransport, Attachment } from "@mariozechner/agent";
+import { getModel } from "@mariozechner/pi-ai";
+import { loadAttachment } from "./utils/attachment-utils"; // Web UI keeps this
+
+const agent = new Agent({
+  transport: new ProviderTransport(),
+  initialState: {
+    systemPrompt: "You are a helpful assistant...",
+    model: getModel("google", "gemini-2.5-flash"),
+    thinkingLevel: "low",
+    tools: [],
+  },
+});
+
+// Subscribe to state changes for UI updates
+agent.subscribe((event) => {
+  if (event.type === "state-update") {
+    updateUI(event.state);
+  }
+});
+
+// Handle file upload and send prompt
+const file = await fileInput.files[0];
+const attachment = await loadAttachment(file); // Processes PDF/DOCX/etc
+await agent.prompt("Analyze this document", [attachment]);
 ```

 ### Session Manager
@ -300,8 +605,7 @@ export interface SessionMetadata {

 export interface SessionData {
  metadata: SessionMetadata;
-  messages: Message[];          // Conversation history
-  totalUsage: TokenUsage;       // Aggregated token usage
+  messages: AppMessage[];       // Conversation history
 }

 export class SessionManager {
@ -310,8 +614,8 @@ export class SessionManager {
  // Start a new session (writes metadata)
  startSession(config: CodingAgentConfig): void;

-  // Log an event (appends to JSONL)
-  appendEvent(event: AgentEvent): void;
+  // Append a message to the session (appends to JSONL)
+  appendMessage(message: AppMessage): void;

  // Check if session has existing data
  hasData(): boolean;
@ -320,7 +624,7 @@ export class SessionManager {
  getData(): SessionData | null;

  // Get just the messages for agent restoration
-  getMessages(): Message[];
+  getMessages(): AppMessage[];

  // Get session file path
  getFilePath(): string;
@ -332,12 +636,19 @@ export class SessionManager {

 **Session Storage Format (JSONL):**
 ```jsonl
-{"type":"session","id":"uuid","timestamp":"2025-10-12T10:00:00Z","cwd":"/path","config":{...}}
-{"type":"event","timestamp":"2025-10-12T10:00:01Z","event":{"type":"turn_start"}}
-{"type":"event","timestamp":"2025-10-12T10:00:02Z","event":{"type":"message_start",...}}
-{"type":"event","timestamp":"2025-10-12T10:00:03Z","event":{"type":"message_end",...}}
+{"type":"metadata","id":"uuid","timestamp":"2025-10-12T10:00:00Z","cwd":"/path","config":{...}}
+{"type":"message","message":{"role":"user","content":"Fix the bug in server.ts"}}
+{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"I'll help..."}],...}}
+{"type":"message","message":{"role":"toolResult","toolCallId":"call_123","output":"..."}}
+{"type":"message","message":{"role":"assistant","content":[{"type":"text","text":"Fixed!"}],...}}
 ```

+**How it works:**
+- First line is session metadata (id, timestamp, working directory, config)
+- Each subsequent line is an `AppMessage` from `agent.state.messages`
+- Messages are appended as they're added to the agent state (append-only)
+- On session restore, read all message lines to reconstruct conversation history
+
 **Session File Naming:**
 ```
 ~/.pi/sessions/--path-to-project--/
@ -643,9 +954,11 @@ export class WriteTool implements AgentTool<typeof writeToolSchema, WriteToolDet
 }
 ```

-## Package: @mariozechner/coding-agent-tui
+## CLI Interface (included in @mariozechner/coding-agent)

-### CLI Interface
+The coding-agent package includes both a library and a CLI interface in one package.
+
+### CLI Usage

 ```bash
 # Interactive mode (default)
@ -667,7 +980,7 @@ coding-agent --model openai/gpt-4 --api-key $KEY
 coding-agent --json < prompts.jsonl > results.jsonl
 ```

-### Arguments
+### CLI Arguments

 ```typescript
 {
@ -818,33 +1131,58 @@ app.listen(3000);

 ## Migration Plan

-### Phase 1: Extract Core Package
+### Phase 1: Create General Agent Package
+1. Create `packages/agent/` structure
+2. **COPY** Agent class from web-ui/src/agent/agent.ts (don't extract yet)
+3. Copy types (AgentState, AgentEvent, Attachment, DebugLogEntry, ThinkingLevel)
+4. Copy transports (types.ts, ProviderTransport.ts, AppTransport.ts, proxy-types.ts)
+5. Adapt code to work as standalone package
+6. Write unit tests for Agent class
+7. Write tests for both transports
+8. Publish `@mariozechner/agent@0.1.0`
+9. **Keep web-ui unchanged** - it continues using its embedded agent
+
+### Phase 2: Create Coding Agent Package (with CLI)
 1. Create `packages/coding-agent/` structure
 2. Port SessionManager from old agent package
-3. Implement BashTool, EditTool, WriteTool
-4. Implement CodingAgent class using pi-ai/agentLoop
-5. Write tests for each tool
-6. Write integration tests
+3. Implement ReadTool, BashTool, EditTool, WriteTool
+4. Implement CodingAgent class (wraps @mariozechner/agent)
+5. Implement CLI in `src/cli/` directory:
+   - CLI entry point (index.ts)
+   - TuiRenderer, ConsoleRenderer, JsonRenderer
+   - Argument parsing
+   - Interactive and single-shot modes
+6. Write tests for tools and agent
+7. Write integration tests for CLI
+8. Publish `@mariozechner/coding-agent@0.1.0` (includes library + CLI binary)

-### Phase 2: Build TUI
-1. Create `packages/coding-agent-tui/`
-2. Port TuiRenderer from old agent package
-3. Port ConsoleRenderer, JsonRenderer
-4. Implement CLI argument parsing
-5. Implement interactive and single-shot modes
-6. Test session resume functionality
+### Phase 3: Prove Out New Packages
+1. Use coding-agent (library + CLI) extensively
+2. Fix bugs and iterate on API design
+3. Gather feedback from real usage
+4. Ensure stability and performance

-### Phase 3: Update Dependencies
-1. Update web-ui if needed (should be unaffected)
-2. Deprecate old agent package
-3. Update documentation
-4. Update examples
+### Phase 4: Migrate Web UI (OPTIONAL, later)
+1. Once new `@mariozechner/agent` is proven stable
+2. Update web-ui package.json to depend on `@mariozechner/agent`
+3. Remove src/agent/agent.ts, src/agent/types.ts, src/agent/transports/
+4. Keep src/utils/attachment-utils.ts (document processing)
+5. Update imports to use `@mariozechner/agent`
+6. Test that web UI still works correctly
+7. Verify document attachments (PDF, DOCX, etc.) still work

-### Phase 4: Future Enhancements
-1. Build VS Code extension
-2. Add more tools (grep, find, etc.) as optional
+### Phase 5: Cleanup
+1. Deprecate/remove old `packages/agent/` package
+2. Update all documentation
+3. Create migration guide
+4. Add examples for all use cases
+
+### Phase 6: Future Enhancements
+1. Build VS Code extension using `@mariozechner/coding-agent`
+2. Add more tools (grep, find, glob, etc.) as optional plugins
 3. Plugin system for custom tools
 4. Parallel tool execution
+5. Streaming tool output for long-running commands

 ## Open Questions & Decisions

@ -975,14 +1313,29 @@ Error: Cannot read binary file 'dist/app.js'. Use bash tool if you need to inspe

 This architecture provides:

-✅ **Headless core** - Clean separation between agent logic and UI
-✅ **Reusable** - Same agent for TUI, VS Code, web, APIs
-✅ **Composable** - Build on pi-ai primitives
-✅ **Abortable** - First-class cancellation support
-✅ **Session persistence** - Resume conversations seamlessly
+### General Agent Package (`@mariozechner/agent`)
+✅ **Transport abstraction** - Pluggable backends (ProviderTransport, AppTransport)
+✅ **Reactive state** - Subscribe/emit pattern for UI binding
+✅ **Message transformation** - Flexible pipeline for message filtering/adaptation
+✅ **Message queueing** - Out-of-band message injection during agent loop
+✅ **Attachment support** - Type-safe attachment handling (processing is external)
+✅ **Abort support** - First-class cancellation with AbortController
+✅ **Provider agnostic** - Works with any LLM provider via @mariozechner/pi-ai
+✅ **Type-safe** - Full TypeScript with proper types
+
+### Coding Agent Package (`@mariozechner/coding-agent`)
+✅ **Builds on general agent** - Leverages transport abstraction and state management
+✅ **Session persistence** - JSONL-based session storage and resume
 ✅ **Focused tools** - read, bash, edit, write (4 tools, no more)
-✅ **Smart pagination** - 5000-line chunks with offset/limit
-✅ **Type-safe** - Full TypeScript with schema validation
+✅ **Smart pagination** - 5000-line chunks with offset/limit for ReadTool
+✅ **Working directory context** - All tools operate relative to project root
+✅ **Simple API** - Hides complexity, easy to use
 ✅ **Testable** - Pure functions, mockable dependencies

-The key insight is to **keep web-ui's agent separate** (it has different concerns) while creating a **new focused coding agent** for file manipulation workflows that can be shared across non-web interfaces.
+### Key Architectural Insights
+1. **Extract, don't rewrite** - The web-ui agent is well-designed; extract it into a general package
+2. **Separation of concerns** - Document processing (PDF/DOCX/etc.) stays in web-ui, only type definitions move to general agent
+3. **Layered architecture** - pi-ai → agent → coding-agent → coding-agent-tui
+4. **Reusable across UIs** - Web UI and coding agent both use the same general agent package
+5. **Pluggable transports** - Easy to add new backends (local API, proxy server, etc.)
+6. **Attachment flexibility** - Type is defined centrally, processing is done by consumers