mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-17 07:03:25 +00:00
Agent package + coding agent WIP, refactored web-ui prompts
This commit is contained in:
parent
4e7a340460
commit
ffc9be8867
58 changed files with 5138 additions and 2206 deletions
# pi-agent

A general-purpose agent with tool calling and session persistence, modeled after Claude Code but extremely hackable and minimal. It comes with a built-in TUI (also modeled after Claude Code) for interactive use.

Everything is designed to be easy:

- Writing custom UIs on top of it (via JSON mode in any language or the TypeScript API)
- Using it for inference steps in deterministic programs (via JSON mode in any language or the TypeScript API)
- Providing your own system prompts and tools
- Working with various LLM providers or self-hosted LLMs

## Installation

```bash
npm install -g @mariozechner/pi-agent
```

This installs the `pi-agent` command globally.

## Quick Start

By default, pi-agent uses OpenAI's API with model `gpt-5-mini` and authenticates using the `OPENAI_API_KEY` environment variable. Any OpenAI-compatible endpoint works, including Ollama, vLLM, OpenRouter, Groq, Anthropic, etc.

```bash
# Single message
pi-agent "What is 2+2?"

# Multiple messages processed sequentially
pi-agent "What is 2+2?" "What about 3+3?"

# Interactive chat mode (no messages = interactive)
pi-agent

# Continue most recently modified session in current directory
pi-agent --continue "Follow up question"

# GPT-OSS via Groq
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY --model openai/gpt-oss-120b

# GLM 4.5 via OpenRouter
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY --model z-ai/glm-4.5

# Claude via Anthropic's OpenAI compatibility layer. See: https://docs.anthropic.com/en/api/openai-sdk
pi-agent --base-url https://api.anthropic.com/v1 --api-key $ANTHROPIC_API_KEY --model claude-opus-4-1-20250805

# Gemini via Google AI
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ --api-key $GEMINI_API_KEY --model gemini-2.5-flash
```

## Usage Modes

### Single-Shot Mode
Process one or more messages and exit:
```bash
pi-agent "First question" "Second question"
```

### Interactive Mode
Start an interactive chat session:
```bash
pi-agent
```
- Type messages and press Enter to send
- Type `exit` or `quit` to end the session
- Press Escape to interrupt while processing
- Press CTRL+C to clear the text editor
- Press CTRL+C twice quickly to exit

### JSON Mode
JSON mode enables programmatic integration by outputting events as JSONL (JSON Lines).

**Single-shot mode:** Outputs a stream of JSON events for each message, then exits.
```bash
pi-agent --json "What is 2+2?" "And the meaning of life?"
# Outputs: {"type":"session_start","sessionId":"bb6f0acb-80cf-4729-9593-bcf804431a53","model":"gpt-5-mini","api":"completions","baseURL":"https://api.openai.com/v1","systemPrompt":"You are a helpful assistant."} {"type":"user_message","text":"What is 2+2?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":314,"outputTokens":16,"totalTokens":330,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"2 + 2 = 4"} {"type":"user_message","text":"And the meaning of life?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":337,"outputTokens":331,"totalTokens":668,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"Short answer (pop-culture): 42.\n\nMore useful answers:\n- Philosophical...
```

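The JSONL stream is straightforward to post-process in any language. As a minimal sketch, here is how you might total up token usage from a captured run; the sample lines mirror the example output above:

```typescript
// Sketch: tally token_usage events from a captured `pi-agent --json` run.
// The sample JSONL below mirrors the example output shown above.
const jsonl = [
  '{"type":"user_message","text":"What is 2+2?"}',
  '{"type":"token_usage","inputTokens":314,"outputTokens":16,"totalTokens":330}',
  '{"type":"assistant_message","text":"2 + 2 = 4"}',
].join('\n');

function totalTokens(output: string): number {
  return output
    .split('\n')                              // one JSON object per line
    .map((line) => JSON.parse(line))
    .filter((ev) => ev.type === 'token_usage')
    .reduce((sum, ev) => sum + ev.totalTokens, 0);
}

console.log(totalTokens(jsonl)); // 330
```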
**Interactive mode:** Accepts JSON commands via stdin and outputs JSON events to stdout.
```bash
# Start interactive JSON mode
pi-agent --json
# Now send commands via stdin

# Pipe one or more initial messages in
(echo '{"type": "message", "content": "What is 2+2?"}'; cat) | pi-agent --json
# Outputs: {"type":"session_start","sessionId":"bb64cfbe-dd52-4662-bd4a-0d921c332fd1","model":"gpt-5-mini","api":"completions","baseURL":"https://api.openai.com/v1","systemPrompt":"You are a helpful assistant."} {"type":"user_message","text":"What is 2+2?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":314,"outputTokens":16,"totalTokens":330,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"2 + 2 = 4"}
```

Commands you can send via stdin in interactive JSON mode:
```jsonc
{"type": "message", "content": "Your message here"} // Send a message to the agent
{"type": "interrupt"}                               // Interrupt current processing
```

## Configuration

### Command Line Options
```
--base-url <url>        API base URL (default: https://api.openai.com/v1)
--api-key <key>         API key (or set OPENAI_API_KEY env var)
--model <model>         Model name (default: gpt-5-mini)
--api <type>            API type: "completions" or "responses" (default: completions)
--system-prompt <text>  System prompt (default: "You are a helpful assistant.")
--continue              Continue previous session
--json                  JSON mode
--help, -h              Show help message
```

## Session Persistence

Sessions are automatically saved to `~/.pi/sessions/` and include:
- Complete conversation history
- Tool call results
- Token usage statistics

Use `--continue` to resume the last session:
```bash
pi-agent "Start a story about a robot"
# ... later ...
pi-agent --continue "Continue the story"
```

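For tooling around sessions, the "most recently modified" rule that `--continue` uses can be reproduced in a few lines. This is a sketch under the assumption that each session is stored as its own file under `~/.pi/sessions/`; the on-disk file format itself is internal to pi-agent and not documented here:

```typescript
// Sketch: pick the most recently modified session file, mirroring what
// `--continue` resumes. Assumes one file per session under ~/.pi/sessions/.
import { readdirSync, statSync } from 'fs';
import { join } from 'path';
import { homedir } from 'os';

function latestSession(dir = join(homedir(), '.pi', 'sessions')): string | undefined {
  return readdirSync(dir)
    .map((name) => ({ name, mtime: statSync(join(dir, name)).mtimeMs }))
    .sort((a, b) => b.mtime - a.mtime)[0]?.name; // newest first
}
```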
## Tools

The agent includes built-in tools for file system operations:
- **read_file** - Read file contents
- **list_directory** - List directory contents
- **bash** - Execute shell commands
- **glob** - Find files by pattern
- **ripgrep** - Search file contents

These tools are automatically available to the agent, e.g. when using the `pi-agent` command for code navigation tasks.

## JSON Mode Events

When using `--json`, the agent outputs these event types:
- `session_start` - New session started with metadata
- `user_message` - User input
- `assistant_start` - Assistant begins responding
- `assistant_message` - Assistant's response
- `reasoning` - Reasoning/thinking (for models that support it)
- `tool_call` - Tool being called
- `tool_result` - Result from tool
- `token_usage` - Token usage statistics (includes `reasoningTokens` for models with reasoning)
- `error` - Error occurred
- `interrupted` - Processing was interrupted

The complete TypeScript type definition for `AgentEvent` can be found in [`src/agent.ts`](src/agent.ts#L6).

## Build an Interactive UI with JSON Mode
Build custom UIs in any language by spawning pi-agent in JSON mode and communicating via stdin/stdout.

```javascript
import { spawn } from 'child_process';
import { createInterface } from 'readline';

// Start the agent in JSON mode
const agent = spawn('pi-agent', ['--json']);

// Create readline interface for parsing JSONL output from agent
const agentOutput = createInterface({ input: agent.stdout, crlfDelay: Infinity });

// Create readline interface for user input
const userInput = createInterface({ input: process.stdin, output: process.stdout });

// State tracking
let isProcessing = false, lastUsage, isExiting = false;

// Handle each line of JSON output from agent
agentOutput.on('line', (line) => {
  try {
    const event = JSON.parse(line);

    // Handle all event types
    switch (event.type) {
      case 'session_start':
        console.log(`Session started (${event.model}, ${event.api}, ${event.baseURL})`);
        console.log('Press CTRL + C to exit');
        promptUser();
        break;

      case 'user_message':
        // Already shown in prompt, skip
        break;

      case 'assistant_start':
        isProcessing = true;
        console.log('\n[assistant]');
        break;

      case 'reasoning':
        console.log(`[thinking]\n${event.text}\n`);
        break;

      case 'tool_call':
        console.log(`[tool] ${event.name}(${event.args.substring(0, 50)})\n`);
        break;

      case 'tool_result': {
        const lines = event.result.split('\n');
        const truncated = lines.length > 5 ? `\n... (${lines.length - 5} more lines truncated)` : '';
        console.log(`[tool result]\n${lines.slice(0, 5).join('\n')}${truncated}\n`);
        break;
      }

      case 'assistant_message':
        console.log(event.text.trim());
        isProcessing = false;
        promptUser();
        break;

      case 'token_usage':
        lastUsage = event;
        break;

      case 'error':
        console.error('\n❌ Error:', event.message);
        isProcessing = false;
        promptUser();
        break;

      case 'interrupted':
        console.log('\n⚠️ Interrupted by user');
        isProcessing = false;
        promptUser();
        break;
    }
  } catch (e) {
    console.error('Failed to parse JSON:', line, e);
  }
});

// Send a message to the agent
function sendMessage(content) {
  agent.stdin.write(`${JSON.stringify({ type: 'message', content: content })}\n`);
}

// Send interrupt signal
function interrupt() {
  agent.stdin.write(`${JSON.stringify({ type: 'interrupt' })}\n`);
}

// Prompt for user input
function promptUser() {
  if (isExiting) return;

  if (lastUsage) {
    console.log(`\nin: ${lastUsage.inputTokens}, out: ${lastUsage.outputTokens}, cache read: ${lastUsage.cacheReadTokens}, cache write: ${lastUsage.cacheWriteTokens}`);
  }

  userInput.question('\n[user]\n> ', (answer) => {
    answer = answer.trim();
    if (answer) {
      sendMessage(answer);
    } else {
      promptUser();
    }
  });
}

// Handle Ctrl+C
process.on('SIGINT', () => {
  if (isProcessing) {
    interrupt();
  } else {
    agent.kill();
    process.exit(0);
  }
});

// Handle agent exit
agent.on('close', (code) => {
  isExiting = true;
  userInput.close();
  console.log(`\nAgent exited with code ${code}`);
  process.exit(code);
});

// Handle errors
agent.on('error', (err) => {
  console.error('Failed to start agent:', err);
  process.exit(1);
});

// Start the conversation
console.log('Pi Agent Interactive Chat');
```

### Usage Examples

```bash
# OpenAI o1/o3 - see thinking content with Responses API
pi-agent --api responses --model o1-mini "Explain quantum computing"

# Groq gpt-oss - reasoning with Chat Completions
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY \
  --model openai/gpt-oss-120b "Complex math problem"

# Gemini 2.5 - thinking content automatically configured
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ \
  --api-key $GEMINI_API_KEY --model gemini-2.5-flash "Think step by step"

# OpenRouter - supports various reasoning models
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY \
  --model "qwen/qwen3-235b-a22b-thinking-2507" "Complex reasoning task"
```

### Reasoning Events in JSON Mode

When reasoning is active, you'll see:
- `reasoning` events with thinking text (when available)
- `token_usage` events with a `reasoningTokens` field
- Reasoning tokens shown with a ⚡ symbol in the console and TUI renderers

## Reasoning

Pi-agent supports reasoning/thinking tokens for models that provide this capability.

### Supported Providers

| Provider | Model | API | Thinking Content | Notes |
|----------|-------|-----|------------------|-------|
| **OpenAI** | o1, o3 | Responses | ✅ Full | Thinking events + token counts |
| | o1, o3 | Chat Completions | ❌ | Token counts only |
| | gpt-5 | Both APIs | ❌ | Model limitation (empty summaries) |
| **Groq** | gpt-oss | Chat Completions | ✅ Full | Via `reasoning_format: "parsed"` |
| | gpt-oss | Responses | ❌ | API doesn't support `reasoning.summary` |
| **Gemini** | 2.5 models | Chat Completions | ✅ Full | Auto-configured via `extra_body` |
| **Anthropic** | Claude | OpenAI Compat | ❌ | Use native API for thinking |
| **OpenRouter** | Various | Both APIs | Varies | Depends on underlying model |

### Technical Details

The agent automatically:
- Detects provider from base URL
- Tests model reasoning support on first use (cached)
- Adjusts request parameters per provider:
  - OpenAI: `reasoning_effort` (minimal/low)
  - Groq: `reasoning_format: "parsed"`
  - Gemini: `extra_body.google.thinking_config`
  - OpenRouter: `reasoning` object with `effort` field
- Parses provider-specific response formats:
  - Gemini: Extracts from `<thought>` tags
  - Groq: Uses `message.reasoning` field
  - OpenRouter: Uses `message.reasoning` field
  - OpenAI: Uses standard `reasoning` events

## Architecture

The agent is built with:
- **agent.ts** - Core Agent class and API functions
- **cli.ts** - CLI entry point, argument parsing, and JSON mode handler
- **args.ts** - Custom typed argument parser
- **session-manager.ts** - Session persistence
- **tools/** - Tool implementations
- **renderers/** - Output formatters (console, TUI, JSON)

## Development

### Running from Source

```bash
# Run directly with npx tsx - no build needed
npx tsx src/cli.ts "What is 2+2?"

# Interactive TUI mode
npx tsx src/cli.ts

# JSON mode for programmatic use
echo '{"type":"message","content":"list files"}' | npx tsx src/cli.ts --json
```

### Testing

The agent supports three testing modes:

#### 1. Test UI/Renderers (non-interactive mode)
```bash
# Test console renderer output and metrics
npx tsx src/cli.ts "list files in /tmp" 2>&1 | tail -5
# Verify: ↑609 ↓610 ⚒ 1 (tokens and tool count)

# Test TUI renderer (with stdin)
echo "list files" | npx tsx src/cli.ts 2>&1 | grep "⚒"
```

#### 2. Test Model Behavior (JSON mode)
```bash
# Extract metrics for model comparison
echo '{"type":"message","content":"write fibonacci in Python"}' | \
  npx tsx src/cli.ts --json --model gpt-4o-mini 2>&1 | \
  jq -s '[.[] | select(.type=="token_usage")] | last'

# Compare models: tokens used, tool calls made, quality
for model in "gpt-4o-mini" "gpt-4o"; do
  echo "Testing $model:"
  echo '{"type":"message","content":"fix syntax errors in: prnt(hello)"}' | \
    npx tsx src/cli.ts --json --model $model 2>&1 | \
    jq -r 'select(.type=="token_usage" or .type=="tool_call" or .type=="assistant_message")'
done
```

#### 3. LLM-as-Judge Testing
```bash
# Capture output from different models and evaluate with another LLM
TASK="write a Python function to check if a number is prime"

# Get response from model A
RESPONSE_A=$(echo "{\"type\":\"message\",\"content\":\"$TASK\"}" | \
  npx tsx src/cli.ts --json --model gpt-4o-mini 2>&1 | \
  jq -r 'select(.type=="assistant_message") | .text')

# Judge the response
echo "{\"type\":\"message\",\"content\":\"Rate this code (1-10): $RESPONSE_A\"}" | \
  npx tsx src/cli.ts --json --model gpt-4o 2>&1 | \
  jq -r 'select(.type=="assistant_message") | .text'
```

## Use as a Library

```typescript
import { Agent, ConsoleRenderer } from '@mariozechner/pi-agent';

const agent = new Agent({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: 'https://api.openai.com/v1',
  model: 'gpt-5-mini',
  api: 'completions',
  systemPrompt: 'You are a helpful assistant.'
}, new ConsoleRenderer());

await agent.ask('What is 2+2?');
```