co-mono/packages/agent
2025-09-09 15:00:32 +02:00
..
src fix: Remove unused imports and add biome-ignore for false positive 2025-08-16 19:21:43 +02:00
package-lock.json chore: bump version to 0.5.32 2025-09-09 15:00:32 +02:00
package.json chore: bump version to 0.5.32 2025-09-09 15:00:32 +02:00
README.md docs: Improve reasoning support table clarity 2025-08-10 02:13:13 +02:00
tsconfig.build.json Initial monorepo setup with npm workspaces and dual TypeScript configuration 2025-08-09 17:18:38 +02:00

pi-agent

A general-purpose agent with tool calling and session persistence, modeled after Claude Code but extremely hackable and minimal. It comes with a built-in TUI (also modeled after Claude Code) for interactive use.

Everything is designed to be easy:

  • Writing custom UIs on top of it (via JSON mode in any language or the TypeScript API)
  • Using it for inference steps in deterministic programs (via JSON mode in any language or the TypeScript API)
  • Providing your own system prompts and tools
  • Working with various LLM providers or self-hosted LLMs

Installation

npm install -g @mariozechner/pi-agent

This installs the pi-agent command globally.

Quick Start

By default, pi-agent uses OpenAI's API with model gpt-5-mini and authenticates using the OPENAI_API_KEY environment variable. Any OpenAI-compatible endpoint works, including Ollama, vLLM, OpenRouter, Groq, Anthropic, etc.

# Single message
pi-agent "What is 2+2?"

# Multiple messages processed sequentially
pi-agent "What is 2+2?" "What about 3+3?"

# Interactive chat mode (no messages = interactive)
pi-agent

# Continue most recently modified session in current directory
pi-agent --continue "Follow up question"

# GPT-OSS via Groq
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY --model openai/gpt-oss-120b

# GLM 4.5 via OpenRouter
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY --model z-ai/glm-4.5

# Claude via Anthropic's OpenAI compatibility layer. See: https://docs.anthropic.com/en/api/openai-sdk
pi-agent --base-url https://api.anthropic.com/v1 --api-key $ANTHROPIC_API_KEY --model claude-opus-4-1-20250805

# Gemini via Google AI
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ --api-key $GEMINI_API_KEY --model gemini-2.5-flash

Usage Modes

Single-Shot Mode

Process one or more messages and exit:

pi-agent "First question" "Second question"

Interactive Mode

Start an interactive chat session:

pi-agent
  • Type messages and press Enter to send
  • Type exit or quit to end session
  • Press Escape to interrupt while processing
  • Press CTRL+C to clear the text editor
  • Press CTRL+C twice quickly to exit

JSON Mode

JSON mode enables programmatic integration by outputting events as JSONL (JSON Lines).

Single-shot mode: Outputs a stream of JSON events for each message, then exits.

pi-agent --json "What is 2+2?" "And the meaning of life?"
# Outputs: {"type":"session_start","sessionId":"bb6f0acb-80cf-4729-9593-bcf804431a53","model":"gpt-5-mini","api":"completions","baseURL":"https://api.openai.com/v1","systemPrompt":"You are a helpful assistant."} {"type":"user_message","text":"What is 2+2?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":314,"outputTokens":16,"totalTokens":330,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"2 + 2 = 4"} {"type":"user_message","text":"And the meaning of life?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":337,"outputTokens":331,"totalTokens":668,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"Short answer (pop-culture): 42.\n\nMore useful answers:\n- Philosophical...

Interactive mode: Accepts JSON commands via stdin and outputs JSON events to stdout.

# Start interactive JSON mode
pi-agent --json
# Now send commands via stdin

# Pipe one or more initial messages in
(echo '{"type": "message", "content": "What is 2+2?"}'; cat) | pi-agent --json
# Outputs: {"type":"session_start","sessionId":"bb64cfbe-dd52-4662-bd4a-0d921c332fd1","model":"gpt-5-mini","api":"completions","baseURL":"https://api.openai.com/v1","systemPrompt":"You are a helpful assistant."} {"type":"user_message","text":"What is 2+2?"} {"type":"assistant_start"} {"type":"token_usage","inputTokens":314,"outputTokens":16,"totalTokens":330,"cacheReadTokens":0,"cacheWriteTokens":0} {"type":"assistant_message","text":"2 + 2 = 4"}

Commands you can send via stdin in interactive JSON mode:

{"type": "message", "content": "Your message here"}  // Send a message to the agent
{"type": "interrupt"}                                 // Interrupt current processing

Configuration

Command Line Options

--base-url <url>        API base URL (default: https://api.openai.com/v1)
--api-key <key>         API key (or set OPENAI_API_KEY env var)
--model <model>         Model name (default: gpt-4o-mini)
--api <type>            API type: "completions" or "responses" (default: completions)
--system-prompt <text>  System prompt (default: "You are a helpful assistant.")
--continue              Continue previous session
--json                  JSON mode
--help, -h              Show help message

Session Persistence

Sessions are automatically saved to ~/.pi/sessions/ and include:

  • Complete conversation history
  • Tool call results
  • Token usage statistics

Use --continue to resume the last session:

pi-agent "Start a story about a robot"
# ... later ...
pi-agent --continue "Continue the story"

Tools

The agent includes built-in tools for file system operations:

  • read_file - Read file contents
  • list_directory - List directory contents
  • bash - Execute shell commands
  • glob - Find files by pattern
  • ripgrep - Search file contents

These tools are automatically available when using the agent through the pi command for code navigation tasks.

JSON Mode Events

When using --json, the agent outputs these event types:

  • session_start - New session started with metadata
  • user_message - User input
  • assistant_start - Assistant begins responding
  • assistant_message - Assistant's response
  • reasoning - Reasoning/thinking (for models that support it)
  • tool_call - Tool being called
  • tool_result - Result from tool
  • token_usage - Token usage statistics (includes reasoningTokens for models with reasoning)
  • error - Error occurred
  • interrupted - Processing was interrupted

The complete TypeScript type definition for AgentEvent can be found in src/agent.ts.

Build an Interactive UI with JSON Mode

Build custom UIs in any language by spawning pi-agent in JSON mode and communicating via stdin/stdout.

import { spawn } from 'child_process';
import { createInterface } from 'readline';

// Start the agent in JSON mode
const agent = spawn('pi-agent', ['--json']);

// Create readline interface for parsing JSONL output from agent
const agentOutput = createInterface({input: agent.stdout, crlfDelay: Infinity});

// Create readline interface for user input
const userInput = createInterface({input: process.stdin, output: process.stdout});

// State tracking
let isProcessing = false, lastUsage, isExiting = false;

// Handle each line of JSON output from agent
agentOutput.on('line', (line) => {
    try {
      const event = JSON.parse(line);

      // Handle all event types
      switch (event.type) {
        case 'session_start':
          console.log(`Session started (${event.model}, ${event.api}, ${event.baseURL})`);
          console.log('Press CTRL + C to exit');
          promptUser();
          break;

        case 'user_message':
          // Already shown in prompt, skip
          break;

        case 'assistant_start':
          isProcessing = true;
          console.log('\n[assistant]');
          break;

        case 'thinking':
          console.log(`[thinking]\n${event.text}\n`);
          break;

        case 'tool_call':
          console.log(`[tool] ${event.name}(${event.args.substring(0, 50)})\n`);
          break;

        case 'tool_result':
            const lines = event.result.split('\n');
            const truncated = lines.length - 5 > 0 ? `\n.  ... (${lines.length - 5} more lines truncated)` : '';
            console.log(`[tool result]\n${lines.slice(0, 5).join('\n')}${truncated}\n`);
          break;

        case 'assistant_message':
          console.log(event.text.trim());
          isProcessing = false;
          promptUser();
          break;

        case 'token_usage':
          lastUsage = event;
          break;

        case 'error':
          console.error('\n❌ Error:', event.message);
          isProcessing = false;
          promptUser();
          break;

        case 'interrupted':
          console.log('\n⚠  Interrupted by user');
          isProcessing = false;
          promptUser();
          break;
      }
    } catch (e) {
      console.error('Failed to parse JSON:', line, e);
    }
});

// Send a message to the agent
function sendMessage(content) {
  agent.stdin.write(`${JSON.stringify({type: 'message', content: content})}\n`);
}

// Send interrupt signal
function interrupt() {
  agent.stdin.write(`${JSON.stringify({type: 'interrupt'})}\n`);
}

// Prompt for user input
function promptUser() {
  if (isExiting) return;

  if (lastUsage) {
    console.log(`\nin: ${lastUsage.inputTokens}, out: ${lastUsage.outputTokens}, cache read: ${lastUsage.cacheReadTokens}, cache write: ${lastUsage.cacheWriteTokens}`);
  }

  userInput.question('\n[user]\n> ', (answer) => {
    answer = answer.trim();
    if (answer) {
      sendMessage(answer);
    } else {
      promptUser();
    }
  });
}

// Handle Ctrl+C
process.on('SIGINT', () => {
  if (isProcessing) {
    interrupt();
  } else {
    agent.kill();
    process.exit(0);
  }
});

// Handle agent exit
agent.on('close', (code) => {
  isExiting = true;
  userInput.close();
  console.log(`\nAgent exited with code ${code}`);
  process.exit(code);
});

// Handle errors
agent.on('error', (err) => {
  console.error('Failed to start agent:', err);
  process.exit(1);
});

// Start the conversation
console.log('Pi Agent Interactive Chat');

Usage Examples

# OpenAI o1/o3 - see thinking content with Responses API
pi-agent --api responses --model o1-mini "Explain quantum computing"

# Groq gpt-oss - reasoning with Chat Completions
pi-agent --base-url https://api.groq.com/openai/v1 --api-key $GROQ_API_KEY \
  --model openai/gpt-oss-120b "Complex math problem"

# Gemini 2.5 - thinking content automatically configured
pi-agent --base-url https://generativelanguage.googleapis.com/v1beta/openai/ \
  --api-key $GEMINI_API_KEY --model gemini-2.5-flash "Think step by step"

# OpenRouter - supports various reasoning models
pi-agent --base-url https://openrouter.ai/api/v1 --api-key $OPENROUTER_API_KEY \
  --model "qwen/qwen3-235b-a22b-thinking-2507" "Complex reasoning task"

JSON Mode Events

When reasoning is active, you'll see:

  • reasoning events with thinking text (when available)
  • token_usage events include reasoningTokens field
  • Console/TUI show reasoning tokens with symbol

Reasoning

Pi-agent supports reasoning/thinking tokens for models that provide this capability:

Supported Providers

Provider Model API Thinking Content Notes
OpenAI o1, o3 Responses Full Thinking events + token counts
o1, o3 Chat Completions Token counts only
gpt-5 Both APIs Model limitation (empty summaries)
Groq gpt-oss Chat Completions Full Via reasoning_format: "parsed"
gpt-oss Responses API doesn't support reasoning.summary
Gemini 2.5 models Chat Completions Full Auto-configured via extra_body
Anthropic Claude OpenAI Compat Use native API for thinking
OpenRouter Various Both APIs Varies Depends on underlying model

Technical Details

The agent automatically:

  • Detects provider from base URL
  • Tests model reasoning support on first use (cached)
  • Adjusts request parameters per provider:
    • OpenAI: reasoning_effort (minimal/low)
    • Groq: reasoning_format: "parsed"
    • Gemini: extra_body.google.thinking_config
    • OpenRouter: reasoning object with effort field
  • Parses provider-specific response formats:
    • Gemini: Extracts from <thought> tags
    • Groq: Uses message.reasoning field
    • OpenRouter: Uses message.reasoning field
    • OpenAI: Uses standard reasoning events

Architecture

The agent is built with:

  • agent.ts - Core Agent class and API functions
  • cli.ts - CLI entry point, argument parsing, and JSON mode handler
  • args.ts - Custom typed argument parser
  • session-manager.ts - Session persistence
  • tools/ - Tool implementations
  • renderers/ - Output formatters (console, TUI, JSON)

Development

Running from Source

# Run directly with npx tsx - no build needed
npx tsx src/cli.ts "What is 2+2?"

# Interactive TUI mode
npx tsx src/cli.ts

# JSON mode for programmatic use
echo '{"type":"message","content":"list files"}' | npx tsx src/cli.ts --json

Testing

The agent supports three testing modes:

1. Test UI/Renderers (non-interactive mode)

# Test console renderer output and metrics
npx tsx src/cli.ts "list files in /tmp" 2>&1 | tail -5
# Verify: ↑609 ↓610 ⚒ 1 (tokens and tool count)

# Test TUI renderer (with stdin)
echo "list files" | npx tsx src/cli.ts 2>&1 | grep "⚒"

2. Test Model Behavior (JSON mode)

# Extract metrics for model comparison
echo '{"type":"message","content":"write fibonacci in Python"}' | \
  npx tsx src/cli.ts --json --model gpt-4o-mini 2>&1 | \
  jq -s '[.[] | select(.type=="token_usage")] | last'

# Compare models: tokens used, tool calls made, quality
for model in "gpt-4o-mini" "gpt-4o"; do
  echo "Testing $model:"
  echo '{"type":"message","content":"fix syntax errors in: prnt(hello)"}' | \
    npx tsx src/cli.ts --json --model $model 2>&1 | \
    jq -r 'select(.type=="token_usage" or .type=="tool_call" or .type=="assistant_message")'
done

3. LLM-as-Judge Testing

# Capture output from different models and evaluate with another LLM
TASK="write a Python function to check if a number is prime"

# Get response from model A
RESPONSE_A=$(echo "{\"type\":\"message\",\"content\":\"$TASK\"}" | \
  npx tsx src/cli.ts --json --model gpt-4o-mini 2>&1 | \
  jq -r '.[] | select(.type=="assistant_message") | .text')

# Judge the response
echo "{\"type\":\"message\",\"content\":\"Rate this code (1-10): $RESPONSE_A\"}" | \
  npx tsx src/cli.ts --json --model gpt-4o 2>&1 | \
  jq -r '.[] | select(.type=="assistant_message") | .text'

Use as a Library

import { Agent, ConsoleRenderer } from '@mariozechner/pi-agent';

const agent = new Agent({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: 'https://api.openai.com/v1',
  model: 'gpt-5-mini',
  api: 'completions',
  systemPrompt: 'You are a helpful assistant.'
}, new ConsoleRenderer());

await agent.ask('What is 2+2?');