mirror of https://github.com/getcompanion-ai/co-mono.git synced 2026-04-15 20:03:05 +00:00

Hew Li Yang d5e0cb4630 wip: add /resume slash command

2025-12-05 18:49:24 +08:00

46 KiB

Raw Blame History

pi

A radically simple and opinionated coding agent with multi-model support (including mid-session switching), a simple yet powerful CLI for headless coding tasks, and many creature comforts you might be used to from other coding agents.

Works on Linux, macOS, and Windows (barely tested, needs Git Bash running in the "modern" Windows Terminal).

Installation
Quick Start
API Keys
OAuth Authentication (Optional)
Custom Models and Providers
Themes
Slash Commands
Editor Features
Project Context Files
Image Support
Session Management
Context Compaction
CLI Options
Tools
Usage
Security (YOLO by default)
Sub-Agents
To-Dos
Planning
Background Bash
License
See Also

Installation

npm (recommended)

npm install -g @mariozechner/pi-coding-agent

Standalone Binary

Pre-built binaries are available on the GitHub Releases page. Download the archive for your platform:

pi-darwin-arm64.tar.gz - macOS Apple Silicon
pi-darwin-x64.tar.gz - macOS Intel
pi-linux-x64.tar.gz - Linux x64
pi-linux-arm64.tar.gz - Linux ARM64
pi-windows-x64.zip - Windows x64

Extract and run:

# macOS/Linux
tar -xzf pi-darwin-arm64.tar.gz
./pi

# Windows
unzip pi-windows-x64.zip
pi.exe

The archive includes the binary plus supporting files (README, CHANGELOG, themes). Keep them together in the same directory.

macOS users: The binary is not signed. macOS may block it on first run. To fix:

xattr -c ./pi

Build Binary from Source

Requires Bun 1.0+:

git clone https://github.com/badlogic/pi-mono.git
cd pi-mono
npm install
cd packages/coding-agent
npm run build:binary

# Binary and supporting files are in dist/
./dist/pi

Quick Start

# Set your API key (see API Keys section)
export ANTHROPIC_API_KEY=sk-ant-...

# Start the interactive CLI
pi

Once in the CLI, you can chat with the AI:

You: Create a simple Express server in src/server.ts

The agent will use its tools to read, write, and edit files as needed, and execute commands via Bash.

API Keys

The CLI supports multiple LLM providers. Set the appropriate environment variable for your chosen provider:

# Anthropic (Claude)
export ANTHROPIC_API_KEY=sk-ant-...
# Or use OAuth token (retrieved via: claude setup-token)
export ANTHROPIC_OAUTH_TOKEN=...

# OpenAI (GPT)
export OPENAI_API_KEY=sk-...

# Google (Gemini)
export GEMINI_API_KEY=...

# Groq
export GROQ_API_KEY=gsk_...

# Cerebras
export CEREBRAS_API_KEY=csk-...

# xAI (Grok)
export XAI_API_KEY=xai-...

# OpenRouter
export OPENROUTER_API_KEY=sk-or-...

# ZAI
export ZAI_API_KEY=...

If no API key is set, the CLI will prompt you to configure one on first run.

Note: The /model command only shows models for which API keys are configured in your environment. If you don't see a model you expect, check that you've set the corresponding environment variable.

OAuth Authentication (Optional)

If you have a Claude Pro/Max subscription, you can use OAuth instead of API keys:

pi
# In the interactive session:
/login
# Select "Anthropic (Claude Pro/Max)"
# Authorize in browser
# Paste authorization code

This gives you:

Free access to Claude models (included in your subscription)
No need to manage API keys
Automatic token refresh

To logout:

/logout

Note: OAuth tokens are stored in ~/.pi/agent/oauth.json with restricted permissions (0600).

Custom Models and Providers

You can add custom models and providers (like Ollama, vLLM, LM Studio, or any custom API endpoint) via ~/.pi/agent/models.json. Supports OpenAI-compatible APIs (openai-completions, openai-responses), Anthropic Messages API (anthropic-messages), and Google Generative AI API (google-generative-ai). This file is loaded fresh every time you open the /model selector, allowing live updates without restarting.

Configuration File Structure

{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "apiKey": "OLLAMA_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "llama-3.1-8b",
          "name": "Llama 3.1 8B (Local)",
          "reasoning": false,
          "input": ["text"],
          "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0},
          "contextWindow": 128000,
          "maxTokens": 32000
        }
      ]
    },
    "vllm": {
      "baseUrl": "http://your-server:8000/v1",
      "apiKey": "VLLM_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "custom-model",
          "name": "Custom Fine-tuned Model",
          "reasoning": false,
          "input": ["text", "image"],
          "cost": {"input": 0.5, "output": 1.0, "cacheRead": 0, "cacheWrite": 0},
          "contextWindow": 32768,
          "maxTokens": 8192
        }
      ]
    },
    "mixed-api-provider": {
      "baseUrl": "https://api.example.com/v1",
      "apiKey": "CUSTOM_API_KEY",
      "api": "openai-completions",
      "models": [
        {
          "id": "legacy-model",
          "name": "Legacy Model",
          "reasoning": false,
          "input": ["text"],
          "cost": {"input": 1.0, "output": 2.0, "cacheRead": 0, "cacheWrite": 0},
          "contextWindow": 8192,
          "maxTokens": 4096
        },
        {
          "id": "new-model",
          "name": "New Model",
          "api": "openai-responses",
          "reasoning": true,
          "input": ["text", "image"],
          "cost": {"input": 0.5, "output": 1.0, "cacheRead": 0.1, "cacheWrite": 0.2},
          "contextWindow": 128000,
          "maxTokens": 32000
        }
      ]
    }
  }
}

API Key Resolution

The apiKey field can be either an environment variable name or a literal API key:

First, pi checks if an environment variable with that name exists
If found, uses the environment variable's value
Otherwise, treats it as a literal API key

Examples:

"apiKey": "OLLAMA_API_KEY" → checks $OLLAMA_API_KEY, then treats as literal "OLLAMA_API_KEY"
"apiKey": "sk-1234..." → checks $sk-1234... (unlikely to exist), then uses literal value

This allows both secure env var usage and literal keys for local servers.

API Override

Provider-level api: Sets the default API for all models in that provider
Model-level api: Overrides the provider default for specific models
Supported APIs: openai-completions, openai-responses, anthropic-messages, google-generative-ai

This is useful when a provider supports multiple API standards through the same base URL.

Custom Headers

You can add custom HTTP headers to bypass Cloudflare bot detection, add authentication tokens, or meet other proxy requirements:

{
  "providers": {
    "custom-proxy": {
      "baseUrl": "https://proxy.example.com/v1",
      "apiKey": "YOUR_API_KEY",
      "api": "anthropic-messages",
      "headers": {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
        "X-Custom-Auth": "bearer-token-here"
      },
      "models": [
        {
          "id": "claude-sonnet-4",
          "name": "Claude Sonnet 4 (Proxied)",
          "reasoning": true,
          "input": ["text", "image"],
          "cost": {"input": 3, "output": 15, "cacheRead": 0.3, "cacheWrite": 3.75},
          "contextWindow": 200000,
          "maxTokens": 8192,
          "headers": {
            "X-Model-Specific-Header": "value"
          }
        }
      ]
    }
  }
}

Provider-level headers: Applied to all requests for models in that provider
Model-level headers: Additional headers for specific models (merged with provider headers)
Model headers override provider headers when keys conflict

Authorization Header

Some providers require an explicit Authorization: Bearer <token> header. Set authHeader: true to automatically add this header using the resolved apiKey:

{
  "providers": {
    "qwen": {
      "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
      "apiKey": "QWEN_API_KEY",
      "authHeader": true,
      "api": "openai-completions",
      "models": [
        {
          "id": "qwen3-coder-plus",
          "name": "Qwen3 Coder Plus",
          "reasoning": true,
          "input": ["text"],
          "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0},
          "contextWindow": 1000000,
          "maxTokens": 65536
        }
      ]
    }
  }
}

When authHeader: true, the resolved API key is added as Authorization: Bearer <apiKey> to the model headers. This is useful for providers that don't use the standard OpenAI authentication mechanism.

Model Selection Priority

When starting pi, models are selected in this order:

CLI args: --provider and --model flags
First from --models scope: If --models is provided (skipped when using --continue or --resume)
Restored from session: If using --continue or --resume
Saved default: From ~/.pi/agent/settings.json (set when you select a model with /model)
First available: First model with a valid API key
None: Allowed in interactive mode (shows error on message submission)

Provider Defaults

When multiple providers are available, pi prefers sensible defaults before falling back to "first available":

Provider	Default Model
anthropic	claude-sonnet-4-5
openai	gpt-5.1-codex
google	gemini-2.5-pro
openrouter	openai/gpt-5.1-codex
xai	grok-4-fast-non-reasoning
groq	openai/gpt-oss-120b
cerebras	zai-glm-4.6
zai	glm-4.6

Live Reload & Errors

The models.json file is reloaded every time you open the /model selector. This means:

Edit models.json during a session
Or have the agent write/update it for you
Use /model to see changes immediately
No restart needed!

If the file contains errors (JSON syntax, schema violations, missing fields), the selector shows the exact validation error and file path in red so you can fix it immediately.

Example: Adding Ollama Models

See the configuration structure above. Create ~/.pi/agent/models.json with your Ollama setup, then use /model to select your local models. The agent can also help you write this file if you point it to this README.

Themes

Pi supports customizable color themes for the TUI. Two built-in themes are available: dark (default) and light.

Selecting a Theme

Use the /theme command to interactively select a theme, or edit your settings file:

# Interactive selector
pi
/theme

# Or edit ~/.pi/agent/settings.json
{
  "theme": "dark"  # or "light"
}

On first run, Pi auto-detects your terminal background (dark/light) and selects an appropriate theme.

Custom Themes

Create custom themes in ~/.pi/agent/themes/*.json. Custom themes support live editing - when you select a custom theme, Pi watches the file and automatically reloads when you save changes.

Workflow for creating themes:

Copy a built-in theme as a starting point:

mkdir -p ~/.pi/agent/themes
# Copy dark theme
cp $(npm root -g)/@mariozechner/pi-coding-agent/dist/theme/dark.json ~/.pi/agent/themes/my-theme.json
# Or copy light theme
cp $(npm root -g)/@mariozechner/pi-coding-agent/dist/theme/light.json ~/.pi/agent/themes/my-theme.json

Use /theme to select "my-theme"
Edit ~/.pi/agent/themes/my-theme.json - changes apply immediately on save
Iterate until satisfied (no need to re-select the theme)

See Theme Documentation for:

Complete list of 44 color tokens
Theme format and examples
Color value formats (hex, RGB, terminal default)

Example custom theme:

{
  "$schema": "https://raw.githubusercontent.com/badlogic/pi-mono/main/packages/coding-agent/theme-schema.json",
  "name": "my-theme",
  "vars": {
    "accent": "#00aaff",
    "muted": "#6c6c6c"
  },
  "colors": {
    "accent": "accent",
    "muted": "muted",
    ...
  }
}

VS Code Terminal Color Issue

Important: VS Code's integrated terminal has a known issue with rendering truecolor (24-bit RGB) values. By default, it applies a "minimum contrast ratio" adjustment that can make colors look washed out or identical.

To fix this, set the contrast ratio to 1 in VS Code settings:

Open Settings (Cmd/Ctrl + ,)
Search for: terminal.integrated.minimumContrastRatio
Set to: 1

This ensures VS Code renders the exact RGB colors defined in your theme.

Slash Commands

The CLI supports several commands to control its behavior:

/model

Switch models mid-session. Opens an interactive selector where you can type to search (by provider or model name), use arrow keys to navigate, Enter to select, or Escape to cancel.

The selector only displays models for which API keys are configured in your environment (see API Keys section).

/thinking

Adjust thinking/reasoning level for supported models (Claude Sonnet 4, GPT-5, Gemini 2.5). Opens an interactive selector where you can use arrow keys to navigate, Enter to select, or Escape to cancel.

/queue

Select message queue mode. Opens an interactive selector where you can choose between:

one-at-a-time (default): Process queued messages one by one. When you submit messages while the agent is processing, they're queued and sent individually after each agent response completes.
all: Process all queued messages at once. All queued messages are injected into the context together before the next agent response.

The queue mode setting is saved and persists across sessions.

/export [filename]

Export the current session to a self-contained HTML file:

/export                          # Auto-generates filename
/export my-session.html          # Custom filename

The HTML file includes the full conversation with syntax highlighting and is viewable in any browser.

/session

Show session information and statistics:

/session

Displays:

Session file path and ID
Message counts (user, assistant, total)
Token usage (input, output, cache read/write, total)
Total cost (if available)

/changelog

Display the full changelog with all version history (newest last):

/changelog

/branch

Create a new conversation branch from a previous message. Opens an interactive selector showing all your user messages in chronological order. Select a message to:

Create a new session with all messages before the selected one
Place the selected message in the editor for modification or resubmission

This allows you to explore alternative conversation paths without losing your current session.

/branch

/resume

Switch to a different session. Opens an interactive selector showing all available sessions. Select a session to load it and continue where you left off.

This is equivalent to the --resume CLI flag but can be used mid-session.

/resume

/login

Opens an interactive selector to choose provider, then guides you through the OAuth flow in your browser.

/logout

Logout from OAuth providers:

/logout

Shows a list of logged-in providers to logout from.

/clear

Clear the conversation context and start a fresh session:

/clear

Aborts any in-flight agent work, clears all messages, and creates a new session file.

/copy

Copy the last agent message to clipboard:

/copy

Extracts text content from the most recent assistant message and copies it to the system clipboard. Works cross-platform (macOS, Windows, Linux). On Linux, requires xclip or xsel to be installed.

/compact

Manually compact the conversation context to reduce token usage:

/compact                           # Use default summary instructions
/compact Focus on the API changes  # Custom instructions for summary

Creates a summary of the conversation so far, replacing the message history with a condensed version. See Context Compaction for details.

/autocompact

Toggle automatic context compaction:

/autocompact

When enabled, the agent automatically compacts context when usage exceeds the configured threshold. The current state (enabled/disabled) is shown after toggling. See Context Compaction for details.

Custom Slash Commands

Define reusable prompt templates as Markdown files that appear in the / autocomplete.

Locations:

Global: ~/.pi/agent/commands/*.md - available in all sessions
Project: .pi/commands/*.md - project-specific commands

File format:

---
description: Review staged git changes
---
Review the staged changes (`git diff --cached`). Focus on:
- Bugs and logic errors
- Security issues
- Error handling gaps
- Code style per AGENTS.md

The filename (without .md) becomes the command name. The optional description frontmatter field is shown in autocomplete. If omitted, the first line of content is used.

Arguments (bash-style):

Commands support positional arguments with quote-aware parsing:

---
description: Create a component with features
---
Create a React component named $1 with these features: $@

Usage: /component Button "has onClick handler" "supports disabled"

$1 = Button
$2 = has onClick handler
$@ = Button has onClick handler supports disabled

Namespacing:

Subdirectories create namespaced commands. A file at .pi/commands/frontend/component.md creates /component with description showing (project:frontend).

Source indicators:

Commands show their source in autocomplete:

(user) - from ~/.pi/agent/commands/
(project) - from .pi/commands/
(project:subdir) - from .pi/commands/subdir/

CLI usage:

Custom slash commands also work from the command line:

# Non-interactive mode
pi -p "/review"

# With arguments
pi -p '/component Button "handles click events"'

# Interactive mode with initial command
pi "/review"

Editor Features

The interactive input editor includes several productivity features:

File Reference (`@`)

Type @ to fuzzy-search for files and folders in your project:

@editor → finds files/folders with "editor" in the name
@readme → finds README files anywhere in the project
@src → finds folders like src/, resources/, etc.
Directories are prioritized and shown with trailing /
Autocomplete triggers immediately when you type @
Use Up/Down arrows to navigate, Tab/Enter to select

Respects .gitignore files and skips hidden files/directories.

Path Completion

Press Tab to autocomplete file and directory paths:

Works with relative paths: ./src/ + Tab → complete files in src/
Works with parent directories: ../../ + Tab → navigate up and complete
Works with home directory: ~/Des + Tab → ~/Desktop/
Use Up/Down arrows to navigate completion suggestions
Press Enter to select a completion
Shows matching files and directories as you type

File Drag & Drop

Drag files from your OS file explorer (Finder on macOS, Explorer on Windows) directly onto the terminal. The file path will be automatically inserted into the editor. Works great with screenshots from macOS screenshot tool.

Multi-line Paste

Paste multiple lines of text (e.g., code snippets, logs) and they'll be automatically coalesced into a compact [paste #123 <N> lines] reference in the editor. The full content is still sent to the model.

Message Queuing

You can submit multiple messages while the agent is processing without waiting for responses. Messages are queued and processed based on your queue mode setting:

One-at-a-time mode (default):

Each queued message is processed sequentially with its own response
Example: Queue "task 1", "task 2", "task 3" → agent completes task 1 → processes task 2 → completes task 2 → processes task 3
Recommended for most use cases

All mode:

All queued messages are sent to the model at once in a single context
Example: Queue "task 1", "task 2", "task 3" → agent receives all three together → responds considering all tasks
Useful when tasks should be considered together

Visual feedback:

Queued messages appear below the chat with "Queued: "
Messages disappear from the queue as they're processed

Abort and restore:

Press Escape while streaming to abort the current operation
All queued messages (plus any text in the editor) are restored to the editor
Allows you to modify or remove queued messages before resubmitting

Change queue mode with /queue command. Setting is saved in ~/.pi/agent/settings.json.

Keyboard Shortcuts

Navigation:

Arrow keys: Move cursor (Up/Down navigate visual lines, Left/Right move by character)
Option+Left / Ctrl+Left: Move word backwards
Option+Right / Ctrl+Right: Move word forwards
Ctrl+A / Home: Jump to start of line
Ctrl+E / End: Jump to end of line

Editing:

Enter: Send message
Shift+Enter / Alt+Enter: Insert new line (multi-line input). On WSL, use Ctrl+Enter instead.
Backspace: Delete character backwards
Delete (or Fn+Backspace): Delete character forwards
Ctrl+W / Option+Backspace: Delete word backwards (stops at whitespace or punctuation)
Ctrl+U: Delete to start of line (at line start: merge with previous line)
Ctrl+K: Delete to end of line (at line end: merge with next line)

Completion:

Tab: Path completion / Apply autocomplete selection
Escape: Cancel autocomplete (when autocomplete is active)

Other:

Ctrl+C: Clear editor (first press) / Exit pi (second press)
Shift+Tab: Cycle thinking level (for reasoning-capable models)
Ctrl+P: Cycle models (use --models to scope)
Ctrl+O: Toggle tool output expansion (collapsed ↔ full output)
Ctrl+T: Toggle thinking block visibility (shows full content ↔ static "Thinking..." label)

Project Context Files

The agent automatically loads context from AGENTS.md or CLAUDE.md files at startup. These files are loaded in hierarchical order to support both global preferences and monorepo structures.

File Locations

Context files are loaded in this order:

Global context: ~/.pi/agent/AGENTS.md or CLAUDE.md
- Applies to all your coding sessions
- Great for personal coding preferences and workflows
Parent directories (top-most first down to current directory)
- Walks up from current directory to filesystem root
- Each directory can have its own AGENTS.md or CLAUDE.md
- Perfect for monorepos with shared context at higher levels
Current directory: Your project's AGENTS.md or CLAUDE.md
- Most specific context, loaded last
- Overwrites or extends parent/global context

File preference: In each directory, AGENTS.md is preferred over CLAUDE.md if both exist.

What to Include

Context files are useful for:

Project-specific instructions and guidelines
Common bash commands and workflows
Architecture documentation
Coding conventions and style guides
Dependencies and setup information
Testing instructions
Repository etiquette (branch naming, merge vs. rebase, etc.)

Example

# Common Commands
- npm run build: Build the project
- npm test: Run tests

# Code Style
- Use TypeScript strict mode
- Prefer async/await over promises

# Workflow
- Always run tests before committing
- Update CHANGELOG.md for user-facing changes

All context files are automatically included in the system prompt at session start, along with the current date/time and working directory. This ensures the AI has complete project context from the very first message.

Image Support

Send images to vision-capable models by providing file paths:

You: What is in this screenshot? /path/to/image.png

Supported formats: .jpg, .jpeg, .png, .gif, .webp

The image will be automatically encoded and sent with your message. JPEG and PNG are supported across all vision models. Other formats may only be supported by some models.

Session Management

Sessions are automatically saved in ~/.pi/agent/sessions/ organized by working directory. Each session is stored as a JSONL file with a unique timestamp-based ID.

To continue the most recent session:

pi --continue
# or
pi -c

To browse and select from past sessions:

pi --resume
# or
pi -r

This opens an interactive session selector where you can:

Type to search through session messages
Use arrow keys to navigate the list
Press Enter to resume a session
Press Escape to cancel

Sessions include all conversation messages, tool calls and results, model switches, and thinking level changes.

To run without saving a session (ephemeral mode):

pi --no-session

To use a specific session file instead of auto-generating one:

pi --session /path/to/my-session.jsonl

Context Compaction

Note: Compaction is lossy and should generally be avoided. The agent loses access to the full conversation after compaction. Size your tasks to avoid hitting context limits. Alternatively, when context usage approaches 85-90%, ask the agent to write a summary to a markdown file, iterate until it captures everything important, then start a new session with that file.

That said, compaction does not destroy history. The full session is preserved in the session file with compaction events as markers. You can branch (/branch) from any previous message, and branched sessions include the complete history. If compaction missed something, you can ask the agent to read the session file directly.

Long sessions can exhaust the model's context window. Context compaction summarizes older conversation history while preserving recent messages, allowing sessions to continue indefinitely.

How It Works

When compaction runs (manually via /compact or automatically):

A cut point is calculated to keep approximately keepRecentTokens (default: 20k) worth of recent messages
Messages before the cut point are sent to the model for summarization
Messages after the cut point are kept verbatim
The summary replaces the older messages as a "context handoff" message
If there was a previous compaction, its summary is included as context for the new summary (chaining)

Cut points are always placed at user message boundaries to preserve turn integrity.

The summary is displayed in the TUI as a collapsible block (toggle with o key). HTML exports also show compaction summaries as collapsible sections.

Manual Compaction

Use /compact to manually trigger compaction at any time:

/compact                           # Default summary
/compact Focus on the API changes  # Custom instructions guide what to emphasize

Custom instructions are appended to the default summary prompt, letting you focus the summary on specific aspects of the conversation.

Automatic Compaction

Enable auto-compaction with /autocompact. When enabled, compaction triggers automatically when context usage exceeds contextWindow - reserveTokens.

The context percentage is shown in the footer. When it approaches 100%, auto-compaction kicks in (if enabled) or you should manually compact.

Configuration

Power users can tune compaction behavior in ~/.pi/agent/settings.json:

{
  "compaction": {
    "enabled": true,
    "reserveTokens": 16384,
    "keepRecentTokens": 20000
  }
}

enabled: Whether auto-compaction is active (toggle with /autocompact)
reserveTokens: Token buffer to keep free (default: 16,384). Auto-compaction triggers when contextTokens > contextWindow - reserveTokens
keepRecentTokens: How many tokens worth of recent messages to preserve verbatim (default: 20,000). Older messages are summarized.

Supported Modes

Context compaction works in both interactive and RPC modes:

Interactive: Use /compact and /autocompact commands
RPC: Send {"type":"compact"} for manual compaction. Auto-compaction emits {"type":"compaction","auto":true} events. See RPC documentation for details.

CLI Options

pi [options] [@files...] [messages...]

File Arguments (`@file`)

You can include files directly in your initial message using the @ prefix:

# Include a text file in your prompt
pi @prompt.md "Answer the question"

# Include multiple files
pi @requirements.md @context.txt "Summarize these"

# Include images (vision-capable models only)
pi @screenshot.png "What's in this image?"

# Mix text and images
pi @prompt.md @diagram.png "Explain based on the diagram"

# Files without additional text
pi @task.md

How it works:

All @file arguments are combined into the first user message
Text files are wrapped in <file name="path">content</file> tags
Images (.jpg, .jpeg, .png, .gif, .webp) are attached as base64-encoded attachments
Paths support ~ for home directory and relative/absolute paths
Empty files are skipped
Non-existent files cause an immediate error

Examples:

# All files go into first message, regardless of position
pi @file1.md @file2.txt "prompt" @file3.md

# This sends:
# Message 1: file1 + file2 + file3 + "prompt"
# (Any additional plain text arguments become separate messages)

# Home directory expansion works
pi @~/Documents/notes.md "Summarize"

# Combine with other options
pi --print @requirements.md "List the main points"

Limitations:

Not supported in --mode rpc (will error)
Images require vision-capable models (e.g., Claude, GPT-4o, Gemini)

Options

--provider Provider name. Available: anthropic, openai, google, xai, groq, cerebras, openrouter, zai, plus any custom providers defined in ~/.pi/agent/models.json.

--model Model ID. If not specified, uses: (1) saved default from settings, (2) first available model with valid API key, or (3) none (interactive mode only).

--api-key API key (overrides environment variables)

--system-prompt <text|file> Custom system prompt. Can be:

Inline text: --system-prompt "You are a helpful assistant"
File path: --system-prompt ./my-prompt.txt

If the argument is a valid file path, the file contents will be used as the system prompt. Otherwise, the text is used directly. Project context files and datetime are automatically appended.

--append-system-prompt <text|file> Append additional text or file contents to the system prompt. Can be:

Inline text: --append-system-prompt "Also consider edge cases"
File path: --append-system-prompt ./extra-instructions.txt

If the argument is a valid file path, the file contents will be appended. Otherwise, the text is appended directly. This complements --system-prompt for layering custom instructions without replacing the base system prompt. Works in both custom and default system prompts.

--mode Output mode for non-interactive usage (implies --print). Options:

text (default): Output only the final assistant message text
json: Stream all agent events as JSON (one event per line). Events are emitted by @mariozechner/pi-agent and include message updates, tool executions, and completions
rpc: JSON mode plus stdin listener for headless operation. Send JSON commands on stdin: {"type":"prompt","message":"..."} or {"type":"abort"}. See test/rpc-example.ts for a complete example

--print, -p Non-interactive mode: process the prompt(s) and exit. Without this flag, passing a prompt starts interactive mode with the prompt pre-submitted. Similar to Claude's -p flag and Codex's exec command.

--no-session Don't save session (ephemeral mode)

--session Use specific session file path instead of auto-generating one

--continue, -c Continue the most recent session

--resume, -r Select a session to resume (opens interactive selector)

--models Comma-separated model patterns for quick cycling with Ctrl+P. Matching priority:

provider/modelId exact match (e.g., openrouter/openai/gpt-5.1-codex)
Exact model ID match (e.g., gpt-5.1-codex)
Partial match against model IDs and names (case-insensitive)

When multiple partial matches exist, prefers aliases over dated versions (e.g., claude-sonnet-4-5 over claude-sonnet-4-5-20250929). Without this flag, Ctrl+P cycles through all available models.

Each pattern can optionally include a thinking level suffix: pattern:level where level is one of off, minimal, low, medium, or high. When cycling models, the associated thinking level is automatically applied. The first model in the list is used as the initial model when starting a new session.

Examples:

--models openrouter/openai/gpt-5.1-codex - Exact provider/model match
--models gpt-5.1-codex - Exact ID match (not openai/gpt-5.1-codex-mini)
--models sonnet:high,haiku:low - Sonnet with high thinking, Haiku with low thinking
--models sonnet,haiku - Partial match for any model containing "sonnet" or "haiku"

--tools Comma-separated list of tools to enable. By default, pi uses read,bash,edit,write. This flag allows restricting or changing the available tools.

Available tools:

read - Read file contents
bash - Execute bash commands
edit - Make surgical edits to files
write - Create or overwrite files
grep - Search file contents for patterns (read-only, off by default)
find - Find files by glob pattern (read-only, off by default)
ls - List directory contents (read-only, off by default)

Examples:

--tools read,grep,find,ls - Read-only mode for code review/exploration
--tools read,bash - Only allow reading and bash commands

--thinking Set thinking level for reasoning-capable models. Valid values: off, minimal, low, medium, high. Takes highest priority over all other thinking level sources (saved settings, --models pattern levels, session restore).

Examples:

--thinking high - Start with high thinking level
--thinking off - Disable thinking even if saved setting was different

--export Export a session file to a self-contained HTML file and exit. Auto-detects format (session manager format or streaming event format). Optionally provide an output filename as the second argument.

Note: When exporting streaming event logs (e.g., pi-output.jsonl from --mode json), the system prompt and tool definitions are not available since they are not recorded in the event stream. The exported HTML will include a notice about this.

Examples:

--export session.jsonl - Export to pi-session-session.html
--export session.jsonl output.html - Export to custom filename

--help, -h Show help message

Examples

# Start interactive mode
pi

# Interactive mode with initial prompt (stays running after completion)
pi "List all .ts files in src/"

# Include files in your prompt
pi @requirements.md @design.png "Implement this feature"

# Non-interactive mode (process prompt and exit)
pi -p "List all .ts files in src/"

# Non-interactive with files
pi -p @code.ts "Review this code for bugs"

# JSON mode - stream all agent events (non-interactive)
pi --mode json "List all .ts files in src/"

# RPC mode - headless operation (see test/rpc-example.ts)
pi --mode rpc --no-session
# Then send JSON on stdin:
# {"type":"prompt","message":"List all .ts files"}
# {"type":"abort"}

# Continue previous session
pi -c "What did we discuss?"

# Use different model
pi --provider openai --model gpt-4o "Help me refactor this code"

# Limit model cycling to specific models
pi --models claude-sonnet,claude-haiku,gpt-4o
# Now Ctrl+P cycles only through those models

# Model cycling with thinking levels
pi --models sonnet:high,haiku:low
# Starts with sonnet at high thinking, Ctrl+P switches to haiku at low thinking

# Start with specific thinking level
pi --thinking high "Solve this complex algorithm problem"

# Read-only mode (no file modifications possible)
pi --tools read,grep,find,ls -p "Review the architecture in src/"

# Oracle-style subagent (bash for git/gh, no file modifications)
pi --tools read,bash,grep,find,ls \
   --no-session \
   -p "Use bash only for read-only operations. Read issue #74 with gh, then review the implementation"

# Export a session file to HTML
pi --export ~/.pi/agent/sessions/--myproject--/session.jsonl
pi --export session.jsonl my-export.html

Tools

Default Tools

By default, the agent has access to four core tools:

read Read file contents. Supports text files and images (jpg, png, gif, webp). Images are sent as attachments. For text files, defaults to first 2000 lines. Use offset/limit parameters for large files. Lines longer than 2000 characters are truncated.

write Write content to a file. Creates the file if it doesn't exist, overwrites if it does. Automatically creates parent directories.

edit Edit a file by replacing exact text. The oldText must match exactly (including whitespace). Use this for precise, surgical edits. Returns an error if the text appears multiple times or isn't found.

bash Execute a bash command in the current working directory. Returns stdout and stderr. Optionally accepts a timeout parameter (in seconds) - no default timeout.

Read-Only Exploration Tools

These tools are available via --tools flag for read-only code exploration:

grep Search file contents for a pattern (regex or literal). Returns matching lines with file paths and line numbers. Respects .gitignore. Parameters: pattern (required), path, glob, ignoreCase, literal, context, limit.

find Search for files by glob pattern (e.g., **/*.ts). Returns matching file paths relative to the search directory. Respects .gitignore. Parameters: pattern (required), path, limit.

ls List directory contents. Returns entries sorted alphabetically with / suffix for directories. Includes dotfiles. Parameters: path, limit.

MCP & Adding Your Own Tools

pi does and will not support MCP. Instead, it relies on the four built-in tools above and assumes the agent can invoke pre-existing CLI tools or write them on the fly as needed.

Here's the gist:

Create a simple CLI tool (any language, any executable)
Write a concise README.md describing what it does and how to use it
Tell the agent to read that README

Minimal example:

~/agent-tools/screenshot/README.md:

# Screenshot Tool

Takes a screenshot of your main display.

## Usage
```bash
screenshot.sh

Returns the path to the saved PNG file.


`~/agent-tools/screenshot/screenshot.sh`:
```bash
#!/bin/bash
screencapture -x /tmp/screenshot-$(date +%s).png
ls -t /tmp/screenshot-*.png | head -1

In your session:

You: Read ~/agent-tools/screenshot/README.md and use that tool to take a screenshot

The agent will read the README, understand the tool, and invoke it via bash as needed. If you need a new tool, ask the agent to write it for you.

You can also reference tool READMEs in your AGENTS.md files to make them automatically available:

Global: ~/.pi/agent/AGENTS.md - available in all sessions
Project-specific: ./AGENTS.md - available in this project

Real-world example:

The exa-search tools provide web search capabilities via the Exa API. Built by the agent itself in ~2 minutes. Far from perfect, but functional. Just tell your agent: "Read ~/agent-tools/exa-search/README.md and search for X".

For a detailed walkthrough with more examples, and the reasons for and benefits of this decision, see: https://mariozechner.at/posts/2025-11-02-what-if-you-dont-need-mcp/

Security (YOLO by default)

This agent runs in full YOLO mode and assumes you know what you're doing. It has unrestricted access to your filesystem and can execute any command without permission checks or safety rails.

What this means:

No permission prompts for file operations or commands
No pre-checking of bash commands for malicious content
Full filesystem access - can read, write, or delete anything
Can execute any command with your user privileges

Why:

Permission systems add massive friction while being easily circumvented
Pre-checking tools for "dangerous" patterns introduces latency, false positives, and is ineffective

Prompt injection risks:

By default, pi has no web search or fetch tool
However, it can use curl or read files from disk
Both provide ample surface area for prompt injection attacks
Malicious content in files or command outputs can influence behavior

Mitigations:

Run pi inside a container if you're uncomfortable with full access
Use a different tool if you need guardrails
Don't use pi on systems with sensitive data you can't afford to lose
Fork pi and add all of the above

This is how I want it to work and I'm not likely to change my stance on this.

Use at your own risk.

Sub-Agents

pi does not and will not support sub-agents as a built-in feature. If the agent needs to delegate work, it can:

Spawn another instance of itself via the pi CLI command
Write a custom tool with a README.md that describes how to invoke pi for specific tasks

Why no built-in sub-agents:

Context transfer between agents is generally poor. Information gets lost, compressed, or misrepresented when passed through agent boundaries. Direct execution with full context is more effective than delegation with summarized context.

If you need parallel work on independent tasks, manually run multiple pi sessions in different terminal tabs. You're the orchestrator.

To-Dos

pi does not and will not support built-in to-dos. In my experience, to-do lists generally confuse models more than they help.

If you need task tracking, make it stateful by writing to a file:

# TODO.md

- [x] Implement user authentication
- [x] Add database migrations
- [ ] Write API documentation
- [ ] Add rate limiting

The agent can read and update this file as needed. Using checkboxes keeps track of what's done and what remains. Simple, visible, and under your control.

Planning

pi does not and will not have a built-in planning mode. Telling the agent to think through a problem together with you, without modifying files or executing commands, is generally sufficient.

If you need persistent planning across sessions, write it to a file:

# PLAN.md

## Goal
Refactor authentication system to support OAuth

## Approach
1. Research OAuth 2.0 flows
2. Design token storage schema
3. Implement authorization server endpoints
4. Update client-side login flow
5. Add tests

## Current Step
Working on step 3 - authorization endpoints

The agent can read, update, and reference the plan as it works. Unlike ephemeral planning modes that only exist within a session, file-based plans persist and can be versioned with your code.

Background Bash

pi does not and will not implement background bash execution. Instead, tell the agent to use tmux or something like tterminal-cp. Bonus points: you can watch the agent interact with a CLI like a debugger and even intervene if necessary.

Development

Forking / Rebranding

All branding (app name, config directory) is configurable via package.json:

{
  "piConfig": {
    "name": "pi",
    "configDir": ".pi"
  }
}

To create a fork with different branding:

Change piConfig.name to your app name (e.g., "tau")
Change piConfig.configDir to your config directory (e.g., ".tau")
Change the bin field to your command name: "bin": { "tau": "dist/cli.js" }

This affects:

CLI banner and help text
Config directory (~/.pi/agent/ → ~/.tau/agent/)
Environment variable name (PI_CODING_AGENT_DIR → TAU_CODING_AGENT_DIR)
All user-facing paths in error messages and documentation

Path Resolution

The codebase supports three execution modes:

npm: Running via node dist/cli.js after npm install
Bun binary: Standalone compiled binary with files alongside
tsx: Running directly from source via npx tsx src/cli.ts

All path resolution for package assets (package.json, README.md, CHANGELOG.md, themes) must go through src/paths.ts:

import { getPackageDir, getThemeDir, getPackageJsonPath, getReadmePath, getChangelogPath } from "./paths.js";

Never use __dirname directly for resolving package assets. The paths.ts module handles the differences between execution modes automatically.

Debug Command

The /debug command is a hidden development feature (not shown in autocomplete) that writes all currently rendered lines with their visible widths and ANSI escape sequences to ~/.pi/agent/pi-debug.log. This is useful for debugging TUI rendering issues, especially when lines don't extend to the terminal edge or contain unexpected invisible characters.

/debug

The debug log includes:

Terminal width at time of capture
Total number of rendered lines
Each line with its index, visible width, and JSON-escaped content showing all ANSI codes

License

MIT

46 KiB Raw Blame History

pi

Table of Contents

Installation

npm (recommended)

Standalone Binary

Build Binary from Source

Quick Start

API Keys

OAuth Authentication (Optional)

Custom Models and Providers

Configuration File Structure

API Key Resolution

API Override

Custom Headers

Authorization Header

Model Selection Priority

Provider Defaults

Live Reload & Errors

Example: Adding Ollama Models

Themes

Selecting a Theme

Custom Themes

VS Code Terminal Color Issue

Slash Commands

/model

/thinking

/queue

/export [filename]

/session

/changelog

/branch

/resume

/login

/logout

/clear

/copy

/compact

/autocompact

Custom Slash Commands

Editor Features

File Reference (@)

Path Completion

File Drag & Drop

Multi-line Paste

Message Queuing

Keyboard Shortcuts

Project Context Files

File Locations

What to Include

Example

Image Support

Session Management

Context Compaction

How It Works

Manual Compaction

Automatic Compaction

Configuration

Supported Modes

CLI Options

File Arguments (@file)

Options

Examples

Tools

Default Tools

Read-Only Exploration Tools

MCP & Adding Your Own Tools

Security (YOLO by default)

Sub-Agents

To-Dos

Planning

Background Bash

Development

Forking / Rebranding

Path Resolution

Debug Command

License

See Also

46 KiB

Raw Blame History

File Reference (`@`)

File Arguments (`@file`)