* use the correct Gemini 3 Flash Preview thinking levels
* fix a build error
* add changelog entry
* regenerate models
* make less assumptions about future models
- Fix external dependency name (pi-agent-core -> pi-agent)
- Add missing files: hooks/, custom-tools/, skills.ts, github-copilot.ts
- Update components list with all 18 current components
- Add Key Abstractions sections for hooks, custom tools, skills
- Add 'Adding a New Hook Event' to development guide
- Move config.ts to correct location in directory tree
- Custom tools: TypeScript modules that extend pi with new tools
- Custom TUI rendering via renderCall/renderResult
- User interaction via pi.ui (select, confirm, input, notify)
- Session lifecycle via onSession callback for state reconstruction
- Examples: todo.ts, question.ts, hello.ts
- Hook examples: permission-gate, git-checkpoint, protected-paths
- Session lifecycle centralized in AgentSession
- Works across all modes (interactive, print, RPC)
- Unified session event for hooks (replaces session_start/session_switch)
- Box component added to pi-tui
- Examples bundled in npm and binary releases
Fixes#190
Previous test used compressed 8k images (0.01MB) which was meaningless.
Now tests with actual large noise images that don't compress.
Realistic payload limits discovered:
- Anthropic: 6 x 3MB = ~18MB total (not 32MB as documented)
- OpenAI: 2 x 15MB = ~30MB total
- Gemini: 10 x 20MB = ~200MB total (very permissive)
- Mistral: 4 x 10MB = ~40MB total
- xAI: 1 x 20MB (strict request size limit)
- Groq: 5 x 5760px images (5 image + pixel limit)
- zAI: 2 x 15MB = ~30MB (50MB request limit)
- OpenRouter: 2 x 5MB = ~10MB total
Also fixed GEMINI_API_KEY env var (was GOOGLE_API_KEY).
Related to #120
Tested max 8kx8k images per provider:
- Anthropic: 100 (explicit limit, fails at 101)
- OpenAI: 100-200 (100 works, 200 times out)
- Mistral: 8 (explicit limit, fails at 9)
- xAI: 100-150 (100 works, 150 times out)
- Groq: 0 (8k exceeds 33M pixel limit)
- zAI: 400 (context window limited at 500)
- OpenRouter: 40 (context window limited at 50)
- Gemini: untested (no API key in test env)
Key finding: Anthropic's 'many images' rule does NOT cause API errors.
100 x 8kx8k images work fine. Anthropic likely auto-resizes internally.
Related to #120
Tests max image count, size, dimensions, and 8k stress test for:
- Anthropic, OpenAI, Gemini, Mistral, OpenRouter, xAI, Groq, zAI
Key finding: Anthropic's 'many images' rule (>20 images = 2000px max)
does NOT cause API errors. 100 x 8k images work fine. Anthropic likely
auto-resizes internally.
Related to #120
Tests provider-specific image limitations across all supported providers:
- Maximum number of images in context
- Maximum image size (bytes)
- Maximum image dimensions
Discovered limits (Dec 2025):
- Anthropic: 100 images, 5MB per image, 8000px max dimension
- OpenAI: 500 images, >=25MB per image
- Gemini: ~2500 images, >=40MB per image
- Mistral: 8 images, ~15MB per image
- OpenRouter: ~40 images (context limited), ~15MB per image
- Add AgentToolUpdateCallback type and optional onUpdate callback to AgentTool.execute()
- Add tool_execution_update event with toolCallId, toolName, args, partialResult
- Normalize tool_execution_end to always use AgentToolResult (no more string fallback)
- Bash tool streams truncated rolling buffer output during execution
- ToolExecutionComponent shows last N lines when collapsed (not first N)
- Interactive mode handles tool_execution_update events
- Update RPC docs and ai/agent READMEs
fixes#44
wrapTextWithAnsi() was processing each line independently after splitting
on newlines, losing ANSI state. When styled text contained embedded newlines
(e.g. from markdown paragraphs), subsequent lines would lose their styling.
Fixed by tracking ANSI state across lines and prepending active codes to
lines after the first.
Fixes#197
Previously, when reasoning was not specified, some providers like Gemini
with 'dynamic thinking' enabled by default would still use thinking.
Now explicitly sets thinkingEnabled: false (Anthropic) and
thinking: { enabled: false } (Google) when reasoning is undefined.
Closes#180