When a provider (e.g., Google Gemini CLI) requests a retry delay longer
than maxDelayMs (default: 60s), the request fails immediately with an
informative error instead of waiting silently for hours.
The error is then handled by agent-level auto-retry, which shows the
delay to the user and allows aborting with Escape.
- Add maxRetryDelayMs to StreamOptions (packages/ai)
- Add maxRetryDelayMs to AgentOptions (packages/agent)
- Add retry.maxDelayMs to settings (packages/coding-agent)
- Update _isRetryableError to match 'retry delay' errors
fixes#1123
- Add sessionId to StreamOptions for providers that support session-based caching
- OpenAI Codex provider uses sessionId for prompt_cache_key and routing headers
- Agent class now accepts and forwards sessionId to stream functions
- coding-agent passes session ID from SessionManager and updates on session changes
- Update ai package README with table of contents, OpenAI Codex OAuth docs, and env vars table
- Increase Codex instructions cache TTL from 15 minutes to 24 hours
- Add tests for sessionId forwarding in ai and agent packages
- Remove per-thinking-level model variants (gpt-5.2-codex-high, etc.)
- Remove thinkingLevels from Model type
- Provider clamps reasoning effort internally
- Omit reasoning field when thinking is off
fixes#472
Breaking change: replaces queueMessage() with two separate methods:
- steer(msg): interrupt mid-run, delivered after current tool execution
- followUp(msg): wait until agent finishes before delivery
Also renames:
- queueMode -> steeringMode/followUpMode
- getQueuedMessages -> getSteeringMessages/getFollowUpMessages
Refs #403
Agent.prompt() and Agent.continue() now throw if called while already
streaming, preventing race conditions and corrupted state. Use
queueMessage() to queue messages during streaming, or await the
previous call.
AgentSession.prompt() has the same guard with a message directing
users to queueMessage().
Ref #403
- agentLoop now accepts AgentMessage[] instead of single message
- agent.prompt() accepts AgentMessage | AgentMessage[]
- Emits message_start/end for each message in the array
- AgentSession.prompt() builds array with hook message + user message
- TUI now receives events for before_agent_start injected messages
- Renamed AppMessage to AgentMessage throughout
- New agent-loop.ts with AgentLoopContext, AgentLoopConfig
- Removed transport abstraction, Agent now takes streamFn directly
- Extracted streamProxy to proxy.ts utility
- Removed agent-loop from pi-ai (now in agent package)
- Updated consumers (coding-agent, mom) for AgentMessage rename
- Tests updated but some consumers still need migration
Known issues:
- AgentTool, AgentToolResult not exported from pi-ai
- Attachment not exported from pi-agent-core
- ProviderTransport removed but still referenced
- messageTransformer -> convertToLlm migration incomplete
- CustomMessages declaration merging not working properly
HookMessage (role: hookMessage) was being passed directly to transport
without transformation. Now it's transformed via messageTransformer
which converts it to a proper user message for the LLM.
- Rename HookAppMessage to HookMessage, isHookAppMessage to isHookMessage
- Remove entries array from SessionContext (use isHookMessage type guard instead)
- HookMessage.content now accepts string directly (not just array)
- Fix streamMessage type in AgentState (AppMessage, not Message)
- Rename CustomMessageComponent to HookMessageComponent
- Fix test hook to use pi.sendMessage
- Change from contextTransform (runs once at agent start) to preprocessor
- preprocessor runs before EACH LLM call inside the agent loop
- ContextEvent now uses Message[] (pi-ai format) instead of AppMessage[]
- Deep copy handled by pi-ai preprocessor, not Agent
This enables:
- Pruning rules applied on every turn (not just agent start)
- /prune during long agent loop takes effect immediately
- Compaction can use same transforms (future work)
- Add contextTransform option to Agent (runs before messageTransformer)
- Deep copy messages before passing to contextTransform (modifications are ephemeral)
- Add ContextEvent and ContextEventResult types
- Add emitContext() to HookRunner (chains multiple handlers)
- Wire up in sdk.ts when creating Agent with hooks
Enables dynamic context pruning: hooks can modify messages sent to LLM
without changing session data. See discussion #330.
Cleans up the temporary emitLastMessage plumbing since we now use
Agent.prompt(AppMessage) for hook messages instead of appendMessage+continue.
- Remove emitLastMessage parameter from Agent.continue()
- Remove from transport interface and implementations
- Remove from agentLoopContinue()
Instead of using continue() which validates roles, prompt() now accepts
an AppMessage directly. This allows hook messages with role: 'hookMessage'
to trigger proper agent loop with message events.
- Add overloads: prompt(AppMessage) and prompt(string, attachments?)
- sendHookMessage uses prompt(appMessage) instead of appendMessage+continue
When calling continue() with emitLastMessage=true, the agent loop
emits message_start/message_end events for the last message in context.
This allows messages added outside the loop (e.g., hook messages via
sendHookMessage) to trigger proper TUI rendering.
Changes across packages:
- packages/ai: agentLoopContinue() accepts emitLastMessage parameter
- packages/agent: Agent.continue(), transports updated to pass flag
- packages/coding-agent: sendHookMessage passes true when triggerTurn
- Add agentLoopContinue() to pi-ai for resuming from existing context
- Add Agent.continue() method and transport.continue() interface
- Simplify AgentSession compaction to two cases: overflow (auto-retry) and threshold (no retry)
- Remove proactive mid-turn compaction abort
- Merge turn prefix summary into main summary
- Add isCompacting property to AgentSession and RPC state
- Block input during compaction in interactive mode
- Show compaction count on session resume
- Rename RPC.md to rpc.md for consistency
Related to #128
Previously, Agent.emit() was called before state was updated (e.g., appendMessage).
This meant event handlers saw stale state - when message_end fired,
agent.state.messages didn't include the message yet.
Now state is updated first, then events are emitted, so handlers see
consistent state that matches the event.
- Added totalTokens field to Usage interface in pi-ai
- Anthropic: computed as input + output + cacheRead + cacheWrite
- OpenAI/Google: uses native total_tokens/totalTokenCount
- Fixed openai-completions to compute totalTokens when reasoning tokens present
- Updated calculateContextTokens() to use totalTokens field
- Added comprehensive test covering 13 providers
fixes#130
Previously, errors in turn_end events (e.g., from OpenRouter Auto Router)
were not captured in agent.state.error, making failed requests appear as
successful completions.
Fixes#6
- Added string-width library for proper terminal column width calculation
- Fixed wrapLine() to split by newlines before wrapping (like Text component)
- Fixed Loader interval leak by stopping before container removal
- Changed loader message from 'Loading...' to 'Working...'
Fixed the interrupt mechanism to show "[Interrupted by user]" message when ESC is pressed:
- Removed duplicate UI cleanup from ESC key handler that interfered with event processing
- Added centralized interrupted event emission in exception handler when abort signal is detected
- Removed duplicate event emissions from API call methods to prevent multiple messages
- Added abort signal support to preflight reasoning check for proper cancellation
- Simplified abort detection to only check signal state, not error messages
- renderDifferential now correctly handles content that exceeds viewport
- When changes are above viewport, do full re-render with scrollback clear
- When changes are in viewport, do partial re-render from change point
- All tests pass, correctly preserves 100 items in scrollback
- Issue: Still re-renders too much (entire tail from first change)
- Remove redundant 'Reasoning Tokens' column (all models count them)
- Group by provider for better readability
- Clarify model limitations vs API limitations
- Simplify check marks to focus on thinking content availability
Added provider-specific reasoning/thinking token support for:
- OpenAI (o1, o3, gpt-5): Full reasoning events via Responses API, token counts via Chat Completions
- Groq: reasoning_format:"parsed" for Chat Completions, no summary support for Responses
- Gemini 2.5: extra_body.google.thinking_config with <thought> tag extraction
- OpenRouter: Unified reasoning parameter with message.reasoning field
- Anthropic: Limited support via OpenAI compatibility layer
Key improvements:
- Centralized provider detection based on baseURL
- parseReasoningFromMessage() extracts provider-specific reasoning content
- adjustRequestForProvider() handles provider-specific request modifications
- Smart reasoning support detection with caching per API type
- Comprehensive README documentation with provider support matrix
Fixes reasoning tokens not appearing for GPT-5 and other reasoning models.
- Extract and display reasoning tokens from both Chat Completions and Responses APIs
- Add smart preflight detection to check reasoning support per model/API (cached per agent)
- Support both reasoning_text (o1/o3) and summary_text (gpt-5) formats
- Display reasoning tokens with ⚡ symbol in console and TUI renderers
- Only send reasoning parameters to models that support them
- Fix event type from "thinking" to "reasoning" for consistency
Note: Chat Completions API only returns reasoning token counts, not content (by design).
Only Responses API exposes actual thinking/reasoning events.
- Set up npm workspaces for three packages: pi-tui, pi-agent, and pi (pods)
- Implemented dual TypeScript configuration:
- Root tsconfig.json with path mappings for development and type checking
- Package-specific tsconfig.build.json for clean production builds
- Configured lockstep versioning with sync script for inter-package dependencies
- Added comprehensive documentation for development and publishing workflows
- All packages at version 0.5.0 ready for npm publishing