mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-16 07:04:25 +00:00
Remove todos folder.
This commit is contained in:
parent
53e339ddb8
commit
12910a5940
20 changed files with 0 additions and 2495 deletions
# Analysis: Display Tool Call Metrics

## Token Usage Display in the Agent Code

### 1. Token Usage Event Structure

The token usage is defined as an event type in `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts` (lines 16-23):

```typescript
{
  type: "token_usage";
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}
```

### 2. Where Token Usage Events are Generated

Token usage events are created in two places in `agent.ts`:

**Responses API (lines 77-88):**

```typescript
if (response.usage) {
  const usage = response.usage;
  eventReceiver?.on({
    type: "token_usage",
    inputTokens: usage.input_tokens || 0,
    outputTokens: usage.output_tokens || 0,
    totalTokens: usage.total_tokens || 0,
    cacheReadTokens: usage.input_tokens_details.cached_tokens || 0,
    cacheWriteTokens: 0, // Not available in API
  });
}
```

**Chat Completions API (lines 209-220):**

```typescript
if (response.usage) {
  const usage = response.usage;
  await eventReceiver?.on({
    type: "token_usage",
    inputTokens: usage.prompt_tokens || 0,
    outputTokens: usage.completion_tokens || 0,
    totalTokens: usage.total_tokens || 0,
    cacheReadTokens: usage.prompt_tokens_details?.cached_tokens || 0,
    cacheWriteTokens: 0, // Not available in API
  });
}
```
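Since the two APIs name their usage fields differently (`input_tokens` vs. `prompt_tokens`), a small normalization helper can map either shape onto the unified event. This is an illustrative sketch based on the snippets above, not code that exists in the agent:

```typescript
// Unified event shape, as defined in agent.ts.
type TokenUsage = {
  type: "token_usage";
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
};

// Hypothetical helper: normalize a Chat Completions usage object into the
// unified event. Field names follow the extraction code shown above.
function fromChatCompletionsUsage(usage: {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}): TokenUsage {
  return {
    type: "token_usage",
    inputTokens: usage.prompt_tokens || 0,
    outputTokens: usage.completion_tokens || 0,
    totalTokens: usage.total_tokens || 0,
    cacheReadTokens: usage.prompt_tokens_details?.cached_tokens || 0,
    cacheWriteTokens: 0, // Not available in API
  };
}
```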
### 3. Token Display in Different Renderers

#### Console Renderer (`console-renderer.ts`)
- **No display**: Token usage events are explicitly **not displayed** in console mode
- Lines 47-48: Token usage events don't stop animations
- Lines 124-127: Token usage case does nothing - no console output

#### TUI Renderer (`tui-renderer.ts`)
- **Full token display**: Shows detailed token information at the bottom of the interface
- **Format**: `↑{input_tokens} ↓{output_tokens} (⟲{cache_read_tokens} ⟳{cache_write_tokens})`
- **Location**: Bottom container of the TUI interface
- **Example display**: `↑1,234 ↓567 (⟲890 ⟳0)`

Key implementation details:
- Lines 60-64: Stores token counters as instance variables
- Lines 258-265: Updates token counts when token_usage events are received
- Lines 284-305: `updateTokenDisplay()` method formats and displays tokens
- Lines 289-301: Format includes cache information if available
- Uses symbols: `↑` (up arrow) for input, `↓` (down arrow) for output, `⟲` (anticlockwise) for cache read, `⟳` (clockwise) for cache write
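The display format can be reproduced as a pure helper. A minimal sketch (the real renderer builds the string inside `updateTokenDisplay()`, and only appends cache entries when they are non-zero; the explicit `"en-US"` locale here is an addition to make the thousands separator deterministic):

```typescript
// Illustrative pure function mirroring the TUI token display format.
function formatTokens(
  input: number,
  output: number,
  cacheRead: number,
  cacheWrite: number,
): string {
  let text = `↑${input.toLocaleString("en-US")} ↓${output.toLocaleString("en-US")}`;
  const cache: string[] = [];
  if (cacheRead > 0) cache.push(`⟲${cacheRead.toLocaleString("en-US")}`);
  if (cacheWrite > 0) cache.push(`⟳${cacheWrite.toLocaleString("en-US")}`);
  // Cache info is only shown when at least one cache counter is non-zero.
  if (cache.length > 0) text += ` (${cache.join(" ")})`;
  return text;
}
```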
#### JSON Renderer (`json-renderer.ts`)
- **Raw JSON output**: Outputs the complete token_usage event as JSON
- Line 5: Simply calls `console.log(JSON.stringify(event))`
- This means token usage is included in the JSON stream when using the `--json` flag

### 4. Session Storage (`session-manager.ts`)
- **Persistence**: Token usage events are stored in session files
- **Tracking**: Maintains `totalUsage`, which stores the latest token usage event (lines 33, 138-145, 157-159)
- **Note**: Currently stores only the latest token usage, not cumulative totals across the session
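Cumulative tracking would be a small fold over the event stream. A hypothetical sketch (the real `SessionManager` only keeps the latest event in `totalUsage`; none of this code exists in the package):

```typescript
type Usage = {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
};

const emptyUsage: Usage = {
  inputTokens: 0,
  outputTokens: 0,
  totalTokens: 0,
  cacheReadTokens: 0,
  cacheWriteTokens: 0,
};

// Fold each incoming token_usage event into a running session total.
function accumulateUsage(total: Usage, event: Usage): Usage {
  return {
    inputTokens: total.inputTokens + event.inputTokens,
    outputTokens: total.outputTokens + event.outputTokens,
    totalTokens: total.totalTokens + event.totalTokens,
    cacheReadTokens: total.cacheReadTokens + event.cacheReadTokens,
    cacheWriteTokens: total.cacheWriteTokens + event.cacheWriteTokens,
  };
}
```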
### 5. Key Characteristics

**Token Display Behavior:**
- **Console mode**: No token display (silent)
- **TUI mode**: Real-time token display at the bottom with visual indicators
- **JSON mode**: Raw event data in JSON format

**Token Types Displayed:**
- Input tokens (prompt tokens)
- Output tokens (completion tokens)
- Cache read tokens (when available)
- Cache write tokens (currently always 0 - not available from the APIs)

**Display Format in TUI:**
- Uses thousands separators (`.toLocaleString()`)
- Dimmed text styling with chalk
- Visual symbols to distinguish token types
- Only shows cache info if cache tokens > 0

The token usage system provides comprehensive tracking across the different output modes, with the TUI renderer offering the most user-friendly real-time display of token consumption.

## Tool Call Tracking in the Agent

### 1. **AgentEvent Structure for Tool Calls**

The agent defines a specific event structure for tool calls in `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts`:

```typescript
export type AgentEvent =
  // ... other events
  | { type: "tool_call"; toolCallId: string; name: string; args: string }
  | { type: "tool_result"; toolCallId: string; result: string; isError: boolean }
  // ... other events
```

### 2. **Where Tool Call Events are Emitted**

Tool call events are tracked in two places:

**Responses API (lines 131-136):**

```typescript
await eventReceiver?.on({
  type: "tool_call",
  toolCallId: item.call_id || "",
  name: item.name,
  args: item.arguments,
});
```

**Chat Completions API (line 243):**

```typescript
await eventReceiver?.on({
  type: "tool_call",
  toolCallId: toolCall.id,
  name: funcName,
  args: funcArgs,
});
```

### 3. **Event Processing and Storage**

- **Session Manager**: All AgentEvents (including tool calls) are stored in `/Users/badlogic/workspaces/pi-mono/packages/agent/src/session-manager.ts`. Each event is logged to a JSONL file with timestamps.
- **Event Reception**: Events are processed by the `SessionManager.on()` method (line 124), which appends each event to the session file.

### 4. **Current State: No Tool Call Counter**

**Key Finding**: There is currently **no built-in tool call counter or statistics tracking** in the agent. The system only:

- Stores individual tool call events in session files
- Tracks token usage separately via `token_usage` events
- Does not aggregate or count tool calls

### 5. **Event Reconstruction**

The agent can reconstruct conversation state from events using the `setEvents()` method, which processes tool call events to rebuild the message history, but it doesn't count them.

### 6. **Renderers Handle Display Only**

The renderers (`ConsoleRenderer`, `TuiRenderer`, `JsonRenderer`) only display tool call events as they happen - they don't maintain counts or statistics.

## Summary

The agent architecture is set up to track individual tool call events through the `AgentEvent` system, but there's currently **no aggregation, counting, or statistical analysis** of tool calls. Each tool call generates:

1. A `tool_call` event when initiated
2. A `tool_result` event when completed

These events are stored in session files but not counted or analyzed. To implement tool call counting, you would need to add functionality that:

- Counts `tool_call` events in session data
- Potentially extends the `SessionData` interface to include tool call statistics
- Adds methods to analyze tool call patterns across sessions
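The first step could be a simple pass over a session's JSONL data. A hypothetical sketch, assuming each line holds one raw event object as described above (if the session format wraps events with timestamps, the parsing would need adjusting):

```typescript
// Count tool_call events in a JSONL session dump (one JSON event per line).
function countToolCalls(jsonl: string): number {
  let count = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    const event = JSON.parse(line);
    if (event.type === "tool_call") count++;
  }
  return count;
}
```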
## Renderer Implementation Analysis

### 1. Console Renderer (`/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/console-renderer.ts`)

**Key Findings:**
- **Token Usage Handling:** Token usage events are explicitly **ignored** in console mode (lines 124-127)
- **Icons/Emojis Used:** Spinning animation frames for loading states: `["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]`
- **Color Coding:** Uses chalk for different message types:
  - Blue for session start
  - Orange (`#FFA500`) for assistant messages
  - Yellow for tool calls
  - Gray for tool results
  - Red for errors
  - Green for user messages
- **Animation:** Loading animations during thinking/processing states

### 2. JSON Renderer (`/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/json-renderer.ts`)

**Key Findings:**
- **Minimal Implementation:** Simply outputs all events as JSON strings
- **Token Usage:** Passes through all token usage data as-is in JSON format
- **No Visual Formatting:** Raw JSON output only

### 3. TUI Renderer (`/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/tui-renderer.ts`)

**Key Findings:**
- **Token Usage Display Location:** Lines 284-305 contain the `updateTokenDisplay()` method
- **Icons/Symbols Used:**
  - `↑` for input tokens
  - `↓` for output tokens
  - `⟲` for cache read tokens
  - `⟳` for cache write tokens
  - Same spinning frames as the console renderer: `["⠋", "⠙", "⠹", "⠸", "⠼", "⠴", "⠦", "⠧", "⠇", "⠏"]`

**Token Display Implementation (lines 284-305):**

```typescript
private updateTokenDisplay(): void {
  // Clear and update token display
  this.tokenContainer.clear();

  // Build token display text
  let tokenText = chalk.dim(`↑${this.lastInputTokens.toLocaleString()} ↓${this.lastOutputTokens.toLocaleString()}`);

  // Add cache info if available
  if (this.lastCacheReadTokens > 0 || this.lastCacheWriteTokens > 0) {
    const cacheText: string[] = [];
    if (this.lastCacheReadTokens > 0) {
      cacheText.push(`⟲${this.lastCacheReadTokens.toLocaleString()}`);
    }
    if (this.lastCacheWriteTokens > 0) {
      cacheText.push(`⟳${this.lastCacheWriteTokens.toLocaleString()}`);
    }
    tokenText += chalk.dim(` (${cacheText.join(" ")})`);
  }

  this.tokenStatusComponent = new TextComponent(tokenText);
  this.tokenContainer.addChild(this.tokenStatusComponent);
}
```

### 4. Available Token Usage Data Structure

Based on the `AgentEvent` type definition in `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts`:

```typescript
{
  type: "token_usage";
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
}
```

## How to Add New Metrics

### For TUI Renderer:
1. **Location:** Modify the `updateTokenDisplay()` method in `tui-renderer.ts` (lines 284-305)
2. **Storage:** Add new private properties to store additional metrics (like lines 60-64)
3. **Event Handling:** Update the `token_usage` case to capture the new metrics (lines 258-265)
4. **Display:** Add new symbols/formatting to the token display string
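Step 4 could look like the following sketch, which appends a tool-call count with the ⚒ symbol to an already-built token string. The helper and its `toolCalls` parameter are hypothetical; the real change would live inside `updateTokenDisplay()`:

```typescript
// Append a new metric (here: tool-call count, ⚒) to the token display string.
// Shown only when at least one tool call has been made, mirroring how the
// existing display omits zero-valued cache counters.
function appendToolCallCount(tokenText: string, toolCalls: number): string {
  return toolCalls > 0
    ? `${tokenText} ⚒${toolCalls.toLocaleString("en-US")}`
    : tokenText;
}
```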
### For Console Renderer:
1. **Enable Token Display:** Remove the ignore logic in lines 124-127
2. **Add Display Logic:** Create a new case to format and display token metrics
3. **Positioning:** Display after processing is complete to avoid interfering with animations

### For JSON Renderer:
- No changes needed - it automatically outputs all event data as JSON

The TUI renderer provides the most comprehensive token usage display, with visual symbols and formatting, while the console renderer currently ignores token usage entirely. The JSON renderer provides raw data output suitable for programmatic consumption.

# Display Tool Call Metrics

**Status:** Done
**Agent PID:** 96631

## Original Todo

agent: we should output number of tool calls so far next to input and output and cached tokens. Can use that hammer emoji or whatever.

## Description

Add a tool call counter to the token usage display in the agent's TUI and console renderers, showing the number of tool calls made in the current conversation alongside the existing token metrics.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Add tool call counter property to TUI renderer (packages/agent/src/renderers/tui-renderer.ts:60-64)
- [x] Track tool_call events in TUI renderer's event handler (packages/agent/src/renderers/tui-renderer.ts:250-270)
- [x] Update TUI token display to show tool calls with ⚒ (packages/agent/src/renderers/tui-renderer.ts:284-305)
- [x] Add tool call counter to console renderer (packages/agent/src/renderers/console-renderer.ts)
- [x] Track tool_call events in console renderer (packages/agent/src/renderers/console-renderer.ts:45-130)
- [x] Display tool metrics after assistant messages in console (packages/agent/src/renderers/console-renderer.ts:124-127)
- [x] Test console mode: `npx tsx packages/agent/src/cli.ts "what files are in /tmp"`
  - Success: After the response completes, shows a metrics line with the tool count, like "↑123 ↓456 ⚒1" ✓
- [x] Test multiple tools: `npx tsx packages/agent/src/cli.ts "create a file /tmp/test.txt with 'hello' then read it back"`
  - Success: Should show ⚒2 (one for write, one for read) ✓
- [x] Test JSON mode: `echo '{"type":"message","content":"list files in /tmp"}' | npx tsx packages/agent/src/cli.ts --json | grep tool_call | wc -l`
  - Success: Count matches the number of tool_call events in the output ✓ (shows 1 tool call)
- [x] User test: Start the interactive TUI with `npx tsx packages/agent/src/cli.ts`, ask it to use multiple tools, verify the counter increments live ✓

## Notes

Using the ⚒ (hammer and pick) symbol for tool calls.

# Analysis: Thinking Tokens Handling in Pi-Agent

Based on my comprehensive search of the codebase, I found extensive thinking token handling implemented in the pi-agent package. Here's my detailed analysis:

## Current Implementation Overview

The pi-agent codebase already has **comprehensive thinking token support** implemented in `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts`. The implementation covers both OpenAI's Responses API and the Chat Completions API.

## Key Findings

### 1. **Thinking Token Event Type Defined**

The `AgentEvent` type includes a dedicated `thinking` event:

```typescript
export type AgentEvent =
  // ... other event types
  | { type: "thinking"; text: string }
  // ... other event types
```

### 2. **Responses API Implementation (Lines 103-110)**

For the Responses API (used by GPT-OSS and potentially GPT-5 models), thinking tokens are already parsed:

```typescript
case "reasoning": {
  for (const content of item.content || []) {
    if (content.type === "reasoning_text") {
      await eventReceiver?.on({ type: "thinking", text: content.text });
    }
  }
  break;
}
```

### 3. **Token Usage Tracking**

Both API implementations properly track token usage with support for:
- Input tokens (`inputTokens`)
- Output tokens (`outputTokens`)
- Cache read tokens (`cacheReadTokens`)
- Cache write tokens (`cacheWriteTokens`)

### 4. **UI Rendering Support**

Both the console and TUI renderers have explicit support for thinking events:

**Console Renderer** (`console-renderer.ts:99-106`):

```typescript
case "thinking":
  this.stopAnimation();
  console.log(chalk.dim("[thinking]"));
  console.log(chalk.dim(event.text));
  console.log();
  // Resume animation after showing thinking
  this.startAnimation("Processing");
  break;
```

**TUI Renderer** (`tui-renderer.ts:188-201`):

```typescript
case "thinking": {
  // Show thinking in dim text
  const thinkingContainer = new Container();
  thinkingContainer.addChild(new TextComponent(chalk.dim("[thinking]")));
  // Split thinking text into lines for better display
  const thinkingLines = event.text.split("\n");
  for (const line of thinkingLines) {
    thinkingContainer.addChild(new TextComponent(chalk.dim(line)));
  }
  thinkingContainer.addChild(new WhitespaceComponent(1));
  this.chatContainer.addChild(thinkingContainer);
  break;
}
```

## Potential Issues Identified

### 1. **GPT-5 API Compatibility**

The current implementation assumes GPT-5 models work with the Chat Completions API (`callModelChatCompletionsApi`), but GPT-5 models might need the Responses API (`callModelResponsesApi`) to access thinking tokens. The agent defaults to the `"completions"` API type.

### 2. **Missing Thinking Token Usage in Chat Completions API**

The Chat Completions API implementation doesn't parse or handle thinking/reasoning content - it only handles regular message content and tool calls. However, based on the web search results, GPT-5 models support reasoning tokens even in the Chat Completions API.

### 3. **Model-Specific API Detection**

There's no automatic detection of which API to use based on the model name. The default model is `"gpt-5-mini"`, but it uses `api: "completions"`.

## Anthropic Models Support

For Anthropic models accessed via the OpenAI SDK compatibility layer, the current Chat Completions API implementation should work, but thinking token extraction may be missing if Anthropic returns reasoning content in a different format than standard OpenAI models.

## Recommendations

### 1. **Add Model-Based API Detection**

Implement automatic API selection based on model names:

```typescript
function getApiTypeForModel(model: string): "completions" | "responses" {
  if (model.includes("gpt-5") || model.includes("o1") || model.includes("o3")) {
    return "responses";
  }
  return "completions";
}
```

### 2. **Enhanced Chat Completions API Support**

If GPT-5 models can return thinking tokens via the Chat Completions API, the implementation needs to be enhanced to parse reasoning content from the response.
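At minimum, the reasoning token *count* can be read from the usage object. A sketch, assuming the `usage.completion_tokens_details.reasoning_tokens` field that OpenAI documents for reasoning models on the Chat Completions API:

```typescript
// Pull the reasoning-token count out of a Chat Completions usage object,
// defaulting to 0 when the provider doesn't supply the field.
function extractReasoningTokens(usage: {
  completion_tokens_details?: { reasoning_tokens?: number };
}): number {
  return usage.completion_tokens_details?.reasoning_tokens || 0;
}
```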
### 3. **Anthropic-Specific Handling**

Add specific logic for Anthropic models to extract thinking content if they provide it in a non-standard format.

## Files to Examine/Modify

1. **`/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts`** - Core API handling
2. **`/Users/badlogic/workspaces/pi-mono/packages/agent/src/main.ts`** - Default configuration and model setup

The codebase already has a solid foundation for thinking token support, but may need model-specific API routing and enhanced parsing logic to fully support GPT-5 and Anthropic thinking tokens.

# Fix Missing Thinking Tokens for GPT-5 and Anthropic Models

**Status:** Done
**Agent PID:** 72653

## Original Todo

agent: we do not get thinking tokens for gpt-5. possibly also not for anthropic models?

## Description

The agent doesn't extract or report reasoning/thinking tokens from OpenAI's reasoning models (gpt-5, o1, o3) when using the Chat Completions API. While the codebase has full thinking token support for the Responses API, the Chat Completions API implementation is missing the extraction of `reasoning_tokens` from the `usage.completion_tokens_details` object. This means users don't see how many tokens were used for reasoning, which can be significant (thousands of tokens) for these models.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Extend the AgentEvent token_usage type to include a reasoningTokens field (packages/agent/src/agent.ts:16-23)
- [x] Update Chat Completions API token extraction to include reasoning tokens from usage.completion_tokens_details (packages/agent/src/agent.ts:210-220)
- [x] Update console renderer to display reasoning tokens in usage metrics (packages/agent/src/renderers/console-renderer.ts:117-121)
- [x] Update TUI renderer to display reasoning tokens in usage metrics (packages/agent/src/renderers/tui-renderer.ts:219-227)
- [x] Update JSON renderer to include reasoning tokens in output (packages/agent/src/renderers/json-renderer.ts:20)
- [x] User test: Run the agent with a reasoning model and verify the reasoning token count appears in the metrics display
- [x] Debug: Fix missing reasoningTokens field in JSON output even when the value is 0
- [x] Debug: Investigate why the o3 model doesn't report reasoning tokens in the Responses API
- [x] Fix: Parse reasoning summaries from gpt-5 models (summary_text vs reasoning_text)
- [x] Fix: Only send the reasoning parameter for models that support it (o3, gpt-5, etc.)
- [x] Fix: Better detection of reasoning support - preflight test instead of hardcoded model names
- [x] Fix: Add reasoning support detection for the Chat Completions API
- [x] Fix: Add the correct summary parameter value and increase max_output_tokens for the preflight check
- [x] Investigate: Chat Completions API has reasoning tokens but no thinking events
- [x] Debug: Add logging to understand the gpt-5 response structure in the Responses API
- [x] Fix: Change reasoning summary from "auto" to "always" to ensure reasoning text is always returned
- [x] Fix: Set correct effort levels - "minimal" for the Responses API, "low" for the Chat Completions API
- [x] Add a note to the README about the Chat Completions API not returning thinking content
- [x] Add a Gemini API example to the README
- [x] Verify Gemini thinking token support and update the README accordingly
- [x] Add a special case for Gemini to include extra_body with thinking_config
- [x] Add a special case for the Groq Responses API (doesn't support reasoning.summary)
- [x] Refactor: Create a centralized provider-specific request adjustment function
- [x] Refactor: Extract message content parsing into a parseReasoningFromMessage() function
- [x] Test: Verify Groq reasoning extraction works with the refactored code
- [x] Test: Verify Gemini thinking extraction works with the refactored code

## Notes

User reported that the o3 model with the Responses API doesn't show reasoning tokens or thinking events.

Fixed by:
1. Adding a reasoningTokens field to the AgentEvent type
2. Extracting reasoning tokens from both the Chat Completions and Responses APIs
3. Smart preflight detection of reasoning support for both APIs (cached per agent instance)
4. Only sending the reasoning parameter for supported models
5. Parsing both reasoning_text (o1/o3) and summary_text (gpt-5) formats
6. Displaying reasoning tokens in the console and TUI renderers with the ⚡ symbol
7. Properly handling reasoning_effort for the Chat Completions API
8. Setting correct effort levels: "minimal" for the Responses API, "low" for the Chat Completions API
9. Setting summary to "always" for the Responses API

**Important findings**:
- The Chat Completions API by design only returns reasoning token *counts*, not the actual thinking/reasoning content, for o1 models. This is expected behavior - only the Responses API exposes thinking events.
- GPT-5 models currently return empty summary arrays even with `summary: "detailed"` - the model indicates it "can't share step-by-step reasoning". This appears to be a model limitation/behavior rather than a code issue.
- The reasoning tokens ARE being used and counted correctly when the model chooses to use them.
- With effort="minimal" and summary="detailed", gpt-5 sometimes chooses not to use reasoning at all for simple questions.

# TUI Double Buffer Implementation Analysis

## Current Architecture

### Core TUI Rendering System
- **Location:** `/Users/badlogic/workspaces/pi-mono/packages/tui/src/tui.ts`
- **render()** method (lines 107-150): Traverses components, calculates keepLines
- **renderToScreen()** method (lines 354-429): Outputs to the terminal with differential rendering
- **Terminal output:** Single `writeSync()` call at line 422

### Component Interface

```typescript
interface ComponentRenderResult {
  lines: string[];
  changed: boolean;
}

interface ContainerRenderResult extends ComponentRenderResult {
  keepLines: number; // Lines from top that are unchanged
}
```
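The `keepLines` accumulation described in this analysis can be sketched as a simple loop over child render results: count unchanged leading lines until the first changed child. This is an illustrative reconstruction, not the actual container code:

```typescript
interface ComponentRenderResult {
  lines: string[];
  changed: boolean;
}

// Accumulate unchanged leading lines; stop at the first changed child,
// which is why one animated component forces everything below to re-render.
function accumulateKeepLines(children: ComponentRenderResult[]): number {
  let keepLines = 0;
  for (const child of children) {
    if (child.changed) break;
    keepLines += child.lines.length;
  }
  return keepLines;
}
```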
### The Flicker Problem

**Root Cause:**
1. LoadingAnimation (`packages/agent/src/renderers/tui-renderer.ts`) updates every 80ms
2. Calls `ui.requestRender()` on each frame, marking itself as changed
3. The container's `keepLines` logic stops accumulating once any child changes
4. All components below the animation must re-render completely
5. TextEditor always returns `changed: true` for cursor updates

**Current Differential Rendering:**
- Moves the cursor up by `(totalLines - keepLines)` lines
- Clears everything from the cursor down with `\x1b[0J`
- Writes all lines after the `keepLines` position
- Creates visible flicker when large portions re-render

### Performance Bottlenecks

1. **TextEditor (`packages/tui/src/text-editor.ts`):**
   - Always returns `changed: true` (lines 122-125)
   - Complex `layoutText()` recalculates wrapping on every render
   - Heavy computation for cursor positioning and highlighting

2. **Animation Cascade Effect:**
   - A single animated component forces all components below it to re-render
   - The container stops accumulating `keepLines` after the first change
   - No isolation between independent component updates

3. **Terminal I/O:**
   - Single large `writeSync()` call for all changing content
   - Clears and redraws entire sections even for minor changes

### Existing Optimizations

**Component Caching:**
- TextComponent: Stores `lastRenderedLines[]`, compares arrays
- MarkdownComponent: Uses a `previousLines[]` comparison
- WhitespaceComponent: `firstRender` flag
- Components properly detect and report changes

**Render Batching:**
- `requestRender()` uses `process.nextTick()` to batch updates
- Prevents multiple renders in the same tick
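The batching pattern can be sketched as follows: repeated `requestRender()` calls within one tick collapse into a single render pass. The class name and callback are stand-ins for the real TUI internals:

```typescript
// Minimal render scheduler: coalesce same-tick render requests.
class RenderScheduler {
  private pending = false;

  constructor(private render: () => void) {}

  requestRender(): void {
    if (this.pending) return; // already scheduled for this tick
    this.pending = true;
    process.nextTick(() => {
      this.pending = false;
      this.render();
    });
  }
}
```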
## Double Buffer Solution

### Architecture Benefits
- Components already return `{lines, changed}` - no interface changes needed
- Clean separation between rendering (back buffer) and output (terminal)
- The single `writeSync()` location makes the implementation straightforward
- Existing component caching remains useful

### Implementation Strategy

**TuiDoubleBuffer Class:**
1. Extend the current TUI class
2. Maintain a front buffer (last rendered lines) and a back buffer (new render)
3. Override `renderToScreen()` with a line-by-line diffing algorithm
4. Batch consecutive changed lines to minimize `writeSync()` calls
5. Position the cursor only at changed lines, not entire sections

**Line-Level Diffing Algorithm:**

```typescript
// Pseudocode
for (let i = 0; i < maxLines; i++) {
  if (frontBuffer[i] !== backBuffer[i]) {
    // Position cursor at line i
    // Clear line
    // Write new content
    // Or batch with adjacent changes
  }
}
```
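A concrete version of this pseudocode, including the batching of consecutive changed lines into single patches, could look like the following sketch (the `Patch` type is an assumption; terminal cursor positioning and clearing are left to the caller):

```typescript
// A run of consecutive changed lines: where it starts, and the new content.
type Patch = { start: number; lines: string[] };

// Diff front vs. back buffer, batching adjacent changed lines so each
// patch can be emitted with one cursor move and one write.
function diffBuffers(front: string[], back: string[]): Patch[] {
  const patches: Patch[] = [];
  const max = Math.max(front.length, back.length);
  let current: Patch | null = null;
  for (let i = 0; i < max; i++) {
    if (front[i] !== back[i]) {
      if (!current) {
        current = { start: i, lines: [] };
        patches.push(current);
      }
      current.lines.push(back[i] ?? ""); // removed lines become blanks
    } else {
      current = null; // unchanged line ends the current run
    }
  }
  return patches;
}
```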
|
||||
|
||||
### Expected Benefits
|
||||
|
||||
1. **Reduced Flicker:**
|
||||
- Only changed lines are redrawn
|
||||
- Animation updates don't affect static content below
|
||||
- TextEditor cursor updates don't require full redraw
|
||||
|
||||
2. **Better Performance:**
|
||||
- Fewer terminal control sequences
|
||||
- Smaller writeSync() payloads
|
||||
- Components can cache aggressively
|
||||
|
||||
3. **Preserved Functionality:**
|
||||
- No changes to existing components
|
||||
- Backward compatible with current TUI class
|
||||
- Can switch between single/double buffer modes
|
||||
|
||||
### Test Plan
|
||||
|
||||
Create comparison tests:
|
||||
1. `packages/tui/test/single-buffer.ts` - Current implementation
|
||||
2. `packages/tui/test/double-buffer.ts` - New implementation
|
||||
3. Both with LoadingAnimation above TextEditor
|
||||
4. Measure render() timing and visual flicker
|
||||
|
||||
### Files to Modify
|
||||
|
||||
**New Files:**
|
||||
- `packages/tui/src/tui-double-buffer.ts` - New TuiDoubleBuffer class
|
||||
|
||||
**Test Files:**
|
||||
- `packages/tui/test/single-buffer.ts` - Test current implementation
|
||||
- `packages/tui/test/double-buffer.ts` - Test new implementation
|
||||
|
||||
**No Changes Needed:**
|
||||
- Component implementations (already support caching and change detection)
|
||||
- Component interfaces (already return required data)
|
||||
|
|
@ -1,92 +0,0 @@
|
|||
# TUI Double Buffer Implementation
|
||||
|
||||
**Status:** Done
|
||||
**Agent PID:** 74014
|
||||
|
||||
## Original Todo
|
||||
- tui: we get tons of flicker in the text editor component. specifically, if we have an animated component above the editor, the editor needs re-rendering completely. Different strategy:
|
||||
- keep a back buffer and front buffer. a buffer is a list of lines.
|
||||
- on Tui.render()
|
||||
- render a new back buffer, top to bottom. components can cache previous render results and return that as a single list of lines if nothing changed
|
||||
- compare the back buffer with the front buffer. for each line that changed
|
||||
- position the cursor at that line
|
||||
- clear the line
|
||||
- render the new line
|
||||
- batch multiple subsequent lines that changed so we do not have tons of writeSync() calls
|
||||
- Open questions:
|
||||
- is this faster and procudes less flicker?
|
||||
- If possible, we should implement this as a new TuiDoubleBuffer class. Existing components should not need changing, as they already report if they changed and report their lines
|
||||
- Testing:
|
||||
- Create a packages/tui/test/single-buffer.ts file: it has a LoadingAnimation like in packages/agent/src/renderers/tui-renderer.ts inside a container as the first child, and a text editor component as the second child, which is focused.
|
||||
- Create a packages/tui/test/double-buffer.ts file: same setup
|
||||
- Measure timing of render() for both

## Description

Implement a double-buffering strategy for the TUI rendering system to eliminate flicker when animated components (like LoadingAnimation) are displayed above interactive components (like TextEditor). The solution will use line-by-line diffing between a front buffer (previous render) and a back buffer (current render) to only update changed lines on the terminal, replacing the current section-based differential rendering.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Create TuiDoubleBuffer class extending Container with the same interface as TUI (`packages/tui/src/tui-double-buffer.ts`)
- [x] Implement line-by-line diffing algorithm in overridden renderToScreen() method
- [x] Add batching logic to group consecutive changed lines for efficient terminal writes
- [x] Create test file with current single-buffer implementation (`packages/tui/test/single-buffer.ts`)
- [x] Create test file with new double-buffer implementation (`packages/tui/test/double-buffer.ts`)
- [x] Add timing measurements to both test files to compare performance
- [x] Manual test: Run both test files to verify reduced flicker in the double-buffer version
- [x] Manual test: Verify existing TUI functionality still works with the original class
- [x] Fix cursor positioning bug in double-buffer implementation (stats appear at top, components don't update)
- [x] Add write function parameter to both TUI classes for testability
- [x] Create VirtualTerminal class for testing ANSI output
- [x] Create verification test that compares both implementations
- [x] Redesign double-buffer with proper cursor tracking to fix duplicate content issue
- [x] Implement component-based rendering with unique IDs to handle reordering

## Additional Work Completed

### Terminal Abstraction & Testing Infrastructure

- [x] Created Terminal interface abstracting stdin/stdout operations (`packages/tui/src/terminal.ts`)
- [x] Implemented ProcessTerminal for production use with process.stdin/stdout
- [x] Implemented VirtualTerminal using @xterm/headless for accurate terminal emulation in tests
- [x] Fixed @xterm/headless TypeScript imports (changed from wildcard to proper named imports)
- [x] Added test-specific methods to VirtualTerminal (flushAndGetViewport, writeSync)
- [x] Updated TUI class to accept the Terminal interface via constructor for dependency injection

### Component Organization

- [x] Moved all component files to `packages/tui/src/components/` directory
- [x] Updated all imports in index.ts and test files to use the new paths

### Test Suite Updates

- [x] Created comprehensive test suite for VirtualTerminal (`packages/tui/test/virtual-terminal.test.ts`)
- [x] Updated TUI rendering tests to use async/await pattern for proper render timing
- [x] Fixed all test assertions to work with exact output (no trim() allowed per user requirement)
- [x] Fixed xterm newline handling (discovered \r\n requirement vs just \n)
- [x] Added test for preserving existing terminal content when TUI starts and handles component growth

### Build Configuration

- [x] Updated root tsconfig.json to include test files for type checking
- [x] Ensured monorepo-wide type checking covers all source and test files

### Bug Fixes

- [x] Fixed TUI differential rendering bug when components grow in height
  - Issue: Old content wasn't properly cleared when the component line count increased
  - Solution: Clear each old line individually before redrawing, ensure cursor is at line start
  - This prevents line-wrapping artifacts when the text editor grows (e.g., SHIFT+ENTER adding lines)

## Notes

- Successfully implemented TuiDoubleBuffer class with line-by-line diffing
- Complete redesign with proper cursor tracking:
  - Tracks actual cursor position separately from buffer length
  - Clear separation between screenBuffer and new render
- Removed console.log/stdout.write interceptors per user request
- Terminal abstraction enables proper testing without mocking process.stdin/stdout
- VirtualTerminal provides accurate terminal emulation using xterm.js
- Test results show a significant reduction in flicker:
  - Single-buffer: Uses clear-down (`\x1b[0J`), which clears entire sections
  - Double-buffer: Uses clear-line (`\x1b[2K`) only for changed lines
  - Animation updates only affect the animation line, not the editor below
- Performance is similar between implementations (~0.4-0.6ms per render)
- Both TUI and TuiDoubleBuffer maintain the same interface for backward compatibility
- Can be used as a drop-in replacement: just change `new TUI()` to `new TuiDoubleBuffer()`
- All 22 tests passing with proper async handling and exact output matching
- Fixed critical rendering bug in TUI's differential rendering for growing components
@ -1,82 +0,0 @@

## Analysis of TUI Differential Rendering and Layout Shift Artifacts

### Key Findings

**1. The Surgical Differential Rendering Implementation**

The TUI uses a three-strategy rendering system in `renderDifferentialSurgical` (lines 331-513 in `/Users/badlogic/workspaces/pi-mono/packages/tui/src/tui.ts`):

- **SURGICAL**: Updates only changed lines with same line counts
- **PARTIAL**: Re-renders from the first change when structure/line counts shift
- **FULL**: Clears scrollback when changes are above the viewport

**2. The Critical Gap: Cursor Positioning in SURGICAL Strategy**

The artifact issue lies in the SURGICAL strategy's cursor positioning logic (lines 447-493). When components are added and removed dynamically, the cursor positioning calculations become incorrect, leading to incomplete clearing of old content.

**Specific Problem Areas:**

```typescript
// Lines 484-492: Cursor repositioning after surgical updates
const lastContentLine = totalNewLines - 1;
const linesToMove = lastContentLine - currentCursorLine;
if (linesToMove > 0) {
	output += `\x1b[${linesToMove}B`;
} else if (linesToMove < 0) {
	output += `\x1b[${-linesToMove}A`;
}
// Now add final newline to position cursor on next line
output += "\r\n";
```

**3. Component Change Detection Issues**

The system determines changes by comparing:
- Component IDs (structural changes)
- Line counts (hasLineCountChange)
- Content with same line counts (changedLines array)

However, when a status message is added temporarily and then removed, the detection logic may not properly identify all affected lines that need clearing.

**4. Missing Test Coverage**

Current tests in `/Users/badlogic/workspaces/pi-mono/packages/tui/test/` don't cover the specific scenario of:
- Dynamic addition of components that cause layout shifts
- Temporary status messages that appear and disappear
- Components moving back to original positions after removals

### The Agent Scenario Analysis

The agent likely does this sequence:
1. Has header, chat container, text editor in a vertical layout
2. Adds a status message component between chat and editor
3. Editor shifts down (differential render uses the PARTIAL strategy)
4. After a delay, removes the status message
5. Editor shifts back up (this is where artifacts remain)

The issue is that when the editor moves back up, the SURGICAL strategy is chosen (same component structure, just content changes), but it doesn't properly clear the old border lines that were drawn when the editor was in the lower position.

### Root Cause

The differential rendering assumes that when using SURGICAL updates, only content within existing component boundaries changes. However, when components shift positions due to additions/removals, old rendered content at previous positions isn't cleared properly.

**Specific gap:** The SURGICAL strategy clears individual lines with `\x1b[2K` but doesn't account for situations where component positions have changed, leaving artifacts from the previous render at the old positions.

### Test Creation Recommendation

A test reproducing this would:

```typescript
test("clears artifacts when components shift positions dynamically", async () => {
	// 1. Setup: header, container, editor
	// 2. Add status message (causes editor to shift down)
	// 3. Remove status message (editor shifts back up)
	// 4. Verify no border artifacts remain at old editor position
});
```

The test should specifically check that after the removal, there are no stray border characters (`╭`, `╮`, `│`, `╰`, `╯`) left at the position where the editor was temporarily located.

### Proposed Fix Direction

The PARTIAL strategy should be used more aggressively when components are added/removed, even if the final structure looks identical, to ensure complete clearing of old content. Alternatively, the SURGICAL strategy needs enhanced logic to detect and clear content at previous component positions.
@ -1,36 +0,0 @@

# Agent/TUI: Ctrl+C Display Artifact

**Status:** Done

**Agent PID:** 36116

## Original Todo

agent/tui: when pressing Ctrl+C, the editor gets pushed down by one line; after a second it gets pushed up again, leaving an artifact (duplicate bottom border). Should replicate this in a test:
Press Ctrl+C again to exit
╭────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ > │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
↑967 ↓12 ⚒ 4

## Description

Create a test in the TUI package that reproduces the rendering artifact issue when components dynamically shift positions (like when a status message appears and disappears). The test will verify that when components move back to their original positions after a temporary layout change, no visual artifacts (duplicate borders) remain. If the test reveals a bug in the TUI's differential rendering, fix it.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

Create a test that reproduces the layout shift artifact issue in the TUI differential rendering, then fix the rendering logic if needed to properly clear old content when components shift positions.

- [x] Create test file `packages/tui/test/layout-shift-artifacts.test.ts` that reproduces the issue
- [x] Test should create components in a vertical layout, add a temporary component causing shifts, remove it, and verify no artifacts
- [x] Run test to confirm it reproduces the artifact issue
- [x] Fix the differential rendering logic in `packages/tui/src/tui.ts` to properly clear content when components shift
- [x] Verify all tests pass (including the new one) after the fix
- [x] Run `npm run check` to ensure code quality

## Notes

The issue was NOT in the differential rendering strategy as initially thought. The real bug was in the Container component:

When a Container is cleared (has 0 children), it wasn't reporting as "changed" because the render method only checked whether any children reported changes. Since there were no children after clearing, `changed` remained false, and the differential renderer didn't know to re-render that area.

The fix: Container now tracks `previousChildCount` and reports as changed when the number of children changes (especially important for going from N children to 0).
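The shape of that fix can be sketched as below. This is a hypothetical illustration; the actual Container in `packages/tui/src` has a different, richer interface.

```typescript
// Minimal stand-in for a renderable child component.
interface Renderable {
	render(): { lines: string[]; changed: boolean };
}

class Container implements Renderable {
	private children: Renderable[] = [];
	private previousChildCount = 0;

	add(child: Renderable): void {
		this.children.push(child);
	}

	clear(): void {
		this.children = [];
	}

	render(): { lines: string[]; changed: boolean } {
		const results = this.children.map((c) => c.render());
		// Report changed if any child changed OR the child count changed,
		// so clearing all children (N -> 0) still triggers a re-render.
		const changed =
			results.some((r) => r.changed) ||
			this.children.length !== this.previousChildCount;
		this.previousChildCount = this.children.length;
		return { lines: results.flatMap((r) => r.lines), changed };
	}
}
```

The key line is the `||` on the child count: without it, a cleared container renders zero children, none of which can report a change.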

This ensures that when statusContainer.clear() is called in the agent, the differential renderer properly clears and re-renders that section of the screen.
@ -1,79 +0,0 @@

# Analysis: Interrupted Message Not Showing in TUI

## Problem Summary

When pressing ESC to interrupt the agent while it's working, the "interrupted" message is not appearing in the TUI interface.

## Research Findings

### Interrupt Handling Flow

1. **ESC Key Detection** (TuiRenderer line 110)
   - ESC key is detected as `\x1b` in the `onGlobalKeyPress` handler
   - Only triggers when `this.currentLoadingAnimation` is active (agent is processing)

2. **Immediate UI Cleanup** (TuiRenderer lines 112-128)
   - Calls `this.onInterruptCallback()` (which calls `agent.interrupt()`)
   - Stops loading animation and clears status container
   - Re-enables text editor submission
   - Requests UI render

3. **Agent Interruption** (Agent.ts lines 615-617)
   - `agent.interrupt()` calls `this.abortController?.abort()`
   - This triggers the AbortSignal in ongoing API calls

4. **Interrupted Event Generation** (Agent.ts, multiple locations)
   - When the signal is aborted, code checks `signal?.aborted`
   - Emits `{ type: "interrupted" }` event via `eventReceiver?.on()`
   - Throws `new Error("Interrupted")` to exit processing

5. **Message Display** (TuiRenderer lines 272-283)
   - Handles the `"interrupted"` event
   - Adds red "[Interrupted by user]" message to chat container
   - Requests render

### Root Cause Analysis

The issue appears to be a **race condition with duplicate cleanup**:

1. When ESC is pressed, the key handler **immediately** (lines 115-120):
   - Stops the loading animation
   - Clears the status container
   - Sets `currentLoadingAnimation = null`

2. Later, when the "interrupted" event arrives (lines 273-277), it tries to:
   - Stop the loading animation again (but it's already null)
   - Clear the status container again (already cleared)

3. The comment on line 123 says "Don't show message here - the interrupted event will handle it", but the event handler at line 280 **does** add the message to the chat container.

### The Actual Problem

Looking closely at the code flow:

1. ESC handler clears animation and calls `agent.interrupt()` (synchronous)
2. Agent aborts the controller (synchronous)
3. API call code detects the abort and emits the "interrupted" event (asynchronous)
4. TUI renderer receives the "interrupted" event and adds the message (asynchronous)

The issue is likely that:
- The interrupted event IS being emitted and handled
- The message IS being added to the chat container
- But the UI render might not be properly triggered, or the differential rendering isn't detecting the change

### Additional Issues Found

1. **Duplicate Animation Cleanup**: The loading animation is stopped twice - once in the ESC handler and once in the interrupted event handler. This is redundant but shouldn't cause the missing message.

2. **Render Request Timing**: The ESC handler requests a render immediately after clearing the UI; the interrupted event handler then adds the message but doesn't explicitly request another render (it relies on the Container's automatic render request).

3. **Container Change Detection**: Recent commit 192d8d2 fixed container change detection issues. The interrupted message addition might not be triggering proper change detection.

## Solution Approach

The fix needs to ensure the interrupted message is properly displayed. Options:

1. **Add explicit render request** after adding the interrupted message
2. **Remove duplicate cleanup** in the ESC handler and let the event handler do all the work
3. **Ensure proper change detection** when adding the message to the chat container

The cleanest solution is likely option 2 - let the interrupted event handler do all the UI updates to avoid race conditions and ensure proper sequencing.
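Option 2 can be sketched as follows. All names here are illustrative, not the actual TuiRenderer API; the point is the division of responsibility.

```typescript
type AgentEvent = { type: "interrupted" } | { type: "other" };

class Renderer {
	messages: string[] = [];
	loading = true;

	// ESC handler: only signal the interrupt; do NOT touch the UI here.
	// All cleanup is deferred to the "interrupted" event handler.
	onEscape(interrupt: () => void): void {
		interrupt();
	}

	// The interrupted event handler owns ALL cleanup and messaging, so
	// there is a single, ordered place that updates the UI.
	on(event: AgentEvent): void {
		if (event.type === "interrupted") {
			this.loading = false;
			this.messages.push("[Interrupted by user]");
			this.requestRender(); // explicit render after adding the message
		}
	}

	requestRender(): void {
		// trigger a re-render in the real implementation
	}
}
```

Because only one code path mutates the UI, the race between the synchronous key handler and the asynchronous event cannot leave the message unrendered.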
@ -1,24 +0,0 @@

# Fix interrupted message not showing when ESC pressed in agent TUI

**Status:** Done

**Agent PID:** 47968

## Original Todo

agent/tui: not seeing a red "interrupted" message anymore if I press ESC while the agent works

## Description

Fix the issue where the "interrupted" message is not displayed in the TUI when pressing ESC to interrupt the agent while it's processing. The root cause is duplicate UI cleanup in the ESC key handler that interferes with the asynchronous interrupted event handler.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

Remove duplicate UI cleanup from the ESC key handler and ensure the interrupted event handler properly displays the message:
- [x] Remove duplicate loading animation cleanup from ESC key handler in tui-renderer.ts (lines 115-120)
- [x] Add explicit render request after adding interrupted message (line 280)
- [x] Fix core issue: Emit "interrupted" event when the API call is aborted (agent.ts lines 606-607)
- [x] Pass abort signal to preflight reasoning check
- [x] Test interruption during API call (e.g., "write a poem")
- [x] Verify "[Interrupted by user]" message appears and the UI is restored

## Notes

[Implementation notes]
@ -1,93 +0,0 @@

# TUI Garbled Output Analysis

## Problem Description

When reading multiple README.md files and then sending a new message, the TUI displays garbled output. This happens for both the renderDifferential and renderDifferentialSurgical methods, and affects any model (not just gpt-5).

## Rendering System Overview

### Three Rendering Strategies
1. **SURGICAL Updates** - Updates only changed lines (1-2 lines typical)
2. **PARTIAL Re-render** - Clears from first change to end, re-renders the tail
3. **FULL Re-render** - Clears scrollback and screen, renders everything

### Key Components
- **TUI Class** (`packages/tui/src/tui.ts`): Main rendering engine
- **Container Class**: Manages child components, auto-triggers re-renders
- **TuiRenderer** (`packages/agent/src/renderers/tui-renderer.ts`): Agent's TUI integration
- **Event System**: Event-driven updates through AgentEvent

## Root Causes Identified

### 1. Complex ANSI Code Handling
- MarkdownComponent line wrapping has issues with ANSI escape sequences
- Code comment at line 203: "Need to wrap - this is complex with ANSI codes"
- ANSI codes can be split across render operations, causing corruption

### 2. Race Conditions in Rapid Updates
When processing multiple tool calls:
- Multiple containers change simultaneously
- Content is added both above and within the viewport
- The surgical renderer handles structural changes while maintaining cursor position
- Heavy ANSI content (colored tool output, markdown) increases complexity

### 3. Cursor Position Miscalculation
- Rapid updates can cause cursor positioning logic errors
- Content shifts due to previous renders are not properly accounted for
- Viewport vs scrollback buffer calculations can become incorrect

### 4. Container Change Detection Timing
- A recent fix (192d8d2) addressed container clear detection
- But rapid component addition/removal may still leave artifacts
- Multiple render requests are debounced but may miss intermediate states

## Specific Scenario Analysis

### Sequence When Issue Occurs:
1. User sends "read all README.md files"
2. Multiple tool calls execute rapidly:
   - glob() finds files
   - Multiple read() calls for each README
3. Long file contents are displayed with markdown formatting
4. User sends a new message while output is still rendering
5. New components are added while the previous render is incomplete

### Visual Artifacts Observed:
- Text overlapping from different messages
- Partial ANSI codes causing color bleeding
- Editor borders duplicated or misaligned
- Content from previous render persisting
- Line wrapping breaking mid-word with styling

## Related Fixes
- Commit 1d9b772: Fixed ESC interrupt handling race conditions
- Commit 192d8d2: Fixed container change detection for clear operations
- Commit 2ec8a27: Added instructional header to chat demo

## Test Coverage Gaps
- No tests for rapid multi-tool execution scenarios
- Missing tests for ANSI code handling across line wraps
- No stress tests for viewport overflow with rapid updates
- Layout shift artifacts test exists but has limited scope

## Recommended Solutions

### 1. Improve ANSI Handling
- Fix MarkdownComponent line wrapping to preserve ANSI codes
- Ensure escape sequences never split across operations
- Add ANSI-aware string measurement utilities
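An ANSI-aware measurement and wrapping utility could look like the sketch below. This is a simplified illustration (it handles only SGR color/style sequences like `\x1b[31m`, not the full CSI set), not the actual helper in the tui package.

```typescript
// SGR escape sequences like "\x1b[31m" occupy no visible columns.
const ANSI_RE = /\x1b\[[0-9;]*m/g;

// Visible width of a string, ignoring ANSI color/style codes.
function visibleWidth(s: string): number {
	return s.replace(ANSI_RE, "").length;
}

// Wrap to a column width without ever splitting an escape sequence:
// each sequence is appended whole to the line where it occurs.
function wrapAnsi(s: string, width: number): string[] {
	const lines: string[] = [];
	let line = "";
	let cols = 0;
	let i = 0;
	while (i < s.length) {
		const m = /^\x1b\[[0-9;]*m/.exec(s.slice(i));
		if (m) {
			line += m[0]; // zero-width: carry the whole sequence, advance past it
			i += m[0].length;
			continue;
		}
		if (cols === width) {
			lines.push(line); // line full: start a new one
			line = "";
			cols = 0;
		}
		line += s[i];
		cols++;
		i++;
	}
	if (line) lines.push(line);
	return lines;
}
```

A production version would also re-open active styles at the start of each wrapped line so color does not bleed or drop across line breaks.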

### 2. Add Render Queuing
- Implement a render operation queue to prevent overlaps
- Ensure each render completes before the next begins
- Add render state tracking

### 3. Enhanced Change Detection
- Track render generation/version numbers
- Validate cursor position before surgical updates
- Add checksums for rendered content verification

### 4. Comprehensive Testing
- Create a test simulating the exact failure scenario
- Add stress tests with rapid multi-component updates
- Test ANSI-heavy content with line wrapping
- Verify viewport calculations under load
@ -1,35 +0,0 @@

# Fix TUI Garbled Output When Sending Multiple Messages

**Status:** InProgress

**Agent PID:** 54802

## Original Todo

agent/tui: "read all README.md files except in node_modules". Wait for completion, then send a new message. Getting garbled output. This happens for both the renderDifferential and renderDifferentialSurgical methods. We need to emulate this in a test and get to the bottom of it.

## Description

Fix the TUI rendering corruption that occurs when sending multiple messages in rapid succession, particularly after tool calls that produce large outputs. The issue manifests as garbled/overlapping text when new messages are sent while previous output is still being displayed.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

[how we are building it]
- [x] Create test to reproduce the issue: Simulate rapid tool calls with large outputs followed by a new message
- [x] Fix ANSI code handling in MarkdownComponent line wrapping (packages/tui/src/components/markdown-component.ts:203-276)
- [x] Implement new line-based rendering strategy that properly handles scrollback and viewport boundaries
- [x] Add comprehensive test coverage for multi-message scenarios
- [ ] User test: Run agent, execute "read all README.md files", wait for completion, send a new message, verify no garbled output

## Notes

- Successfully reproduced the issue with a test showing garbled text overlay
- Fixed ANSI code handling in MarkdownComponent line wrapping
- Root cause: the PARTIAL rendering strategy incorrectly calculated the cursor position when content exceeded the viewport
  - When content is in scrollback, the cursor can't reach it (it can only move within the viewport)
  - The old PARTIAL strategy tried to move the cursor 33 lines up when only 30 were possible
  - This caused the cursor to land at the wrong position (top of viewport instead of the target line in scrollback)
- Solution: Implemented a new `renderLineBased` method that:
  - Compares old and new lines directly (component-agnostic)
  - Detects whether changes are in scrollback (unreachable) or the viewport
  - For scrollback changes: does a full clear and re-render
  - For viewport changes: moves the cursor correctly within viewport bounds and updates efficiently
  - Handles surgical line-by-line updates when possible for minimal redraws
- Test now passes - no more garbled output when messages exceed the viewport!
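The core decision in that strategy can be sketched as follows. This is a hypothetical reduction of the idea, not the actual `renderLineBased` code.

```typescript
type Strategy = "none" | "surgical" | "full";

// Decide how to repaint given the old and new line buffers and the terminal
// viewport height. Lines above (total - viewportHeight) live in scrollback,
// which the cursor cannot reach, so changes there force a full re-render.
function chooseStrategy(
	oldLines: string[],
	newLines: string[],
	viewportHeight: number,
): Strategy {
	const max = Math.max(oldLines.length, newLines.length);
	let firstChange = -1;
	for (let i = 0; i < max; i++) {
		if (oldLines[i] !== newLines[i]) {
			firstChange = i;
			break;
		}
	}
	if (firstChange === -1) return "none";
	// First line still visible in the viewport (0-based index into newLines).
	const firstViewportLine = Math.max(0, newLines.length - viewportHeight);
	return firstChange < firstViewportLine ? "full" : "surgical";
}
```

This is what fixes the "move 33 lines up when only 30 are possible" bug: instead of attempting an impossible cursor move into scrollback, the renderer detects unreachability up front and clears.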
@ -1,161 +0,0 @@

# Token Usage Tracking Analysis - pi-agent Codebase

## 1. Token Usage Event Structure and Flow

### Per-Request vs Cumulative Analysis

After reading `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts` in full, I can confirm that **token usage events are per-request, NOT cumulative**.

**Evidence:**
- Lines 296-308 in `callModelResponsesApi()`: Token usage is reported directly from the API response usage object
- Lines 435-447 in `callModelChatCompletionsApi()`: Token usage is reported directly from the API response usage object
- The token counts represent what was used for that specific LLM request only

### TokenUsageEvent Definition

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts:16-24`

```typescript
{
	type: "token_usage";
	inputTokens: number;
	outputTokens: number;
	totalTokens: number;
	cacheReadTokens: number;
	cacheWriteTokens: number;
	reasoningTokens: number;
}
```

## 2. Current Token Usage Display Implementation

### TUI Renderer
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/tui-renderer.ts`

**Current Behavior:**
- Lines 60-66: Stores "last" token values (not cumulative)
- Lines 251-259: Updates token counts on `token_usage` events
- Lines 280-311: Displays current request tokens in `updateTokenDisplay()`
- Format: `↑{input} ↓{output} ⚡{reasoning} ⟲{cache_read} ⟳{cache_write} ⚒ {tool_calls}`

**Comment on line 252:** "Store the latest token counts (not cumulative since prompt includes full context)"

### Console Renderer
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/console-renderer.ts`

**Current Behavior:**
- Lines 11-16: Stores "last" token values
- Lines 165-172: Updates token counts on `token_usage` events
- Lines 52-82: Displays tokens after each assistant message

## 3. Session Storage

### SessionManager
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/session-manager.ts`

**Current Implementation:**
- Lines 138-146: Has a `totalUsage` field in the `SessionData` interface
- Lines 158-160: **BUG**: Only stores the LAST token_usage event, not cumulative totals
- This should accumulate all token usage across the session
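The accumulation fix amounts to folding each per-request event into running totals instead of overwriting them. A sketch, assuming the event shape defined above (the real `SessionData` wiring in session-manager.ts may differ):

```typescript
interface TokenUsage {
	inputTokens: number;
	outputTokens: number;
	totalTokens: number;
	cacheReadTokens: number;
	cacheWriteTokens: number;
	reasoningTokens: number;
}

// Add one per-request token_usage event into the session totals.
// The session handler calls this per event instead of `totalUsage = event`.
function accumulate(total: TokenUsage, event: TokenUsage): TokenUsage {
	return {
		inputTokens: total.inputTokens + event.inputTokens,
		outputTokens: total.outputTokens + event.outputTokens,
		totalTokens: total.totalTokens + event.totalTokens,
		cacheReadTokens: total.cacheReadTokens + event.cacheReadTokens,
		cacheWriteTokens: total.cacheWriteTokens + event.cacheWriteTokens,
		reasoningTokens: total.reasoningTokens + event.reasoningTokens,
	};
}
```

Note that because each request's prompt includes the full conversation so far, summed input tokens grow quadratically with conversation length; that is expected for a "total spent" figure.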

## 4. Slash Command Infrastructure

### Existing Slash Command Support
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/tui/src/autocomplete.ts`

**Available Infrastructure:**
- `SlashCommand` interface with `name`, `description`, and an optional `getArgumentCompletions`
- `CombinedAutocompleteProvider` handles slash command detection and completion
- Text editor auto-triggers on "/" at the start of a line

### Current Usage in TUI Renderer
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/tui-renderer.ts:75-80`

```typescript
const autocompleteProvider = new CombinedAutocompleteProvider(
	[], // <-- Empty command array!
	process.cwd(),
);
```

**No slash commands are currently implemented in the agent TUI!**

### Example Implementation
**Reference:** `/Users/badlogic/workspaces/pi-mono/packages/tui/test/chat-app.ts:25-60`

Shows how to:
1. Define slash commands with `CombinedAutocompleteProvider`
2. Handle slash command execution in `editor.onSubmit`
3. Add responses to the chat container

## 5. Implementation Requirements for /tokens Command

### What Needs to Change

1. **Add Cumulative Token Tracking to TUI Renderer**
   - Add cumulative token counters alongside the current "last" counters
   - Update cumulative totals on each `token_usage` event

2. **Add /tokens Slash Command**
   - Add to `CombinedAutocompleteProvider` in tui-renderer.ts
   - Handle in the `editor.onSubmit` callback
   - Display a formatted token summary as a `TextComponent` in the chat container

3. **Fix SessionManager Bug**
   - Change the `totalUsage` calculation to accumulate all token_usage events
   - This will enable session-wide token tracking

4. **Message Handling in TUI**
   - Need to capture user input before it goes to the agent
   - Check if it's a slash command vs a regular message
   - Route accordingly

### Current User Input Flow
**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/main.ts:190-198`

```typescript
while (true) {
	const userInput = await renderer.getUserInput();
	try {
		await agent.ask(userInput); // All input goes to agent
	} catch (e: any) {
		await renderer.on({ type: "error", message: e.message });
	}
}
```

**Problem:** All user input goes directly to the agent - no interception for slash commands!

### Required Architecture Change

Need to modify the TUI interactive loop to:
1. Check if user input starts with "/"
2. If slash command: handle locally in the renderer
3. If regular message: pass to the agent as before
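That routing step can be sketched as a small dispatcher. The handler names here are illustrative, not the actual main.ts/tui-renderer API:

```typescript
// Route raw user input: "/name args..." is handled locally by the renderer,
// anything else goes to the agent as before.
async function handleUserInput(
	input: string,
	handlers: {
		runSlashCommand: (name: string, args: string) => Promise<void>;
		askAgent: (message: string) => Promise<void>;
	},
): Promise<"command" | "message"> {
	if (input.startsWith("/")) {
		const [name, ...rest] = input.slice(1).split(" ");
		await handlers.runSlashCommand(name, rest.join(" "));
		return "command";
	}
	await handlers.askAgent(input);
	return "message";
}
```

Dropping this into the interactive loop keeps `agent.ask()` untouched: only the loop decides where input goes.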

## 6. Token Display Format Recommendations

Based on existing format patterns, the `/tokens` command should display:

```
Session Token Usage:
↑ 1,234 input tokens
↓ 5,678 output tokens
⚡ 2,345 reasoning tokens
⟲ 890 cache read tokens
⟳ 123 cache write tokens
📊 12,270 total tokens
⚒ 5 tool calls
```

## Summary

The current implementation tracks per-request token usage only. To add cumulative token tracking with a `/tokens` command, we need to:

1. **Fix SessionManager** to properly accumulate token usage
2. **Add cumulative tracking** to the TUI renderer
3. **Implement slash command infrastructure** in the agent (currently missing)
4. **Modify user input handling** to intercept slash commands before they reach the agent
5. **Add a /tokens command** that displays formatted cumulative statistics

The TUI framework already supports slash commands, but the agent TUI renderer doesn't use them yet.
@ -1,28 +0,0 @@
|
|||
# Add Token Usage Tracking Command

**Status:** Done
**Agent PID:** 71159

## Original Todo

- agent: we get token_usage events. The last one we get tells us how many input/output/cache read/cache write/reasoning tokens were used for the last request to the LLM endpoint. We want to:
  - have a /tokens command that outputs the cumulative counts; it can just add a nicely formatted TextComponent to the chat messages container
  - this means the tui-renderer needs to keep track of cumulative stats as well, not just last-request stats.
  - please check agent.ts (read in full) to see whether token_usage is actually some form of cumulative thing, or a per-request-to-the-LLM thing. We want to understand what we get.

## Description

Add a `/tokens` slash command to the TUI that displays cumulative token usage statistics for the current session. This includes fixing the SessionManager to properly accumulate token usage and implementing slash command infrastructure in the agent's TUI renderer.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Fix SessionManager to accumulate token usage instead of storing only the last event (packages/agent/src/session-manager.ts:158-160)
- [x] Add cumulative token tracking properties to TUI renderer (packages/agent/src/renderers/tui-renderer.ts:60-66)
- [x] Add /tokens slash command to CombinedAutocompleteProvider (packages/agent/src/renderers/tui-renderer.ts:75-80)
- [x] Modify TUI renderer's onSubmit to handle slash commands locally (packages/agent/src/renderers/tui-renderer.ts:159-177)
- [x] Implement /tokens command handler that displays formatted cumulative statistics
- [x] Update token_usage event handler to accumulate totals (packages/agent/src/renderers/tui-renderer.ts:275-291)
- [x] Test: Verify /tokens command displays correct cumulative totals
- [x] Test: Send multiple messages and confirm accumulation works correctly
- [x] Fix file autocompletion that was broken by the slash command implementation
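Since each `token_usage` event is per-request, the accumulation step the plan describes amounts to summing events into session totals. A sketch (the `UsageAccumulator` class name is an assumption; the event fields mirror the token_usage event shape):

```typescript
// Per-request token_usage events summed into cumulative session totals.
interface TokenUsageEvent {
	type: "token_usage";
	inputTokens: number;
	outputTokens: number;
	totalTokens: number;
	cacheReadTokens: number;
	cacheWriteTokens: number;
}

class UsageAccumulator {
	total = { inputTokens: 0, outputTokens: 0, totalTokens: 0, cacheReadTokens: 0, cacheWriteTokens: 0 };

	on(event: TokenUsageEvent): void {
		// Each event covers one LLM request, so totals must be summed, not replaced.
		this.total.inputTokens += event.inputTokens;
		this.total.outputTokens += event.outputTokens;
		this.total.totalTokens += event.totalTokens;
		this.total.cacheReadTokens += event.cacheReadTokens;
		this.total.cacheWriteTokens += event.cacheWriteTokens;
	}
}
```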
## Notes

[Implementation notes]
@ -1,606 +0,0 @@
# Analysis: Creating Unified AI Package

## Package Structure Analysis for Pi Monorepo

Based on my examination of the existing packages (`tui`, `agent`, and `pods`), here are the comprehensive patterns and conventions used in this monorepo:

### 1. Package Naming Conventions

**Scoped NPM packages with consistent naming:**
- All packages use the `@mariozechner/` scope
- Package names follow the pattern: `@mariozechner/pi-<package-name>`
- Special case: the main CLI package is simply `@mariozechner/pi` (not `pi-pods`)

**Directory structure:**
- Packages are located in `/packages/<package-name>/`
- Directory names match the suffix of the npm package name (e.g., `tui`, `agent`, `pods`)

### 2. Package.json Structure Patterns

**Common fields across all packages:**
```json
{
	"name": "@mariozechner/pi-<name>",
	"version": "0.5.8", // Lockstep versioning - all packages share the same version
	"description": "...",
	"type": "module", // All packages use ES modules
	"author": "Mario Zechner",
	"license": "MIT",
	"repository": {
		"type": "git",
		"url": "git+https://github.com/badlogic/pi-mono.git",
		"directory": "packages/<name>"
	},
	"engines": {
		"node": ">=20.0.0" // Consistent Node.js requirement
	}
}
```

**Binary packages (agent, pods):**
- Include a `"bin"` field with the CLI command mapping
- Examples: `"pi-agent": "dist/cli.js"` and `"pi": "dist/cli.js"`

**Library packages (tui):**
- Include a `"main"` field pointing to the built entry point
- Include a `"types"` field for TypeScript definitions

### 3. Scripts Configuration

**Universal scripts across all packages:**
- `"clean": "rm -rf dist"` - Removes build artifacts
- `"build": "tsc -p tsconfig.build.json"` - Builds with a dedicated build config
- `"check": "biome check --write ."` - Linting and formatting
- `"prepublishOnly": "npm run clean && npm run build"` - Pre-publish cleanup

**CLI-specific build scripts:**
- Add `&& chmod +x dist/cli.js` for executable permissions
- Copy additional assets (e.g., `&& cp src/models.json dist/` for the pods package)

### 4. Dependencies Structure

**Dependency hierarchy follows a clear pattern:**
```
pi-tui (foundation) -> pi-agent (uses tui) -> pi (uses agent)
```

**Internal dependencies:**
- Use caret ranges pinned to the current version for internal packages (e.g., `"^0.5.8"`)
- Agent depends on TUI: `"@mariozechner/pi-tui": "^0.5.8"`
- Pods depends on Agent: `"@mariozechner/pi-agent": "^0.5.8"`

**External dependencies:**
- Common dependencies like `chalk` are used across multiple packages
- Specialized dependencies are package-specific (e.g., `marked` for tui, `openai` for agent)
### 5. TypeScript Configuration

**Dual TypeScript configuration approach:**

**`tsconfig.build.json` (for production builds):**
```json
{
	"extends": "../../tsconfig.base.json",
	"compilerOptions": {
		"outDir": "./dist",
		"rootDir": "./src"
	},
	"include": ["src/**/*"],
	"exclude": ["node_modules", "dist"]
}
```

**Root `tsconfig.json` (for development and type checking):**
- Contains path mappings for cross-package imports during development
- Includes all source and test files
- Uses `"noEmit": true` for type checking without building

### 6. Source Directory Structure

**Standard structure across all packages:**
```
src/
├── index.ts          # Main export file
├── cli.ts            # CLI entry point (if applicable)
├── <core-files>.ts   # Core functionality
├── components/       # Components (for tui)
├── tools/            # Tool implementations (for agent)
├── commands/         # Command implementations (for pods)
└── renderers/        # Output renderers (for agent)
```

### 7. Export Patterns (index.ts)

**Comprehensive type and function exports:**
- Export both types and implementation classes
- Use `export type` for type-only exports
- Group exports logically with comments
- Example from tui: exports components, interfaces, and utilities
- Example from agent: exports core classes, types, and utilities

### 8. Files Configuration

**Files included in NPM packages:**
- `"files": ["dist"]` or `"files": ["dist/**/*", "README.md"]`
- All packages include the built `dist/` directory
- Some include additional files like README.md or scripts

### 9. README.md Structure

**Comprehensive documentation pattern:**
- Feature overview with key capabilities
- Quick start section with code examples
- Detailed API documentation
- Installation instructions
- Development setup
- Testing information (especially for tui)
- Examples and usage patterns
### 10. Testing Structure (TUI package)

**Dedicated test directory:**
- `test/` directory with `.test.ts` files for unit tests
- Example applications (e.g., `chat-app.ts`, `file-browser.ts`)
- Custom testing infrastructure (e.g., `virtual-terminal.ts`)
- Test script: `"test": "node --test --import tsx test/*.test.ts"`

### 11. Version Management

**Lockstep versioning:**
- All packages share the same version number
- Root package.json scripts handle version bumping across all packages
- A version sync script ensures internal dependency versions match

### 12. Build Order

**Dependency-aware build order:**
- The root build script builds packages in dependency order:
- `"build": "npm run build -w @mariozechner/pi-tui && npm run build -w @mariozechner/pi-agent && npm run build -w @mariozechner/pi"`

### 13. Common Configuration Files

**Shared across the monorepo:**
- `biome.json` - Unified linting and formatting configuration
- `tsconfig.base.json` - Base TypeScript configuration
- `.gitignore` - Ignores `dist/`, `node_modules/`, and other build artifacts
- Husky pre-commit hooks for formatting and type checking

### 14. Keywords and Metadata

**Descriptive keywords for NPM discovery:**
- Each package includes relevant keywords (e.g., "tui", "terminal", "agent", "ai", "llm")
- Keywords help with package discoverability

This analysis shows a well-structured monorepo with consistent patterns that would make adding new packages straightforward by following these established conventions.
## Monorepo Configuration Analysis

Based on my analysis of the pi-mono monorepo configuration, here's a comprehensive guide on how to properly integrate a new package:

### 1. Root Package.json Configuration

**Workspace Configuration:**
- Uses npm workspaces with `"workspaces": ["packages/*"]`
- All packages are located under the `/packages/` directory
- Private monorepo (`"private": true`) with ESM modules (`"type": "module"`)

**Build System:**
- **Sequential Build Order**: The build script explicitly defines dependency order:
  ```json
  "build": "npm run build -w @mariozechner/pi-tui && npm run build -w @mariozechner/pi-agent && npm run build -w @mariozechner/pi"
  ```
- **Dependency Chain**: `pi-tui` → `pi-agent` → `pi` (pods)
- **Important**: New packages must be inserted in the correct dependency order in the build script

**Scripts Available:**
- `clean`: Cleans all package dist folders
- `build`: Sequential build respecting dependencies
- `check`: Runs Biome formatting, package checks, and TypeScript checking
- `test`: Runs tests across all packages
- Version management scripts (lockstep versioning)
- Publishing scripts with dry-run capability

### 2. Root TypeScript Configuration

**Dual Configuration System:**
- **`tsconfig.base.json`**: Base TypeScript settings for all packages
- **`tsconfig.json`**: Development configuration with path mappings for cross-package imports
- **Package `tsconfig.build.json`**: Clean build configs per package

**Path Mappings** (in `/Users/badlogic/workspaces/pi-mono/tsconfig.json`):
```json
"paths": {
	"@mariozechner/pi-tui": ["./packages/tui/src/index.ts"],
	"@mariozechner/pi-agent": ["./packages/agent/src/index.ts"],
	"@mariozechner/pi": ["./packages/pods/src/index.ts"]
}
```

### 3. Package Dependencies and Structure

**Dependency Structure:**
- `pi-tui` (base library) - no internal dependencies
- `pi-agent` depends on `pi-tui`
- `pi` (pods) depends on `pi-agent`

**Standard Package Structure:**
```
packages/new-package/
├── src/
│   ├── index.ts          # Main export file
│   └── ...               # Implementation files
├── package.json          # Package configuration
├── tsconfig.build.json   # Build-specific TypeScript config
├── README.md             # Package documentation
└── dist/                 # Build output (gitignored)
```

### 4. Version Management

**Lockstep Versioning:**
- All packages share the same version number (currently 0.5.8)
- Automated version sync script: `/Users/badlogic/workspaces/pi-mono/scripts/sync-versions.js`
- Inter-package dependencies are automatically updated to match the current version

**Version Scripts:**
- `npm run version:patch/minor/major` - Updates all package versions and syncs dependencies
- Automatic dependency version synchronization
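Conceptually, the dependency-sync step amounts to rewriting internal `@mariozechner/*` ranges to the shared lockstep version. The sketch below illustrates that idea; it is an assumption about what `sync-versions.js` does, not its actual code:

```typescript
// Illustration: rewrite internal dependency ranges to the lockstep version.
interface PackageJson {
	version: string;
	dependencies?: Record<string, string>;
}

function syncInternalDeps(pkg: PackageJson, lockstepVersion: string): PackageJson {
	const deps = { ...(pkg.dependencies ?? {}) };
	for (const name of Object.keys(deps)) {
		// Only internal packages are pinned; external deps are left untouched.
		if (name.startsWith("@mariozechner/")) {
			deps[name] = `^${lockstepVersion}`;
		}
	}
	return { ...pkg, version: lockstepVersion, dependencies: deps };
}
```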
### 5. GitIgnore Patterns

**Package-Level Ignores:**
```
packages/*/node_modules/
packages/*/dist/
```
Plus standard ignores for logs, IDE files, environment files, etc.
## How to Integrate a New Package

### Step 1: Create Package Structure
```bash
mkdir packages/your-new-package
cd packages/your-new-package
```

### Step 2: Create package.json
```json
{
	"name": "@mariozechner/your-new-package",
	"version": "0.5.8",
	"description": "Your package description",
	"type": "module",
	"main": "./dist/index.js",
	"types": "./dist/index.d.ts",
	"files": ["dist"],
	"scripts": {
		"clean": "rm -rf dist",
		"build": "tsc -p tsconfig.build.json",
		"check": "biome check --write .",
		"prepublishOnly": "npm run clean && npm run build"
	},
	"dependencies": {
		// Add dependencies on other packages in the monorepo if needed
		// "@mariozechner/pi-tui": "^0.5.8"
	},
	"devDependencies": {},
	"keywords": ["relevant", "keywords"],
	"author": "Mario Zechner",
	"license": "MIT",
	"repository": {
		"type": "git",
		"url": "git+https://github.com/badlogic/pi-mono.git",
		"directory": "packages/your-new-package"
	},
	"engines": {
		"node": ">=20.0.0"
	}
}
```

### Step 3: Create tsconfig.build.json
```json
{
	"extends": "../../tsconfig.base.json",
	"compilerOptions": {
		"outDir": "./dist",
		"rootDir": "./src"
	},
	"include": ["src/**/*"],
	"exclude": ["node_modules", "dist"]
}
```

### Step 4: Create src/index.ts
```typescript
// Main exports for your package
export * from './your-main-module.js';
```

### Step 5: Update Root Configuration

**Add to `/Users/badlogic/workspaces/pi-mono/tsconfig.json` paths:**
```json
"paths": {
	"@mariozechner/pi-tui": ["./packages/tui/src/index.ts"],
	"@mariozechner/pi-agent": ["./packages/agent/src/index.ts"],
	"@mariozechner/pi": ["./packages/pods/src/index.ts"],
	"@mariozechner/your-new-package": ["./packages/your-new-package/src/index.ts"]
}
```

**Update the build script in the root `/Users/badlogic/workspaces/pi-mono/package.json`:**
```json
"build": "npm run build -w @mariozechner/pi-tui && npm run build -w @mariozechner/pi-agent && npm run build -w @mariozechner/your-new-package && npm run build -w @mariozechner/pi"
```
(Insert in the correct dependency order.)
### Step 6: Update sync-versions.js
If your package depends on other monorepo packages, add synchronization logic to `/Users/badlogic/workspaces/pi-mono/scripts/sync-versions.js`.

### Step 7: Install and Test
```bash
# From the monorepo root
npm install
npm run build
npm run check
```

## Key Requirements for New Packages

1. **Must use ESM modules** (`"type": "module"`)
2. **Must follow lockstep versioning** (same version as the other packages)
3. **Must be placed in the correct build order** based on dependencies
4. **Must use tab indentation** (Biome config: `"indentStyle": "tab"`)
5. **Must avoid `any` types** unless absolutely necessary (project instruction)
6. **Must include proper TypeScript declarations** (`"declaration": true`)
7. **Must use Node.js >= 20.0.0** (engine requirement)
8. **Must follow the standard package structure** with src/, dist/, and proper exports

## Development Workflow

1. **Development**: Use `tsx` to run source files directly (no build needed)
2. **Type Checking**: `npm run check` works across all packages
3. **Building**: Sequential builds respect dependency order
4. **Publishing**: Automatic version sync and cross-package dependency updates
5. **Testing**: Each package can have its own test suite

This monorepo is well-structured for maintaining multiple related packages with clean dependency management and automated version synchronization.
## Detailed Findings: Unified AI API Requirements Based on Current pi-agent Usage

After thoroughly analyzing the existing agent package (`/Users/badlogic/workspaces/pi-mono/packages/agent`), here are the comprehensive requirements for a unified AI API based on current usage patterns:

### **1. Core API Structure & Event System**

**Current Pattern:**
- Event-driven architecture using `AgentEvent` types
- A single `AgentEventReceiver` interface for all output handling
- Support for both single-shot and interactive modes

**Required API Features:**
```typescript
type AgentEvent =
	| { type: "session_start"; sessionId: string; model: string; api: string; baseURL: string; systemPrompt: string }
	| { type: "assistant_start" }
	| { type: "reasoning"; text: string }
	| { type: "tool_call"; toolCallId: string; name: string; args: string }
	| { type: "tool_result"; toolCallId: string; result: string; isError: boolean }
	| { type: "assistant_message"; text: string }
	| { type: "error"; message: string }
	| { type: "user_message"; text: string }
	| { type: "interrupted" }
	| { type: "token_usage"; inputTokens: number; outputTokens: number; totalTokens: number; cacheReadTokens: number; cacheWriteTokens: number; reasoningTokens: number };
```

### **2. OpenAI API Integration Patterns**

**Current Implementation:**
- Uses OpenAI SDK v5.12.2 (`import OpenAI from "openai"`)
- Supports both Chat Completions (`/v1/chat/completions`) and the Responses API (`/v1/responses`)
- Provider detection based on base URL patterns

**Provider Support Required:**
```typescript
// Detected providers based on baseURL patterns
type Provider = "openai" | "gemini" | "groq" | "anthropic" | "openrouter" | "other";

// Provider-specific configurations
interface ProviderConfig {
	openai: { reasoning_effort: "minimal" | "low" | "medium" | "high" };
	gemini: { extra_body: { google: { thinking_config: { thinking_budget: number; include_thoughts: boolean } } } };
	groq: { reasoning_format: "parsed"; reasoning_effort: string };
	openrouter: { reasoning: { effort: "low" | "medium" | "high" } };
}
```
### **3. Streaming vs Non-Streaming**

**Current Status:**
- **No streaming currently implemented** - uses standard request/response
- All API calls are non-streaming: `await client.chat.completions.create()` and `await client.responses.create()`
- Events are emitted synchronously after the full response

**Streaming Requirements for Unified API:**
- Support for streaming responses with partial content updates
- Event-driven streaming with `assistant_message_delta` events
- Proper handling of tool call streaming
- Reasoning token streaming for supported models
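How a receiver could consume such delta events is sketched below. The `assistant_message_delta` event and the `StreamingReceiver` class are hypothetical names for the proposed design, not existing pi-agent API:

```typescript
// Hypothetical sketch: deltas accumulated into the full assistant message.
type DeltaEvent = { type: "assistant_message_delta"; delta: string };
type DoneEvent = { type: "assistant_message"; text: string };

class StreamingReceiver {
	private buffer = "";

	on(event: DeltaEvent | DoneEvent): void {
		if (event.type === "assistant_message_delta") {
			this.buffer += event.delta; // render partial text incrementally
		} else {
			this.buffer = event.text; // the final event carries the complete text
		}
	}

	get text(): string {
		return this.buffer;
	}
}
```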
### **4. Tool Calling Architecture**

**Current Implementation:**
```typescript
// Tool definitions for both APIs
toolsForResponses: Array<{ type: "function"; name: string; description: string; parameters: object }>;
toolsForChat: ChatCompletionTool[];

// Tool execution with abort support
async function executeTool(name: string, args: string, signal?: AbortSignal): Promise<string>;

// Built-in tools: read, list, bash, glob, rg (ripgrep)
```

**Unified API Requirements:**
- Automatic tool format conversion between the Chat Completions and Responses APIs
- Built-in tools with filesystem and shell access
- Custom tool registration capability
- Tool execution with proper abort/interrupt handling
- Tool result streaming for long-running operations
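The tool format conversion mentioned above is mostly a matter of flattening the nested `function` object used by Chat Completions into the flat shape the Responses API expects (as shown in the definitions earlier). A sketch, with the helper name being an assumption:

```typescript
// Chat Completions nests tool metadata under `function`; the Responses API
// uses a flat shape. The converter just lifts the nested fields up.
interface ChatTool {
	type: "function";
	function: { name: string; description: string; parameters: object };
}

interface ResponsesTool {
	type: "function";
	name: string;
	description: string;
	parameters: object;
}

function chatToolToResponsesTool(tool: ChatTool): ResponsesTool {
	return {
		type: "function",
		name: tool.function.name,
		description: tool.function.description,
		parameters: tool.function.parameters,
	};
}
```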
### **5. Message Structure Handling**

**Current Pattern:**
- Dual message format support based on API type
- Automatic conversion between formats in the `setEvents()` method

**Chat Completions Format:**
```typescript
{ role: "system" | "user" | "assistant" | "tool", content: string, tool_calls?: any[] }
```

**Responses API Format:**
```typescript
{ type: "message" | "function_call" | "function_call_output", content: any[] }
```

### **6. Session Persistence System**

**Current Implementation:**
```typescript
interface SessionData {
	config: AgentConfig;
	events: SessionEvent[];
	totalUsage: TokenUsage;
}

// File-based persistence in ~/.pi/sessions/
// JSONL format with session headers and event entries
// Automatic session continuation support
```

**Requirements:**
- Directory-based session organization
- Event replay capability for session restoration
- Cumulative token usage tracking
- Session metadata (config, timestamps, working directory)
### **7. Token Counting & Usage Tracking**

**Current Implementation:**
```typescript
interface TokenUsage {
	inputTokens: number;
	outputTokens: number;
	totalTokens: number;
	cacheReadTokens: number;
	cacheWriteTokens: number;
	reasoningTokens: number; // For o1/o3 and reasoning models
}
```

**Provider-Specific Token Mapping:**
- OpenAI: `prompt_tokens`, `completion_tokens`, `cached_tokens`, `reasoning_tokens`
- Responses API: `input_tokens`, `output_tokens`, `cached_tokens`, `reasoning_tokens`
- Cumulative tracking across conversations
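The Chat Completions mapping could be sketched as follows; the field names follow OpenAI's usage payload (as used elsewhere in the agent code), while the helper name is an assumption for illustration:

```typescript
// Map a Chat Completions usage payload onto the unified TokenUsage shape.
interface TokenUsage {
	inputTokens: number;
	outputTokens: number;
	totalTokens: number;
	cacheReadTokens: number;
	cacheWriteTokens: number;
	reasoningTokens: number;
}

function mapChatCompletionsUsage(usage: {
	prompt_tokens?: number;
	completion_tokens?: number;
	total_tokens?: number;
	prompt_tokens_details?: { cached_tokens?: number };
	completion_tokens_details?: { reasoning_tokens?: number };
}): TokenUsage {
	return {
		inputTokens: usage.prompt_tokens ?? 0,
		outputTokens: usage.completion_tokens ?? 0,
		totalTokens: usage.total_tokens ?? 0,
		cacheReadTokens: usage.prompt_tokens_details?.cached_tokens ?? 0,
		cacheWriteTokens: 0, // not reported by this API
		reasoningTokens: usage.completion_tokens_details?.reasoning_tokens ?? 0,
	};
}
```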
### **8. Abort/Interrupt Handling**

**Current Pattern:**
```typescript
class Agent {
	private abortController: AbortController | null = null;

	async ask(message: string) {
		this.abortController = new AbortController();
		// Pass the signal to all API calls and tool executions
	}

	interrupt(): void {
		this.abortController?.abort();
	}
}
```

**Requirements:**
- AbortController integration for all async operations
- Graceful interruption of API calls, tool execution, and streaming
- Proper cleanup and "interrupted" event emission
- Signal propagation to nested operations

### **9. Reasoning/Thinking Support**

**Current Implementation:**
```typescript
// Provider-specific reasoning extraction
function parseReasoningFromMessage(message: any, baseURL?: string): {
	cleanContent: string;
	reasoningTexts: string[];
};

// Automatic reasoning support detection
async function checkReasoningSupport(client, model, api, baseURL, signal): Promise<boolean>;
```

**Provider Support:**
- **OpenAI o1/o3**: Full thinking content via the Responses API
- **Groq GPT-OSS**: Reasoning via `reasoning_format: "parsed"`
- **Gemini 2.5**: Thinking content via `<thought>` tags
- **OpenRouter**: Model-dependent reasoning support
### **10. Error Handling Patterns**

**Current Approach:**
- Try/catch blocks around all API calls
- Error events emitted through the event system
- Specific error handling for reasoning model failures
- Provider-specific error interpretation

### **11. Configuration Management**

**Current Structure:**
```typescript
interface AgentConfig {
	apiKey: string;
	baseURL: string;
	model: string;
	api: "completions" | "responses";
	systemPrompt: string;
}
```

**Provider Detection:**
```typescript
function detectProvider(baseURL?: string): Provider {
	// URL pattern matching for automatic provider configuration
}
```
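A minimal sketch of what that URL pattern matching could look like. The specific hostname substrings and the default for a missing base URL are assumptions; the actual implementation may match different patterns:

```typescript
type Provider = "openai" | "gemini" | "groq" | "anthropic" | "openrouter" | "other";

// Assumed hostname patterns per provider; a missing baseURL defaults to OpenAI.
function detectProvider(baseURL?: string): Provider {
	if (!baseURL || baseURL.includes("api.openai.com")) return "openai";
	if (baseURL.includes("generativelanguage.googleapis.com")) return "gemini";
	if (baseURL.includes("api.groq.com")) return "groq";
	if (baseURL.includes("api.anthropic.com")) return "anthropic";
	if (baseURL.includes("openrouter.ai")) return "openrouter";
	return "other";
}
```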
### **12. Output Rendering System**

**Current Renderers:**
- **ConsoleRenderer**: Terminal output with animations and token display
- **TuiRenderer**: Full interactive TUI with pi-tui integration
- **JsonRenderer**: JSONL event stream output

**Requirements:**
- Event-based rendering architecture
- Real-time token usage display
- Loading animations for async operations
- Markdown rendering support
- Tool execution progress indication

### **Summary: Key Unified API Requirements**

1. **Event-driven architecture** with standardized event types
2. **Dual API support** (Chat Completions + Responses API) with automatic format conversion
3. **Provider abstraction** with automatic detection and configuration
4. **Comprehensive tool system** with abort support and built-in tools
5. **Session persistence** with event replay and token tracking
6. **Reasoning/thinking support** across multiple providers
7. **Interrupt handling** with AbortController integration
8. **Token usage tracking** with provider-specific mapping
9. **Flexible rendering** through the event receiver pattern
10. **Configuration management** with provider-specific settings

The unified API should maintain this event-driven, provider-agnostic approach while adding streaming capabilities and enhanced tool execution features that the current implementation lacks.
@ -1,46 +0,0 @@
# Create AI Package with Unified API

**Status:** Done
**Agent PID:** 10965

## Original Todo

ai: create a new package ai (package name @mariozechner/ai) which implements a common API for the OpenAI, Anthropic, and Google Gemini APIs
- look at the other packages and how they are set up; mirror that setup for ai
- install the latest version of each dependency via npm in the ai package
  - openai@5.12.2
  - @anthropic-ai/sdk@0.60.0
  - @google/genai@1.14.0
- investigate the APIs in their respective node_modules folders so you understand how to use them. Specifically, we need to understand how to:
  - stream responses, including reasoning/thinking tokens and tool calls
  - abort requests
  - handle errors
  - handle stop reasons
  - maintain the context (message history) such that it can be serialized in a uniform format to disk, then deserialized again later and used with another API
  - count tokens (input, output, cached read, cached write)
  - enable caching
- Create a plan.md in the ai package that details how the unified API on top of all three could look. We want the most minimal API possible, which allows serialization/deserialization, turning reasoning/thinking on/off, and handling the system prompt and tool specifications

## Description

Create the initial package scaffold for @mariozechner/ai following the established monorepo patterns, install the required dependencies (openai, anthropic, google genai SDKs), and create a plan.md file that details the unified API design for all three providers.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Create package directory structure at packages/ai/
- [x] Create package.json with proper configuration following monorepo patterns
- [x] Create tsconfig.build.json for build configuration
- [x] Create initial src/index.ts file
- [x] Add package to root tsconfig.json path mappings
- [x] Update root package.json build script to include the ai package
- [x] Install dependencies: openai@5.12.2, @anthropic-ai/sdk@0.60.0, @google/genai@1.14.0
- [x] Create README.md with package description
- [x] Create plan.md detailing the unified API design
- [x] Investigate the OpenAI, Anthropic, and Gemini APIs in detail
- [x] Document implementation details for each API
- [x] Update todos/project-description.md with a "How to Create a New Package" section
- [x] Update todos/project-description.md Testing section to reflect that tui has Node.js built-in tests
- [x] Run npm install from root to link everything
- [x] Verify the package builds correctly with npm run build

## Notes

[Implementation notes]
@ -1,402 +0,0 @@
# AI Package Implementation Analysis

## Overview
Based on the comprehensive plan in `packages/ai/plan.md` and detailed API documentation for the OpenAI, Anthropic, and Gemini SDKs, the AI package needs to provide a unified API that abstracts over these three providers while maintaining their unique capabilities.

## OpenAI Responses API Investigation

### API Structure
The OpenAI SDK includes a separate Responses API (`client.responses`) alongside the Chat Completions API. This API is designed for models with reasoning capabilities (o1/o3) and provides access to thinking/reasoning content.

### Key Differences from Chat Completions API

1. **Input Format**: Uses an `input` array instead of `messages`
   - Supports the `EasyInputMessage` type with roles: `user`, `assistant`, `system`, `developer`
   - Content can be text, image, audio, or file references
   - More structured approach with explicit types for each input type

2. **Streaming Events**: Rich set of events for detailed streaming
   - `ResponseReasoningTextDeltaEvent` - Incremental reasoning/thinking text
   - `ResponseReasoningTextDoneEvent` - Complete reasoning text
   - `ResponseTextDeltaEvent` - Main response text deltas
   - `ResponseFunctionCallArgumentsDeltaEvent` - Tool call argument streaming
   - `ResponseCompletedEvent` - Final completion with usage stats

3. **Response Structure**: More complex response object
   - An `output` array containing various output items
   - Explicit reasoning items with content
   - Tool calls as part of the output items
   - Usage tracking with detailed token breakdowns

### Implementation Examples

#### Basic Responses API Usage

```typescript
// Creating a response with streaming
const stream = await client.responses.create({
	model: "o1-preview",
	input: [
		{
			role: "developer", // or "system" for non-reasoning models
			content: "You are a helpful assistant"
		},
		{
			role: "user",
			content: "Explain quantum computing step by step"
		}
	],
	stream: true,
	temperature: 0.7,
	max_completion_tokens: 2000
});

// Process streaming events
for await (const event of stream) {
	switch (event.type) {
		case 'response.reasoning_text.delta':
			// Thinking/reasoning content
			console.log('[THINKING]', event.delta);
			break;

		case 'response.text.delta':
			// Main response text
			console.log('[RESPONSE]', event.delta);
			break;

		case 'response.function_call_arguments.delta':
			// Tool call arguments being built
			console.log('[TOOL ARGS]', event.delta);
			break;

		case 'response.completed':
			// Final response with usage
			console.log('Usage:', event.usage);
			break;
	}
}
```
#### Using ResponseStream Helper
|
||||
|
||||
```typescript
|
||||
// The SDK provides a ResponseStream helper for easier streaming
|
||||
const responseStream = client.responses.stream({
|
||||
model: "o1-preview",
|
||||
input: [
|
||||
{ role: "user", content: "Solve this math problem..." }
|
||||
],
|
||||
tools: [
|
||||
{
|
||||
type: "function",
|
||||
function: {
|
||||
name: "calculate",
|
||||
description: "Perform calculations",
|
||||
parameters: { /* JSON Schema */ }
|
||||
}
|
||||
}
|
||||
]
|
||||
});
|
||||
|
||||
// Get final response after streaming
|
||||
const finalResponse = await responseStream.finalResponse();
|
||||
console.log('Output:', finalResponse.output);
|
||||
console.log('Usage:', finalResponse.usage);
|
||||
```
#### Converting Messages for Responses API

```typescript
private convertToResponsesInput(messages: Message[], systemPrompt?: string): ResponseInputItem[] {
  const input: ResponseInputItem[] = [];

  // Add system/developer prompt
  if (systemPrompt) {
    input.push({
      type: "message",
      role: this.isReasoningModel() ? "developer" : "system",
      content: systemPrompt
    });
  }

  // Convert messages
  for (const msg of messages) {
    if (msg.role === "user") {
      input.push({
        type: "message",
        role: "user",
        content: msg.content
      });
    } else if (msg.role === "assistant") {
      // Assistant messages with potential tool calls
      const outputMessage: ResponseOutputMessage = {
        type: "message",
        role: "assistant",
        content: []
      };

      if (msg.content) {
        outputMessage.content.push({
          type: "text",
          text: msg.content
        });
      }

      if (msg.toolCalls) {
        // Tool calls need to be added as separate output items
        for (const toolCall of msg.toolCalls) {
          input.push({
            type: "function_call",
            id: toolCall.id,
            name: toolCall.name,
            arguments: JSON.stringify(toolCall.arguments)
          });
        }
      }

      input.push(outputMessage);
    } else if (msg.role === "toolResult") {
      // Tool results as function call outputs
      input.push({
        type: "function_call_output",
        call_id: msg.toolCallId,
        output: msg.content
      });
    }
  }

  return input;
}
```

#### Processing Responses API Events

```typescript
private async completeWithResponsesAPI(request: Request, options?: OpenAIOptions): Promise<AssistantMessage> {
  try {
    const input = this.convertToResponsesInput(request.messages, request.systemPrompt);

    const stream = await this.client.responses.create({
      model: this.model,
      input,
      stream: true,
      max_completion_tokens: request.maxTokens,
      temperature: request.temperature,
      tools: request.tools ? this.convertTools(request.tools) : undefined,
      tool_choice: options?.toolChoice
    });

    let content = "";
    let thinking = "";
    const toolCalls: ToolCall[] = [];
    let usage: TokenUsage = { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 };
    let finishReason: string = "stop";

    for await (const event of stream) {
      switch (event.type) {
        case 'response.reasoning_text.delta':
          thinking += event.delta;
          request.onThinking?.(event.delta);
          break;

        case 'response.reasoning_text.done':
          // Complete reasoning text available
          thinking = event.text;
          break;

        case 'response.text.delta':
          content += event.delta;
          request.onText?.(event.delta);
          break;

        case 'response.function_call_arguments.delta':
          // Build up tool calls incrementally
          // event.item_id identifies which tool call
          // event.arguments contains the delta
          break;

        case 'response.function_call_arguments.done':
          // Complete tool call
          toolCalls.push({
            id: event.item_id,
            name: event.name,
            arguments: JSON.parse(event.arguments)
          });
          break;

        case 'response.completed':
          // Final event with complete response and usage
          usage = {
            input: event.usage.input_tokens,
            output: event.usage.output_tokens,
            cacheRead: event.usage.input_tokens_details?.cached_tokens || 0,
            cacheWrite: 0
          };
          finishReason = event.stop_reason || "stop";
          break;

        case 'response.error':
          throw new Error(event.error.message);
      }
    }

    return {
      role: "assistant",
      content: content || undefined,
      thinking: thinking || undefined,
      toolCalls: toolCalls.length > 0 ? toolCalls : undefined,
      model: this.model,
      usage,
      stopReason: this.mapStopReason(finishReason)
    };
  } catch (error) {
    // Error handling...
  }
}
```

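The method above delegates to `mapStopReason`, which is not shown. A plausible sketch — the unified reason names and the set of provider values are assumptions, not the actual implementation:

```typescript
// Maps provider-specific finish reasons onto a small unified set.
// Both the StopReason union and the case values are illustrative assumptions.
type StopReason = "stop" | "max_tokens" | "tool_use" | "error";

function mapStopReason(finishReason: string): StopReason {
  switch (finishReason) {
    case "stop":
    case "completed":
      return "stop";
    case "length":
    case "max_output_tokens":
      return "max_tokens";
    case "tool_calls":
    case "function_call":
      return "tool_use";
    default:
      return "error";
  }
}
```

Normalizing here keeps downstream renderers free of per-provider branching.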
### Important Notes

1. **"[Thinking: X tokens]" Issue**: The current implementation shows a placeholder for thinking tokens in Chat Completions API. This should only show actual thinking content from Responses API or omit the field entirely.

2. **Tool Calling Differences**: Responses API handles tool calls differently, with separate events for arguments delta and completion.

3. **Usage Tracking**: Responses API provides more detailed usage information including reasoning tokens in a different structure.

4. **Stream vs Iterator**: The Responses API returns an async iterable that can be used with `for await...of` directly.

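Note 3's structural differences can be absorbed with a small normalizer. The raw usage fields below mirror the ones read in the streaming code above; the unified `TokenUsage` shape is an assumption:

```typescript
// Normalizes a Responses API usage object (fields as read in the code above)
// into a unified TokenUsage shape. cacheWrite is not reported by the API.
interface ResponsesUsage {
  input_tokens: number;
  output_tokens: number;
  input_tokens_details?: { cached_tokens?: number };
}

interface TokenUsage { input: number; output: number; cacheRead: number; cacheWrite: number; }

function normalizeUsage(usage: ResponsesUsage): TokenUsage {
  return {
    input: usage.input_tokens || 0,
    output: usage.output_tokens || 0,
    cacheRead: usage.input_tokens_details?.cached_tokens || 0,
    cacheWrite: 0, // not available in the API
  };
}
```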
## Existing Codebase Context

### Current Structure

- Monorepo using npm workspaces with packages in `packages/` directory
- Existing packages: `tui`, `agent`, `pods`
- TypeScript/ESM modules with Node.js ≥20.0.0
- Biome for linting and formatting
- Lockstep versioning at 0.5.8

### Package Location

The AI package should be created at `packages/ai/` following the existing pattern.

## Key Implementation Requirements

### Core Features

1. **Unified Client API** - Single interface for all providers
2. **Streaming First** - All providers support streaming; non-streaming is collected events
3. **Provider Adapters** - OpenAI, Anthropic, Gemini adapters
4. **Event Normalization** - Consistent event types across providers
5. **Tool/Function Calling** - Unified interface for tools across providers
6. **Thinking/Reasoning** - Support for reasoning models (o1/o3, Claude thinking, Gemini thinking)
7. **Token Tracking** - Usage and cost calculation
8. **Abort Support** - Request cancellation via AbortController
9. **Error Mapping** - Normalized error handling
10. **Caching** - Automatic caching strategies per provider

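Feature 2 ("non-streaming is collected events") means a blocking call is just a fold over the stream. A minimal sketch — the unified event names here are assumptions for illustration:

```typescript
// Folds a unified event stream into a final result. The AIEvent union is an
// assumed shape illustrating the "non-streaming is collected events" design.
type AIEvent =
  | { type: "text_delta"; delta: string }
  | { type: "thinking_delta"; delta: string }
  | { type: "done" };

interface Collected { text: string; thinking: string; }

async function collect(stream: AsyncIterable<AIEvent>): Promise<Collected> {
  const out: Collected = { text: "", thinking: "" };
  for await (const ev of stream) {
    if (ev.type === "text_delta") out.text += ev.delta;
    else if (ev.type === "thinking_delta") out.thinking += ev.delta;
  }
  return out;
}
```

With this, only the streaming path has to be implemented per provider.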
### Provider-Specific Handling

#### OpenAI

- Dual APIs: Chat Completions vs Responses API
- Responses API for o1/o3 reasoning content
- Developer role for o1/o3 system prompts
- Stream options for token usage

#### Anthropic

- Content blocks always arrays
- Separate system parameter
- Tool results as user messages
- Explicit thinking budget allocation
- Cache control per block

#### Gemini

- Parts-based content system
- Separate systemInstruction parameter
- Model role instead of assistant
- Thinking via part.thought flag
- Function calls in parts array

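The Gemini differences amount to a small mapping when converting unified messages (ignoring tool calls and thinking for brevity). A sketch — the unified `Message` shape is an assumption:

```typescript
// Converts unified chat messages into Gemini-style contents: "assistant"
// becomes the "model" role, and string content becomes a parts array.
// The Message shape is an illustrative assumption, and tool-call and
// thinking parts are deliberately omitted here.
interface Message { role: "user" | "assistant"; content: string; }
interface GeminiContent { role: "user" | "model"; parts: { text: string }[]; }

function toGeminiContents(messages: Message[]): GeminiContent[] {
  return messages.map((m) => ({
    role: m.role === "assistant" ? "model" : "user",
    parts: [{ text: m.content }],
  }));
}
```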
## Implementation Structure

```
packages/ai/
├── src/
│   ├── index.ts           # Main exports
│   ├── types.ts           # Unified type definitions
│   ├── client.ts          # Main AI client class
│   ├── adapters/
│   │   ├── base.ts        # Base adapter interface
│   │   ├── openai.ts      # OpenAI adapter
│   │   ├── anthropic.ts   # Anthropic adapter
│   │   └── gemini.ts      # Gemini adapter
│   ├── models/
│   │   ├── models.ts      # Model info lookup
│   │   └── models-data.ts # Generated models database
│   ├── errors.ts          # Error mapping
│   ├── events.ts          # Event stream handling
│   ├── costs.ts           # Cost tracking
│   └── utils.ts           # Utility functions
├── test/
│   ├── openai.test.ts
│   ├── anthropic.test.ts
│   └── gemini.test.ts
├── scripts/
│   └── update-models.ts   # Update models database
├── package.json
├── tsconfig.build.json
└── README.md
```

## Dependencies

- `openai`: ^5.12.2 (for OpenAI SDK)
- `@anthropic-ai/sdk`: Latest
- `@google/genai`: Latest

## Files to Create/Modify

### New Files in packages/ai/

1. `package.json` - Package configuration
2. `tsconfig.build.json` - TypeScript build config
3. `src/index.ts` - Main exports
4. `src/types.ts` - Type definitions
5. `src/client.ts` - Main AI class
6. `src/adapters/base.ts` - Base adapter
7. `src/adapters/openai.ts` - OpenAI implementation
8. `src/adapters/anthropic.ts` - Anthropic implementation
9. `src/adapters/gemini.ts` - Gemini implementation
10. `src/models/models.ts` - Model info
11. `src/errors.ts` - Error handling
12. `src/events.ts` - Event streaming
13. `src/costs.ts` - Cost tracking
14. `README.md` - Package documentation

### Files to Modify

1. Root `tsconfig.json` - Add path mapping for @mariozechner/pi-ai
2. Root `package.json` - Add to build script order

## Implementation Strategy

### Phase 1: Core Structure

- Create package structure and configuration
- Define unified types and interfaces
- Implement base adapter interface

### Phase 2: Provider Adapters

- Implement OpenAI adapter (both APIs)
- Implement Anthropic adapter
- Implement Gemini adapter

### Phase 3: Features

- Add streaming support
- Implement tool calling
- Add thinking/reasoning support
- Implement token tracking

### Phase 4: Polish

- Error mapping and handling
- Cost calculation
- Model information database
- Documentation and examples

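Phase 4's cost calculation is straightforward once per-million-token prices are known. A sketch — the pricing shape is an assumption, the rates below are made-up illustration values, and it assumes cached tokens are included in the reported input count:

```typescript
// Computes request cost in USD from token usage and per-million-token prices.
// Pricing shape is assumed; rates in the example are illustration-only.
interface Pricing { inputPerMillion: number; outputPerMillion: number; cacheReadPerMillion: number; }
interface Usage { input: number; output: number; cacheRead: number; }

function costUSD(usage: Usage, pricing: Pricing): number {
  // Cached input tokens are billed at the cache-read rate instead of the
  // input rate (assuming the provider counts them inside input).
  const billedInput = usage.input - usage.cacheRead;
  return (
    (billedInput * pricing.inputPerMillion +
      usage.cacheRead * pricing.cacheReadPerMillion +
      usage.output * pricing.outputPerMillion) / 1_000_000
  );
}

// Example with illustration-only prices: $3/M input, $15/M output, $0.3/M cache read
const cost = costUSD(
  { input: 1_000_000, output: 100_000, cacheRead: 500_000 },
  { inputPerMillion: 3, outputPerMillion: 15, cacheReadPerMillion: 0.3 }
);
```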
## Testing Approach

- Unit tests for each adapter
- Integration tests with mock responses
- Example scripts for manual testing
- Verify streaming, tools, thinking for each provider

# AI Package Implementation Plan

**Status:** Done
**Agent PID:** 54145

## Original Todo

ai: create an implementation plan based on packages/ai/plan.md and implement it

## Description

Implement the unified AI API as designed in packages/ai/plan.md. Create a single interface that works with OpenAI, Anthropic, and Gemini SDKs, handling their differences internally while exposing unified streaming events, tool calling, thinking/reasoning, and caching capabilities.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*
*Read [plan.md](packages/ai/docs/plan.md) in full for the complete API design and implementation details*
*Read API documentation: [anthropic-api.md](packages/ai/docs/anthropic-api.md), [openai-api.md](packages/ai/docs/openai-api.md), [gemini-api.md](packages/ai/docs/gemini-api.md)*

## Implementation Plan

- [x] Define unified types in src/types.ts based on plan.md interfaces (AIConfig, Message, Request, Event, TokenUsage, ModelInfo)
- [x] Implement OpenAI provider in src/providers/openai.ts with both Chat Completions and Responses API support
- [x] Implement Anthropic provider in src/providers/anthropic.ts with MessageStream and content blocks handling
- [ ] Implement Gemini provider in src/providers/gemini.ts with parts system and thinking extraction
- [ ] Create main AI class in src/index.ts that selects and uses the appropriate adapter
- [ ] Implement models database in src/models.ts with model information and cost data
- [ ] Add cost calculation integrated into each adapter's token tracking
- [ ] Create comprehensive test suite in test/ai.test.ts using Node.js test framework
- [ ] Test: Model database lookup and capabilities detection
- [ ] Test: Basic completion (non-streaming) for all providers (OpenAI, Anthropic, Gemini, OpenRouter, Groq)
- [ ] Test: Streaming responses with event normalization across all providers
- [ ] Test: Thinking/reasoning extraction (o1 via Responses API, Claude thinking, Gemini thinking)
- [ ] Test: Tool calling flow with execution and continuation across providers
- [ ] Test: Automatic caching (Anthropic explicit, OpenAI/Gemini automatic)
- [ ] Test: Message serialization/deserialization with full conversation history
- [ ] Test: Cross-provider conversation continuation (start with one provider, continue with another)
- [ ] Test: Abort/cancellation via AbortController
- [ ] Test: Error handling and retry logic for each provider
- [ ] Test: Cost tracking accuracy with known token counts
- [ ] Update root tsconfig.json paths to include @mariozechner/pi-ai
- [ ] Update root package.json build script to include AI package

## Notes

- Package structure already exists at packages/ai with dependencies installed
- Each adapter handles its own event normalization internally
- Tests use Node.js built-in test framework as per project conventions
- Available API keys: OPENAI_API_KEY, ANTHROPIC_API_KEY, GEMINI_API_KEY, GROQ_API_KEY, OPENROUTER_API_KEY
- **IMPORTANT**: Always run `npm run check` in the root directory before asking for approval to ensure code compiles and passes linting

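The abort/cancellation test item above can be driven by a small helper that stops consuming a stream once its signal fires. A sketch — the real adapters would additionally pass the signal down to the provider SDK, which this helper does not do:

```typescript
// Wraps an async iterable so consumption stops with an error once the
// AbortSignal fires. Providers' own abort plumbing would sit underneath this.
async function* abortable<T>(stream: AsyncIterable<T>, signal: AbortSignal): AsyncGenerator<T> {
  for await (const item of stream) {
    if (signal.aborted) throw new Error("aborted");
    yield item;
  }
}

// Usage: abort mid-stream and observe that iteration stops.
async function* numbers() { yield 1; yield 2; yield 3; }
const controller = new AbortController();
const seen: number[] = [];
let wasAborted = false;
try {
  for await (const n of abortable(numbers(), controller.signal)) {
    seen.push(n);
    if (n === 2) controller.abort(); // cancel mid-stream
  }
} catch {
  wasAborted = true;
}
```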
# Project: Pi Monorepo

A comprehensive toolkit for managing Large Language Model (LLM) deployments and building AI agents, specifically designed for deploying and managing LLMs on remote GPU pods with automatic vLLM configuration for agentic workloads.

## Features

- Unified LLM API with automatic model discovery and provider configuration
- Terminal UI framework with differential rendering and interactive components
- AI agent framework with tool calling, session persistence, and multiple renderers
- GPU pod management CLI for automated vLLM deployment on various providers
- Support for OpenAI, Anthropic, Google, Groq, Cerebras, xAI, OpenRouter, and compatible APIs
- Built-in file system tools for agentic AI capabilities
- Automatic cost tracking and token usage across all providers

## Tech Stack

- TypeScript/JavaScript with ES Modules
- Node.js ≥20.0.0
- OpenAI SDK, Anthropic SDK, Google Gemini SDK for LLM integration
- Custom TUI library with differential rendering
- Biome for linting and formatting
- npm workspaces for monorepo structure
- Automatic model discovery from OpenRouter and models.dev APIs

## Structure

- `packages/tui/` - Terminal UI library (@mariozechner/pi-tui)
- `packages/ai/` - Unified LLM API (@mariozechner/pi-ai)
- `packages/agent/` - AI agent with tool calling (@mariozechner/pi-agent)
- `packages/pods/` - CLI for GPU pod management (@mariozechner/pi)
- `scripts/` - Utility scripts for version sync
- `todos/` - Task tracking

## Architecture

- Unified LLM interface abstracting provider differences
- Event-driven agent system with publish-subscribe pattern
- Component-based TUI with differential rendering
- SSH-based remote pod management
- Tool calling system for file operations (read, bash, glob, ripgrep)
- Session persistence in JSONL format
- Multiple renderer strategies (Console, TUI, JSON)
- Automatic model capability detection (reasoning, vision, tool calling)

## Commands

- Build: `npm run build`
- Clean: `npm run clean`
- Lint/Check: `npm run check`
- Dev/Run: `npx tsx packages/agent/src/cli.ts` (pi-agent), `npx tsx packages/pods/src/cli.ts` (pi)
- Version: `npm run version:patch/minor/major`
- Publish: `npm run publish`
- Publish (dry run): `npm run publish:dry`

## Testing

The monorepo includes comprehensive tests using Node.js built-in test framework and Vitest:

- TUI package: Unit tests in `packages/tui/test/*.test.ts` (Node.js test framework)
- AI package: Provider tests in `packages/ai/test/*.test.ts` (Vitest)
- Test runner (TUI): `node --test --import tsx test/*.test.ts`
- Test runner (AI): `npm run test` (uses Vitest)
- Virtual terminal for TUI testing via `@xterm/headless`
- Example applications for manual testing

## How to Create a New Package

Follow these steps to add a new package to the monorepo:

1. **Create package directory structure:**
   ```bash
   mkdir -p packages/your-package/src
   ```

2. **Create package.json:**
   ```json
   {
     "name": "@mariozechner/your-package",
     "version": "0.5.12",
     "description": "Package description",
     "type": "module",
     "main": "./dist/index.js",
     "types": "./dist/index.d.ts",
     "files": ["dist", "README.md"],
     "scripts": {
       "clean": "rm -rf dist",
       "build": "tsc -p tsconfig.build.json",
       "check": "biome check --write .",
       "prepublishOnly": "npm run clean && npm run build"
     },
     "dependencies": {},
     "devDependencies": {},
     "keywords": ["relevant", "keywords"],
     "author": "Mario Zechner",
     "license": "MIT",
     "repository": {
       "type": "git",
       "url": "git+https://github.com/badlogic/pi-mono.git",
       "directory": "packages/your-package"
     },
     "engines": {
       "node": ">=20.0.0"
     }
   }
   ```

3. **Create tsconfig.build.json:**
   ```json
   {
     "extends": "../../tsconfig.base.json",
     "compilerOptions": {
       "outDir": "./dist",
       "rootDir": "./src"
     },
     "include": ["src/**/*"],
     "exclude": ["node_modules", "dist"]
   }
   ```

4. **Create src/index.ts:**
   ```typescript
   // Main exports for your package
   export const version = "0.5.12";
   ```

5. **Update root tsconfig.json paths:**
   Add your package to the `paths` mapping in the correct dependency order:
   ```json
   "paths": {
     "@mariozechner/pi-tui": ["./packages/tui/src/index.ts"],
     "@mariozechner/pi-ai": ["./packages/ai/src/index.ts"],
     "@mariozechner/your-package": ["./packages/your-package/src/index.ts"],
     "@mariozechner/pi-agent": ["./packages/agent/src/index.ts"],
     "@mariozechner/pi": ["./packages/pods/src/index.ts"]
   }
   ```

6. **Update root package.json build script:**
   Insert your package in the correct dependency order:
   ```json
   "build": "npm run build -w @mariozechner/pi-tui && npm run build -w @mariozechner/pi-ai && npm run build -w @mariozechner/your-package && npm run build -w @mariozechner/pi-agent && npm run build -w @mariozechner/pi"
   ```

7. **Install and verify:**
   ```bash
   npm install
   npm run build
   npm run check
   ```

**Important Notes:**

- All packages use lockstep versioning (same version number)
- Follow dependency order: foundational packages build first
- Use ESM modules (`"type": "module"`)
- No `any` types unless absolutely necessary
- Include README.md with package documentation

---

- ai: test abort signal

- ai: implement and test session hand-off
  - thinkingSignatures are incompatible between models/providers
  - when converting Message instances, the LLM impl needs to check the model
    - if same provider/model as the LLM impl config, convert as is
    - if provider and/or model != LLM impl config, convert thinking to a plain user text Message with "Thinking: " prepended

- tui: use stripVTControlCharacters in components to strip ANSI sequences and better estimate line widths? specifically markdown and text component?

- tui: if the text editor gets bigger than the viewport, we get artifacts in the scrollback buffer

- tui: need to benchmark our renderer. It always compares old lines vs new lines and does a diff, which might be a bit much for 100k+ lines.

- pods: `pi start` outputs all models that can be run on the pod. However, it doesn't check the vLLM version, e.g. gpt-oss can only run via vllm+gpt-oss, and glm4.5 can only run on vLLM nightly.

- agent: we need to make system prompt and tools pluggable. We need to figure out the simplest way for users to define system prompts and toolkits. A toolkit could be a subset of the built-in tools, a mixture of built-in tools plus custom self-made tools, maybe include MCP servers, and so on. We need to figure out a way to make this super easy. Users should be able to write their tools in whatever language they fancy, which means something like process spawning plus a stdio communication transport would probably make the most sense. But then we're back at MCP basically, and that does not support interruptibility, which we need for the agent: if the agent invokes a tool and the user presses escape in the interface, the tool invocation must be interrupted and whatever it's doing must stop, including killing all sub-processes. For stdio MCP servers this could be solved by spawning a process for the MCP server on startup (or whenever we load the tools) and reusing that process for subsequent tool invocations. If the user interrupts, we could just kill that process, assuming anything it's doing or any of its sub-processes will be killed along the way. So tools could all be written as MCP servers, but that's a lot of overhead. It would also be nice to provide tools as just a bash script that gets some inputs and returns some outputs, same for Go apps or TypeScript apps invoked via npx tsx. Just make the barrier of entry for writing your own tools super fucking low, not necessarily going full MCP — but we also need to support MCP. Whatever we arrive at, we then need to take our built-in tools and see if they can be refactored to work with the new tool system.

- agent: we need to make it possible for tools to specify how their results should be rendered. Since we can have any kind of renderer, we need to come up with a general system that says "this field in the output needs to be a markdown component" or "this field in the output needs to be a diff", etc. We also need to think about how to display the inputs to tools.

- agent: the agent or user should be able to reload a tool, for tools that the agent keeps alive, like MCP servers.

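The session hand-off item above is mostly a message transform: when the target provider/model differs, thinking becomes a plain user text message with "Thinking: " prepended. A sketch — the message shape is an assumption:

```typescript
// Converts assistant messages with thinking into a form safe for a different
// provider/model, since thinkingSignatures don't transfer across providers.
// The Msg shape is an illustrative assumption.
interface Msg { role: "user" | "assistant"; content: string; thinking?: string; }

function handOff(messages: Msg[], sameProviderAndModel: boolean): Msg[] {
  if (sameProviderAndModel) return messages; // convert as is
  const out: Msg[] = [];
  for (const m of messages) {
    if (m.role === "assistant" && m.thinking) {
      // Surface the thinking as plain user text, then the answer itself.
      out.push({ role: "user", content: `Thinking: ${m.thinking}` });
      out.push({ role: "assistant", content: m.content });
    } else {
      out.push(m);
    }
  }
  return out;
}
```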