mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-21 16:01:05 +00:00
feat(agent): Add /tokens command for cumulative token usage tracking
Added /tokens slash command to the TUI that displays session-wide token statistics.

Key changes:

- Fixed SessionManager to accumulate token usage instead of storing only the last event
- Added cumulative token tracking to the TUI renderer alongside per-request totals
- Implemented slash command infrastructure with /tokens autocomplete support
- Fixed file autocompletion that was missing from Tab key handling
- Clean, minimal display format showing input/output/reasoning/cache/tool counts

The /tokens command shows:

    Total usage
      input: 1,234
      output: 567
      reasoning: 89
      cache read: 100
      cache write: 50
      tool calls: 2
Parent: 7e3b94ade6
Commit: e21a46e68f
10 changed files with 303 additions and 283 deletions
# TUI Garbled Output Analysis

## Problem Description

When reading multiple README.md files and then sending a new message, the TUI displays garbled output. This happens for both the renderDifferential and renderDifferentialSurgical methods and affects any model (not just gpt-5).

## Rendering System Overview

### Three Rendering Strategies

1. **SURGICAL Updates** - Updates only changed lines (1-2 lines typical)
2. **PARTIAL Re-render** - Clears from first change to end, re-renders tail
3. **FULL Re-render** - Clears scrollback and screen, renders everything
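
One way such a three-way choice might be made, as a rough sketch (`pickStrategy` and its heuristics are illustrative, not the actual TUI code):

```typescript
// Illustrative strategy selection; the real TUI's heuristics may differ.
type RenderStrategy = "SURGICAL" | "PARTIAL" | "FULL";

function pickStrategy(
  oldLines: string[],
  newLines: string[],
  viewportRows: number,
): RenderStrategy {
  // A structural change (line count differs) invalidates everything below
  // the first change; if content exceeds the viewport, scrollback is
  // affected too and only a full re-render is safe.
  if (oldLines.length !== newLines.length) {
    return newLines.length > viewportRows ? "FULL" : "PARTIAL";
  }
  const changed = newLines.filter((line, i) => line !== oldLines[i]).length;
  return changed <= 2 ? "SURGICAL" : "PARTIAL";
}
```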

### Key Components

- **TUI Class** (`packages/tui/src/tui.ts`): Main rendering engine
- **Container Class**: Manages child components, auto-triggers re-renders
- **TuiRenderer** (`packages/agent/src/renderers/tui-renderer.ts`): Agent's TUI integration
- **Event System**: Event-driven updates through AgentEvent

## Root Causes Identified

### 1. Complex ANSI Code Handling

- MarkdownComponent line wrapping has issues with ANSI escape sequences
- Code comment at line 203: "Need to wrap - this is complex with ANSI codes"
- ANSI codes can be split across render operations, causing corruption

### 2. Race Conditions in Rapid Updates

When processing multiple tool calls:

- Multiple containers change simultaneously
- Content is added both above and within the viewport
- The surgical renderer handles structural changes while maintaining cursor position
- Heavy ANSI content (colored tool output, markdown) increases complexity

### 3. Cursor Position Miscalculation

- Rapid updates can cause cursor positioning logic errors
- Content shifts from previous renders are not properly accounted for
- Viewport vs scrollback buffer calculations can become incorrect

### 4. Container Change Detection Timing

- A recent fix (192d8d2) addressed container clear detection
- But rapid component addition/removal may still leave artifacts
- Multiple render requests are debounced but may miss intermediate states

## Specific Scenario Analysis

### Sequence When Issue Occurs

1. User sends "read all README.md files"
2. Multiple tool calls execute rapidly:
   - glob() finds files
   - Multiple read() calls for each README
3. Long file contents are displayed with markdown formatting
4. User sends a new message while output is still rendering
5. New components are added while the previous render is incomplete

### Visual Artifacts Observed

- Text overlapping from different messages
- Partial ANSI codes causing color bleeding
- Editor borders duplicated or misaligned
- Content from previous renders persisting
- Line wrapping breaking mid-word with styling

## Related Fixes

- Commit 1d9b772: Fixed ESC interrupt handling race conditions
- Commit 192d8d2: Fixed container change detection for clear operations
- Commit 2ec8a27: Added instructional header to chat demo

## Test Coverage Gaps

- No tests for rapid multi-tool execution scenarios
- Missing tests for ANSI code handling across line wraps
- No stress tests for viewport overflow with rapid updates
- A layout shift artifacts test exists, but its scope is limited

## Recommended Solutions

### 1. Improve ANSI Handling

- Fix MarkdownComponent line wrapping to preserve ANSI codes
- Ensure escape sequences never split across operations
- Add ANSI-aware string measurement utilities
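
Such a utility can be as small as stripping SGR sequences before measuring. A sketch (`visibleWidth` is a hypothetical name, and real terminal content may contain escape forms beyond SGR color codes):

```typescript
// Matches SGR (color/style) escape sequences like "\x1b[31m" or "\x1b[0m".
const SGR_RE = /\x1b\[[0-9;]*m/g;

// Width of the string as it appears on screen, ignoring styling bytes.
// Wrapping decisions based on this width avoid splitting an escape
// sequence across a wrap boundary.
function visibleWidth(s: string): number {
  return s.replace(SGR_RE, "").length;
}
```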

### 2. Add Render Queuing

- Implement render operation queue to prevent overlaps
- Ensure each render completes before next begins
- Add render state tracking
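
A minimal sketch of such serialization (`RenderQueue` is an assumed name, not part of the codebase):

```typescript
// Chains render operations so each completes before the next begins.
class RenderQueue {
  private chain: Promise<void> = Promise.resolve();

  enqueue(render: () => Promise<void>): Promise<void> {
    // A failed render is logged but must not block later renders.
    this.chain = this.chain.then(render).catch((e) => {
      console.error("render failed:", e);
    });
    return this.chain;
  }
}
```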

### 3. Enhanced Change Detection

- Track render generation/version numbers
- Validate cursor position before surgical updates
- Add checksums for rendered content verification

### 4. Comprehensive Testing

- Create a test simulating the exact failure scenario
- Add stress tests with rapid multi-component updates
- Test ANSI-heavy content with line wrapping
- Verify viewport calculations under load

todos/done/20250811-122302-tui-garbled-output-fix.md (new file, 35 lines)

# Fix TUI Garbled Output When Sending Multiple Messages

**Status:** InProgress
**Agent PID:** 54802

## Original Todo

agent/tui: "read all README.md files except in node_modules". wait for completion, then send a new message. Getting garbled output. this happens for both the renderDifferential and renderDifferentialSurgical methods. We need to emulate this in a test and get to the bottom of it.

## Description

Fix the TUI rendering corruption that occurs when sending multiple messages in rapid succession, particularly after tool calls that produce large outputs. The issue manifests as garbled/overlapping text when new messages are sent while previous output is still being displayed.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Create test to reproduce the issue: Simulate rapid tool calls with large outputs followed by new message
- [x] Fix ANSI code handling in MarkdownComponent line wrapping (packages/tui/src/components/markdown-component.ts:203-276)
- [x] Implement new line-based rendering strategy that properly handles scrollback and viewport boundaries
- [x] Add comprehensive test coverage for multi-message scenarios
- [ ] User test: Run agent, execute "read all README.md files", wait for completion, send new message, verify no garbled output

## Notes
- Successfully reproduced the issue with a test showing garbled text overlay
- Fixed ANSI code handling in MarkdownComponent line wrapping
- Root cause: the PARTIAL rendering strategy incorrectly calculated the cursor position when content exceeded the viewport
  - When content is in scrollback, the cursor can't reach it (it can only move within the viewport)
  - The old PARTIAL strategy tried to move the cursor 33 lines up when only 30 were possible
  - This caused the cursor to land at the wrong position (top of viewport instead of the target line in scrollback)
- Solution: Implemented a new `renderLineBased` method that:
  - Compares old and new lines directly (component-agnostic)
  - Detects whether changes are in scrollback (unreachable) or in the viewport
  - For scrollback changes: does a full clear and re-render
  - For viewport changes: moves the cursor correctly within viewport bounds and updates efficiently
  - Handles surgical line-by-line updates when possible for minimal redraws
- Test now passes - no more garbled output when messages exceed viewport!
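
The scrollback check described in these notes can be sketched as follows (assumed names; line numbers are 0-based from the top of the rendered content):

```typescript
// The cursor can only move within the visible viewport, so any line above
// (totalLines - viewportRows) lives in scrollback and is unreachable for
// in-place updates; those changes need a full clear and re-render.
function canUpdateInPlace(
  firstChangedLine: number,
  totalLines: number,
  viewportRows: number,
): boolean {
  const firstViewportLine = Math.max(0, totalLines - viewportRows);
  return firstChangedLine >= firstViewportLine;
}
```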

todos/done/20250811-150336-token-usage-tracking-analysis.md (new file, 161 lines)

# Token Usage Tracking Analysis - pi-agent Codebase

## 1. Token Usage Event Structure and Flow

### Per-Request vs Cumulative Analysis

After reading `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts` in full, I can confirm that **token usage events are per-request, NOT cumulative**.

**Evidence:**

- Lines 296-308 in `callModelResponsesApi()`: Token usage is reported directly from the API response usage object
- Lines 435-447 in `callModelChatCompletionsApi()`: Token usage is reported directly from the API response usage object
- The token counts represent what was used for that specific LLM request only

### TokenUsageEvent Definition

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/agent.ts:16-24`

```typescript
{
  type: "token_usage";
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
  reasoningTokens: number;
}
```
## 2. Current Token Usage Display Implementation

### TUI Renderer

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/tui-renderer.ts`

**Current Behavior:**

- Lines 60-66: Stores the "last" token values (not cumulative)
- Lines 251-259: Updates token counts on `token_usage` events
- Lines 280-311: Displays current request tokens in `updateTokenDisplay()`
- Format: `↑{input} ↓{output} ⚡{reasoning} ⟲{cache_read} ⟳{cache_write} ⚒ {tool_calls}`

**Comment on line 252:** "Store the latest token counts (not cumulative since prompt includes full context)"

### Console Renderer

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/console-renderer.ts`

**Current Behavior:**

- Lines 11-16: Stores the "last" token values
- Lines 165-172: Updates token counts on `token_usage` events
- Lines 52-82: Displays tokens after each assistant message

## 3. Session Storage

### SessionManager

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/session-manager.ts`

**Current Implementation:**

- Lines 138-146: Has a `totalUsage` field in the `SessionData` interface
- Lines 158-160: **BUG**: Only stores the LAST token_usage event, not cumulative totals
- This should accumulate all token usage across the session
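
The fix amounts to summing each counter rather than overwriting it. A sketch using the field names from the `TokenUsageEvent` definition above (`accumulate` is a hypothetical helper, not the actual SessionManager code):

```typescript
// Field names mirror the TokenUsageEvent shape documented above.
interface TokenTotals {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
  cacheReadTokens: number;
  cacheWriteTokens: number;
  reasoningTokens: number;
}

function accumulate(total: TokenTotals, event: TokenTotals): TokenTotals {
  // Sum every counter instead of overwriting totals with the last event.
  return {
    inputTokens: total.inputTokens + event.inputTokens,
    outputTokens: total.outputTokens + event.outputTokens,
    totalTokens: total.totalTokens + event.totalTokens,
    cacheReadTokens: total.cacheReadTokens + event.cacheReadTokens,
    cacheWriteTokens: total.cacheWriteTokens + event.cacheWriteTokens,
    reasoningTokens: total.reasoningTokens + event.reasoningTokens,
  };
}
```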

## 4. Slash Command Infrastructure

### Existing Slash Command Support

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/tui/src/autocomplete.ts`

**Available Infrastructure:**

- `SlashCommand` interface with `name`, `description`, and optional `getArgumentCompletions`
- `CombinedAutocompleteProvider` handles slash command detection and completion
- The text editor auto-triggers on "/" at the start of a line

### Current Usage in TUI Renderer

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/renderers/tui-renderer.ts:75-80`

```typescript
const autocompleteProvider = new CombinedAutocompleteProvider(
  [], // <-- Empty command array!
  process.cwd(),
);
```

**No slash commands are currently implemented in the agent TUI!**

### Example Implementation

**Reference:** `/Users/badlogic/workspaces/pi-mono/packages/tui/test/chat-app.ts:25-60`

Shows how to:

1. Define slash commands with `CombinedAutocompleteProvider`
2. Handle slash command execution in `editor.onSubmit`
3. Add responses to the chat container

## 5. Implementation Requirements for /tokens Command

### What Needs to Change

1. **Add Cumulative Token Tracking to TUI Renderer**
   - Add cumulative token counters alongside the current "last" counters
   - Update cumulative totals on each `token_usage` event

2. **Add /tokens Slash Command**
   - Add to the `CombinedAutocompleteProvider` in tui-renderer.ts
   - Handle in the `editor.onSubmit` callback
   - Display a formatted token summary as a `TextComponent` in the chat container

3. **Fix SessionManager Bug**
   - Change the `totalUsage` calculation to accumulate all token_usage events
   - This will enable session-wide token tracking

4. **Message Handling in TUI**
   - Capture user input before it goes to the agent
   - Check whether it is a slash command or a regular message
   - Route accordingly

### Current User Input Flow

**Location:** `/Users/badlogic/workspaces/pi-mono/packages/agent/src/main.ts:190-198`

```typescript
while (true) {
  const userInput = await renderer.getUserInput();
  try {
    await agent.ask(userInput); // All input goes to the agent
  } catch (e: any) {
    await renderer.on({ type: "error", message: e.message });
  }
}
```

**Problem:** All user input goes directly to the agent - there is no interception for slash commands!

### Required Architecture Change

Modify the TUI interactive loop to:

1. Check if user input starts with "/"
2. If slash command: handle locally in renderer
3. If regular message: pass to agent as before
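
A sketch of that routing (hypothetical names; the real renderer would wire this into `editor.onSubmit` or the interactive loop):

```typescript
// Returns the local command output, or undefined if the input was
// forwarded to the agent as a regular message.
type SlashHandler = (args: string[]) => string;

function routeInput(
  input: string,
  handlers: Map<string, SlashHandler>,
  sendToAgent: (message: string) => void,
): string | undefined {
  if (input.startsWith("/")) {
    const [name, ...args] = input.slice(1).split(/\s+/);
    const handler = handlers.get(name);
    return handler ? handler(args) : `Unknown command: /${name}`;
  }
  sendToAgent(input); // regular message: pass to agent as before
  return undefined;
}
```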

## 6. Token Display Format Recommendations

Based on existing format patterns, the `/tokens` command should display:

```
Session Token Usage:
  ↑ 1,234 input tokens
  ↓ 5,678 output tokens
  ⚡ 2,345 reasoning tokens
  ⟲ 890 cache read tokens
  ⟳ 123 cache write tokens
  📊 12,270 total tokens
  ⚒ 5 tool calls
```
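
The thousands separators shown above can come from `toLocaleString` (a sketch; `formatCount` is an assumed helper name):

```typescript
// Formats one counter line, e.g. formatCount("↑", 1234, "input tokens").
function formatCount(symbol: string, n: number, label: string): string {
  return `${symbol} ${n.toLocaleString("en-US")} ${label}`;
}
```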

## Summary

The current implementation tracks per-request token usage only. To add cumulative token tracking with a `/tokens` command, we need to:

1. **Fix SessionManager** to properly accumulate token usage
2. **Add cumulative tracking** to the TUI renderer
3. **Implement slash command infrastructure** in the agent (currently missing)
4. **Modify user input handling** to intercept slash commands before they reach the agent
5. **Add a /tokens command** that displays formatted cumulative statistics

The TUI framework already supports slash commands, but the agent's TUI renderer doesn't use them yet.

todos/done/20250811-150336-token-usage-tracking.md (new file, 28 lines)

# Add Token Usage Tracking Command

**Status:** Done
**Agent PID:** 71159

## Original Todo

- agent: we get token_usage events. the last we get tells us how many input/output/cache read/cache write/reasoning tokens were used for the last request to the LLM endpoint. We want to:
  - have a /tokens command that outputs the accumulative counts, can just add it to the chat messages container as a nicely formatted TextComponent
  - means the tui-renderer needs to keep track of accumulative stats as well, not just last request stats.
  - please check agent.ts (read in full) to see if token_usage is actually some form of accumulative thing, or a per request to llm thing. want to understand what we get.

## Description

Add a `/tokens` slash command to the TUI that displays cumulative token usage statistics for the current session. This includes fixing the SessionManager to properly accumulate token usage and implementing slash command infrastructure in the agent's TUI renderer.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*

## Implementation Plan

- [x] Fix SessionManager to accumulate token usage instead of storing only the last event (packages/agent/src/session-manager.ts:158-160)
- [x] Add cumulative token tracking properties to TUI renderer (packages/agent/src/renderers/tui-renderer.ts:60-66)
- [x] Add /tokens slash command to CombinedAutocompleteProvider (packages/agent/src/renderers/tui-renderer.ts:75-80)
- [x] Modify TUI renderer's onSubmit to handle slash commands locally (packages/agent/src/renderers/tui-renderer.ts:159-177)
- [x] Implement /tokens command handler that displays formatted cumulative statistics
- [x] Update token_usage event handler to accumulate totals (packages/agent/src/renderers/tui-renderer.ts:275-291)
- [x] Test: Verify /tokens command displays correct cumulative totals
- [x] Test: Send multiple messages and confirm accumulation works correctly
- [x] Fix file autocompletion that was broken by slash command implementation

## Notes

[Implementation notes]