Commit graph

57 commits

Author SHA1 Message Date
Mario Zechner
550da5e47c feat(ai): Add cost tracking to LLM implementations
- Track input/output token costs for all providers
- Calculate costs based on Model pricing information
- Include cost information in AssistantMessage responses
- Add Usage interface with detailed cost breakdown
- Implement calculateCost utility function for cost calculations
2025-08-30 00:45:08 +02:00
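As a rough illustration of the cost tracking described in 550da5e47c, the sketch below derives a cost breakdown from token counts and per-model pricing. The names `PricingSketch`, `UsageSketch`, and `calculateCostSketch` are hypothetical, and the assumption that prices are quoted in USD per million tokens is not confirmed by the commit message.

```typescript
// Hypothetical shapes; the repository's Model and Usage interfaces may differ.
interface PricingSketch {
  input: number; // assumed unit: USD per 1M input tokens
  output: number; // assumed unit: USD per 1M output tokens
}

interface UsageSketch {
  input: number; // input token count
  output: number; // output token count
  cost: { input: number; output: number; total: number };
}

// Convert token counts to costs using the assumed per-million pricing.
function calculateCostSketch(
  pricing: PricingSketch,
  inputTokens: number,
  outputTokens: number,
): UsageSketch {
  const inputCost = (inputTokens / 1e6) * pricing.input;
  const outputCost = (outputTokens / 1e6) * pricing.output;
  return {
    input: inputTokens,
    output: outputTokens,
    cost: { input: inputCost, output: outputCost, total: inputCost + outputCost },
  };
}
```

Returning the breakdown next to the raw counts lets a caller attach it to a response object without recomputing anything.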
Mario Zechner
f9d688d577 refactor(ai): Update LLM implementations to use Model objects
- LLM constructors now take Model objects instead of string IDs
- Added provider field to AssistantMessage interface
- Updated getModel function with type-safe model ID autocomplete
- Fixed Anthropic model ID mapping for proper API aliases
- Added baseUrl to Model interface for provider-specific endpoints
- Updated all tests to use getModel for model instantiation
- Removed deprecated models.json in favor of generated models
2025-08-30 00:21:03 +02:00
Mario Zechner
d61d09b88d fix(ai): Deduplicate models and add Anthropic aliases
- Add proper Anthropic model aliases (claude-opus-4-1, claude-sonnet-4-0, etc.)
- Deduplicate models when same ID appears in both models.dev and OpenRouter
- models.dev takes priority over OpenRouter for duplicate IDs
- Fix test to use correct claude-3-5-haiku-latest alias
- Reduces Anthropic models from 11 to 10 (removed duplicate)
2025-08-29 23:34:01 +02:00
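The deduplication rule in d61d09b88d (models.dev wins over OpenRouter when the same ID appears in both) could look roughly like the following. `ModelEntrySketch` and `dedupeModelsSketch` are hypothetical names, and the real generator script may be structured quite differently.

```typescript
// Hypothetical entry shape used only for this illustration.
interface ModelEntrySketch {
  id: string;
  source: "models.dev" | "openrouter";
}

// Keep the first entry seen for each ID. Passing models.dev entries first
// gives them priority over OpenRouter duplicates.
function dedupeModelsSketch(
  fromModelsDev: ModelEntrySketch[],
  fromOpenRouter: ModelEntrySketch[],
): ModelEntrySketch[] {
  const byId = new Map<string, ModelEntrySketch>();
  for (const model of fromModelsDev.concat(fromOpenRouter)) {
    if (!byId.has(model.id)) {
      byId.set(model.id, model);
    }
  }
  return Array.from(byId.values());
}
```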
Mario Zechner
9c3f32b91e feat(ai): Add models.generated.ts with 181 tool-capable models
- Generated from OpenRouter API and models.dev
- Includes models from Google, OpenAI, Anthropic, xAI, Groq, Cerebras, OpenRouter
- Provides type-safe model selection with autocomplete
2025-08-29 23:20:53 +02:00
Mario Zechner
c7618db3f7 refactor(ai): Implement unified model system with type-safe createLLM
- Add Model interface to types.ts with normalized structure
- Create type-safe generic createLLM function with provider-specific model constraints
- Generate models from OpenRouter API and models.dev data
- Strip provider prefixes for direct providers (google, openai, anthropic, xai)
- Keep full model IDs for OpenRouter-proxied models
- Clean separation: types.ts (Model interface), models.ts (factory logic), models.generated.ts (data)
- Remove old model scripts and unused dependencies
- Rename GeminiLLM to GoogleLLM for consistency
- Add tests for new providers (xAI, Groq, Cerebras, OpenRouter)
- Support 181 tool-capable models across 7 providers with full type safety
2025-08-29 23:19:47 +02:00
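The ID normalization mentioned in c7618db3f7 (strip provider prefixes for direct providers, keep full IDs for OpenRouter) might be sketched as below. It assumes raw IDs have the form `provider/model`, which the commit message does not state. `normalizeModelIdSketch` is a hypothetical helper.

```typescript
// Direct providers named in the commit message.
const DIRECT_PROVIDERS_SKETCH = ["google", "openai", "anthropic", "xai"];

// Strip a leading "provider/" only for direct providers; everything else
// (for example OpenRouter-proxied models) keeps its full ID.
function normalizeModelIdSketch(provider: string, rawId: string): string {
  const prefix = provider + "/";
  const isDirect = DIRECT_PROVIDERS_SKETCH.indexOf(provider) !== -1;
  if (isDirect && rawId.indexOf(prefix) === 0) {
    return rawId.slice(prefix.length);
  }
  return rawId;
}
```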
Mario Zechner
3f36051bc6 feat(ai): Migrate tests to Vitest and add provider test coverage
- Switch from Node.js test runner to Vitest for better DX
- Add test suites for Grok, Groq, Cerebras, and OpenRouter providers
- Add Ollama test suite with automatic server lifecycle management
- Include thinking mode and multi-turn tests for all providers
- Remove example files (consolidated into test suite)
- Add VS Code test configuration
2025-08-29 21:32:45 +02:00
Mario Zechner
da66a97ea7 feat(ai): Add auto-generated TypeScript models with factory function
- Generate models.generated.ts from models.json with proper types
- Categorize providers: OpenAI (Responses), OpenAI-compatible, Anthropic, Gemini
- Create createLLM() factory with TypeScript overloads for type safety
- Auto-detect base URLs and environment variables for providers
- Support 353 models across 39 providers with full autocompletion
- Exclude generated file from git (rebuilt on npm build)
2025-08-25 21:31:29 +02:00
Mario Zechner
9b8ea585bd fix(ai): Improve ModelInfo types based on actual data structure
- Remove catch-all [key: string]: any from ModelInfo
- Make all required fields non-optional (attachment, reasoning, etc.)
- Add proper union types for modalities (text, image, audio, video, pdf)
- Mark only cost and knowledge fields as optional
- Export ModalityInput and ModalityOutput types
2025-08-25 20:18:34 +02:00
Mario Zechner
02a9b4f09f feat(ai): Add models.dev data integration
- Add models script to download latest model information
- Create models.ts module to query model capabilities
- Include models.json in package distribution
- Export utilities to check model features (reasoning, tools)
- Update build process to copy models.json to dist
2025-08-25 20:10:54 +02:00
Mario Zechner
4bb3a5ad02 feat(ai): Add OpenAI-compatible provider examples for multiple services
- Add examples for Cerebras, Groq, Ollama, and OpenRouter
- Update OpenAI Completions provider to handle base URL properly
- Simplify README formatting
- All examples use the same OpenAICompletionsLLM provider with different base URLs
2025-08-25 17:41:47 +02:00
Mario Zechner
6112029076 docs(ai): Update README with working quick start examples
- Replace planned features with actual working code examples
- Add clear provider comparison table
- Show real imports and usage patterns
- Include streaming, thinking, and tool calling examples
- Update supported models to match current implementation
2025-08-25 15:58:57 +02:00
Mario Zechner
7a6852081d test(ai): Add comprehensive E2E tests for all AI providers
- Add multi-turn test to verify thinking and tool calling work together
- Test thinkingSignature handling for proper multi-turn context
- Fix Gemini provider to generate base64 thinkingSignature when needed
- Handle multiple rounds of tool calls in tests (Gemini behavior)
- Make thinking tests more robust for model-dependent behavior
- All 18 tests passing across 4 providers
2025-08-25 15:54:26 +02:00
Mario Zechner
289e60ab88 fix(ai): Correct Gemini thinking config structure
- Fixed thinkingConfig to be at top level, not nested under 'config'
- Matches Gemini API documentation structure
2025-08-25 10:33:58 +02:00
Mario Zechner
3e1422d3d7 feat(ai): Add proper thinking support for Gemini 2.5 models
- Added thinkingConfig with includeThoughts and thinkingBudget support
- Use part.thought boolean flag to detect thinking content per API docs
- Capture and preserve thought signatures for multi-turn function calling
- Added supportsThinking() check for Gemini 2.5 series models
- Updated example to demonstrate thinking configuration
- Handle SDK type limitations with proper type assertions
2025-08-25 10:26:23 +02:00
Mario Zechner
a8ba19f0b4 feat(ai): Implement Gemini provider with streaming and tool support
- Added GeminiLLM provider implementation with GoogleGenerativeAI SDK
- Supports streaming with text/thinking content and completion signals
- Handles Gemini's parts-based content system (text, thought, functionCall)
- Implements tool/function calling with proper format conversion
- Maps between unified types and Gemini-specific formats (model vs assistant role)
- Added test example matching other provider patterns
- Fixed typo in AssistantMessage type (stopResaon -> stopReason) across all providers
2025-08-24 20:41:32 +02:00
Mario Zechner
cb4c32faaa refactor(ai): Add completion signal to onText/onThinking callbacks
- Update LLMOptions interface to include completion boolean parameter
- Modify all providers to signal when text/thinking blocks are complete
- Update examples to handle the completion parameter
- Move documentation files to docs/ directory
2025-08-24 20:33:26 +02:00
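A callback with a completion flag, as introduced in cb4c32faaa, could be consumed along these lines. The signature `(text, complete)` and the convention of a final empty-text call are assumptions made for illustration; the actual `LLMOptions` interface may differ.

```typescript
// Assumed callback shape: a text chunk plus a flag marking the end of a block.
type TextCallbackSketch = (text: string, complete: boolean) => void;

// Stand-in for a provider: streams chunks, then signals completion once.
function emitBlockSketch(chunks: string[], onText: TextCallbackSketch): void {
  for (const chunk of chunks) {
    onText(chunk, false);
  }
  // Assumed convention: the final call carries no text, only the signal.
  onText("", true);
}

// Consumer: buffer chunks and hand back the full block when it completes.
function collectBlocksSketch(chunks: string[]): string[] {
  const blocks: string[] = [];
  let buffer = "";
  emitBlockSketch(chunks, (text, complete) => {
    buffer += text;
    if (complete) {
      blocks.push(buffer);
      buffer = "";
    }
  });
  return blocks;
}
```

The flag spares the consumer from guessing where one text or thinking block ends and the next begins.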
Mario Zechner
a42c54e6fe docs: Update file paths after moving AI docs to packages/ai/docs/
- Update task.md to reference docs in new location
- Update CLAUDE.md with project instructions
- Update analysis.md with implementation progress
2025-08-24 20:21:38 +02:00
Mario Zechner
8364ecde4a feat(ai): Add OpenAI Completions and Responses API providers
- Implement OpenAICompletionsLLM for Chat Completions API with streaming
- Implement OpenAIResponsesLLM for Responses API with reasoning support
- Update types to use LLM/Context instead of AI/Request
- Add support for reasoning tokens, tool calls, and streaming
- Create test examples for both OpenAI providers
- Update Anthropic provider to match new interface
2025-08-24 20:18:10 +02:00
Mario Zechner
e5aedfed29 feat(ai): Implement unified AI API with Anthropic provider
- Define clean API with complete() method and callbacks for streaming
- Add comprehensive type system for messages, tools, and usage
- Implement AnthropicAI provider with full feature support:
  - Thinking/reasoning with signatures
  - Tool calling with parallel execution
  - Streaming via callbacks (onText, onThinking)
  - Proper error handling and stop reasons
  - Cache tracking for input/output tokens
- Add working test/example demonstrating tool execution flow
- Support for system prompts, temperature, max tokens
- Proper message role types: user, assistant, toolResult
2025-08-17 23:30:20 +02:00
Mario Zechner
f064ea0e14 feat(ai): Create unified AI package with OpenAI, Anthropic, and Gemini support
- Set up @mariozechner/ai package structure following monorepo patterns
- Install OpenAI, Anthropic, and Google Gemini SDK dependencies
- Document comprehensive API investigation for all three providers
- Design minimal unified API with streaming-first architecture
- Add models.dev integration for pricing and capabilities
- Implement automatic caching strategy for all providers
- Update project documentation with package creation guide
2025-08-17 20:18:45 +02:00
Mario Zechner
2c03724862 fix: Remove unused imports and add biome-ignore for false positive
- Remove unused SlashCommand import from tui-renderer.ts
- Add biome-ignore comment for previousRenderCommands which is actually used
2025-08-16 19:21:43 +02:00
Mario Zechner
3fe1f4c11e Update todos.md 2025-08-16 19:18:07 +02:00
Mario Zechner
5bbaaa0773 Formatting 2025-08-11 21:22:11 +02:00
Mario Zechner
42bf7b4ae0 Add husky pre-commit hook for formatting and type checking 2025-08-11 21:15:37 +02:00
Mario Zechner
e21a46e68f feat(agent): Add /tokens command for cumulative token usage tracking
Added /tokens slash command to TUI that displays session-wide token statistics.
Key changes:
- Fixed SessionManager to accumulate token usage instead of storing only last event
- Added cumulative token tracking to TUI renderer alongside per-request totals
- Implemented slash command infrastructure with /tokens autocomplete support
- Fixed file autocompletion that was missing from Tab key handling
- Clean minimal display format showing input/output/reasoning/cache/tool counts

The /tokens command shows:
Total usage
   input: 1,234
   output: 567
   reasoning: 89
   cache read: 100
   cache write: 50
   tool calls: 2
2025-08-11 15:43:48 +02:00
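The SessionManager fix in e21a46e68f (accumulate usage instead of keeping only the last event) amounts to a running sum. The sketch below uses a hypothetical `TokenUsageSketch` shape whose fields mirror the counters shown in the /tokens output; the real event format is not given in the commit message.

```typescript
// Hypothetical usage record mirroring the counters listed by /tokens.
interface TokenUsageSketch {
  input: number;
  output: number;
  reasoning: number;
  cacheRead: number;
  cacheWrite: number;
  toolCalls: number;
}

const ZERO_USAGE_SKETCH: TokenUsageSketch = {
  input: 0, output: 0, reasoning: 0, cacheRead: 0, cacheWrite: 0, toolCalls: 0,
};

// Add one event to the running total instead of overwriting it.
function addUsageSketch(total: TokenUsageSketch, event: TokenUsageSketch): TokenUsageSketch {
  return {
    input: total.input + event.input,
    output: total.output + event.output,
    reasoning: total.reasoning + event.reasoning,
    cacheRead: total.cacheRead + event.cacheRead,
    cacheWrite: total.cacheWrite + event.cacheWrite,
    toolCalls: total.toolCalls + event.toolCalls,
  };
}
```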
Mario Zechner
7e3b94ade6 style(tui): Apply biome formatting fixes 2025-08-11 14:18:33 +02:00
Mario Zechner
6e40c5d761 fix(tui): Fix garbled output when content exceeds viewport
- Implemented new renderLineBased method that properly handles scrollback boundaries
- Fixed ANSI code preservation in MarkdownComponent line wrapping
- Added comprehensive test to reproduce and verify the fix
- Root cause: PARTIAL rendering strategy couldn't position cursor in scrollback
- Solution: Component-agnostic line comparison with proper viewport boundary handling
2025-08-11 14:17:46 +02:00
Mario Zechner
1d9b77298c fix(agent): Properly handle ESC interrupt in TUI with centralized event emission
Fixed the interrupt mechanism to show "[Interrupted by user]" message when ESC is pressed:
- Removed duplicate UI cleanup from ESC key handler that interfered with event processing
- Added centralized interrupted event emission in exception handler when abort signal is detected
- Removed duplicate event emissions from API call methods to prevent multiple messages
- Added abort signal support to preflight reasoning check for proper cancellation
- Simplified abort detection to only check signal state, not error messages
2025-08-11 12:21:13 +02:00
Mario Zechner
1f9d10cab0 Fix file paths. 2025-08-11 02:32:52 +02:00
Mario Zechner
192d8d2600 fix(tui): Container change detection for proper differential rendering
Fixed rendering artifact where duplicate bottom borders appeared when components
dynamically shifted positions (e.g., Ctrl+C in agent clearing status container).

Root cause: Container wasn't reporting as "changed" when cleared (0 children),
causing differential renderer to skip re-rendering that area.

Solution: Container now tracks previousChildCount and reports changed when
child count changes, ensuring proper re-rendering when containers are cleared.

- Added comprehensive test reproducing the layout shift artifact
- Fixed Container to track and report child count changes
- All tests pass including new layout shift artifact test
2025-08-11 02:31:49 +02:00
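The fix in 192d8d2600 can be pictured with a minimal container that compares its child count against the previous render. Only `previousChildCount` is named in the commit message; `ContainerSketch` and its method names are hypothetical, and a real container would also check whether individual children changed.

```typescript
// Minimal stand-in for a TUI container; children are plain strings here.
class ContainerSketch {
  private children: string[] = [];
  private previousChildCount = 0;

  addChild(line: string): void {
    this.children.push(line);
  }

  clear(): void {
    this.children = [];
  }

  // Report "changed" whenever the child count differs from the last render,
  // including the case where the container was emptied (0 children).
  render(): { lines: string[]; changed: boolean } {
    const changed = this.children.length !== this.previousChildCount;
    this.previousChildCount = this.children.length;
    return { lines: this.children.slice(), changed };
  }
}
```

Without the count comparison, a cleared container yields no lines and no change flag, so a differential renderer would skip the area and leave stale output on screen.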
Mario Zechner
2ec8a27222 feat(tui): Add instructional header and welcome message to chat demo
- Add header with slash command and file autocomplete instructions
- Add initial welcome message with detailed feature descriptions
- Import TextComponent for the header
- Make it clearer how to use all the demo features
2025-08-11 01:36:18 +02:00
Mario Zechner
5ceaa91c74 fix(tui): Trigger initial render when start() is called
The demos were not showing any output until user input because:
- Components were added before ui.start() was called
- addChild calls requestRender() but it returns early if !isStarted
- So no initial render happened until user input triggered one

Now ui.start() triggers an initial render if components exist.
2025-08-11 01:34:32 +02:00
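The behavior described in 5ceaa91c74 can be reproduced with a small model of the render gate. `UiSketch` and `renderCount` are hypothetical; `addChild`, `requestRender`, `start`, and `isStarted` are the names used in the commit message, but their real implementations are not shown there.

```typescript
// Toy model of the UI lifecycle: renders are counted rather than drawn.
class UiSketch {
  private isStarted = false;
  private children: string[] = [];
  renderCount = 0;

  addChild(child: string): void {
    this.children.push(child);
    this.requestRender();
  }

  requestRender(): void {
    // Before start(), render requests are dropped.
    if (!this.isStarted) return;
    this.renderCount++;
  }

  start(): void {
    this.isStarted = true;
    // The fix: render once on start if components were added earlier.
    if (this.children.length > 0) {
      this.requestRender();
    }
  }
}
```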
Mario Zechner
838fde47ba refactor(tui): Move examples from README to test directory
- Move chat application example to test/chat-app.ts
- Move multi-component layout example to test/multi-layout.ts
- Update README to reference example files instead of inline code
- Add proper exit handlers (Ctrl+C) to all examples
- Simplify README by reducing inline code examples
2025-08-11 01:29:43 +02:00
Mario Zechner
12dfcfad23 docs(tui): Update README with surgical differential rendering documentation
- Add surgical differential rendering as the main feature
- Document the three rendering strategies (surgical, partial, full)
- Add performance metrics documentation
- Simplify component examples to be more concise
- Add comprehensive testing section with VirtualTerminal API
- Include testing best practices and performance testing guidance
- Remove duplicate TextEditor documentation section
2025-08-11 01:22:45 +02:00
Mario Zechner
386f90fc36 tui: Implement surgical differential rendering for minimal redraws
- New renderDifferentialSurgical method with three strategies:
  - SURGICAL: Only update specific changed lines (1-2 lines per render)
  - PARTIAL: Clear and re-render from first change when line counts change
  - FULL: Clear scrollback and re-render all when changes above viewport
- Preserves all content in scrollback buffer correctly
- Reduces redraws from ~14 lines to ~1.3 lines for common updates
- All 24 tests pass including scrollback preservation tests
- Massive performance improvement: 90% reduction in unnecessary redraws
2025-08-11 01:13:42 +02:00
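The three strategies listed in 386f90fc36 suggest a decision procedure like the one sketched here. This is not the repository's `renderDifferentialSurgical` method: `planRenderSketch` is a hypothetical helper, and it assumes the viewport shows the last `viewportHeight` lines of the previously rendered frame.

```typescript
type StrategySketch = "NONE" | "SURGICAL" | "PARTIAL" | "FULL";

interface RenderPlanSketch {
  strategy: StrategySketch;
  changedLines: number[]; // indices of lines that differ between frames
}

function planRenderSketch(
  previous: string[],
  next: string[],
  viewportHeight: number,
): RenderPlanSketch {
  const changedLines: number[] = [];
  let firstChange = -1;
  const maxLength = Math.max(previous.length, next.length);
  for (let i = 0; i < maxLength; i++) {
    if (previous[i] !== next[i]) {
      if (firstChange === -1) firstChange = i;
      changedLines.push(i);
    }
  }
  if (firstChange === -1) return { strategy: "NONE", changedLines };

  // Assumption: lines before viewportTop have scrolled out of reach.
  const viewportTop = Math.max(0, previous.length - viewportHeight);
  if (firstChange < viewportTop) return { strategy: "FULL", changedLines };
  if (previous.length !== next.length) return { strategy: "PARTIAL", changedLines };
  return { strategy: "SURGICAL", changedLines };
}
```

A renderer would then rewrite only `changedLines` for SURGICAL, redraw from the first changed line for PARTIAL, and clear and redraw everything for FULL.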
Mario Zechner
0131b29b2c tui: Fix differential rendering to preserve scrollback buffer
- renderDifferential now correctly handles content that exceeds viewport
- When changes are above viewport, do full re-render with scrollback clear
- When changes are in viewport, do partial re-render from change point
- All tests pass, correctly preserves 100 items in scrollback
- Issue: Still re-renders too much (entire tail from first change)
2025-08-11 00:57:59 +02:00
Mario Zechner
afa807b200 tui-double-buffer: Implement smart differential rendering with terminal abstraction
- Create Terminal interface abstracting stdin/stdout operations for dependency injection
- Implement ProcessTerminal for production use with process.stdin/stdout
- Implement VirtualTerminal using @xterm/headless for accurate terminal emulation in tests
- Fix TypeScript imports for @xterm/headless module
- Move all component files to src/components/ directory for better organization
- Add comprehensive test suite with async/await patterns for proper render timing
- Fix critical TUI differential rendering bug when components grow in height
  - Issue: Old content wasn't properly cleared when component line count increased
  - Solution: Clear each old line individually before redrawing, ensure cursor at line start
- Add test verifying terminal content preservation and text editor growth behavior
- Update tsconfig.json to include test files in type checking
- Add benchmark test comparing single vs double buffer performance

The implementation successfully reduces flicker by only updating changed lines
rather than clearing entire sections. Both TUI implementations maintain the
same interface for backward compatibility.
2025-08-10 22:33:03 +02:00
Mario Zechner
923a9e58ab missing-thinking-tokens: Complete task management for reasoning token support
Moved completed task documentation to done folder after implementing reasoning token
support for OpenAI models (o1, o3, gpt-5) across all renderers and APIs
2025-08-10 14:38:25 +02:00
Mario Zechner
5d13a90077 docs: Add context window percentage to token usage display todo 2025-08-10 10:38:52 +02:00
Mario Zechner
f36299ad3a docs: Add todo for automatic context length detection
Document provider support for context length via models endpoint
and caching strategy for model metadata
2025-08-10 10:33:59 +02:00
Mario Zechner
8bee281010 Updated todos.md 2025-08-10 02:15:30 +02:00
Mario Zechner
f82e82da93 docs: Improve reasoning support table clarity
- Remove redundant 'Reasoning Tokens' column (all models count them)
- Group by provider for better readability
- Clarify model limitations vs API limitations
- Simplify check marks to focus on thinking content availability
2025-08-10 02:13:13 +02:00
Mario Zechner
047d9af407 docs: Fix outdated Gemini thinking content notes
Gemini 2.5 thinking content IS fully supported - we handle extra_body automatically
2025-08-10 01:56:20 +02:00
Mario Zechner
99ce76d66e feat(agent): Comprehensive reasoning token support across providers
Added provider-specific reasoning/thinking token support for:
- OpenAI (o1, o3, gpt-5): Full reasoning events via Responses API, token counts via Chat Completions
- Groq: reasoning_format:"parsed" for Chat Completions, no summary support for Responses
- Gemini 2.5: extra_body.google.thinking_config with <thought> tag extraction
- OpenRouter: Unified reasoning parameter with message.reasoning field
- Anthropic: Limited support via OpenAI compatibility layer

Key improvements:
- Centralized provider detection based on baseURL
- parseReasoningFromMessage() extracts provider-specific reasoning content
- adjustRequestForProvider() handles provider-specific request modifications
- Smart reasoning support detection with caching per API type
- Comprehensive README documentation with provider support matrix

Fixes reasoning tokens not appearing for GPT-5 and other reasoning models.
2025-08-10 01:46:15 +02:00
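Centralized provider detection based on baseURL, as mentioned in 99ce76d66e, might reduce to a single lookup function. `detectProviderSketch` is hypothetical, and the substring checks below are assumptions about which hosts identify each provider; the commit message does not list the rules actually used.

```typescript
type ProviderSketch = "openai" | "groq" | "gemini" | "openrouter" | "anthropic" | "other";

// Map a base URL to a provider by matching well-known API hostnames.
// The matching rules are illustrative, not taken from the repository.
function detectProviderSketch(baseURL: string): ProviderSketch {
  const url = baseURL.toLowerCase();
  if (url.indexOf("api.openai.com") !== -1) return "openai";
  if (url.indexOf("groq.com") !== -1) return "groq";
  if (url.indexOf("generativelanguage.googleapis.com") !== -1) return "gemini";
  if (url.indexOf("openrouter.ai") !== -1) return "openrouter";
  if (url.indexOf("anthropic.com") !== -1) return "anthropic";
  return "other";
}
```

Keeping the mapping in one place means request adjustment and reasoning parsing can both branch on the same provider value.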
Mario Zechner
62d9eefc2a agent: Add reasoning token support for OpenAI reasoning models
- Extract and display reasoning tokens from both Chat Completions and Responses APIs
- Add smart preflight detection to check reasoning support per model/API (cached per agent)
- Support both reasoning_text (o1/o3) and summary_text (gpt-5) formats
- Display reasoning tokens with  symbol in console and TUI renderers
- Only send reasoning parameters to models that support them
- Fix event type from "thinking" to "reasoning" for consistency

Note: Chat Completions API only returns reasoning token counts, not content (by design).
Only Responses API exposes actual thinking/reasoning events.
2025-08-10 00:32:30 +02:00
Mario Zechner
9157411034 Updated agent README.md with test advice 2025-08-09 21:17:46 +02:00
Mario Zechner
832b20b173 v0.5.7: Fix tool counter spacing in metrics display 2025-08-09 20:19:04 +02:00
Mario Zechner
9fee306075 v0.5.6: Fix CLI execution when installed globally 2025-08-09 20:16:59 +02:00
Mario Zechner
db86195dd9 v0.5.5: Add tool call metrics display 2025-08-09 20:11:06 +02:00
Mario Zechner
9544a8edf9 Display tool call metrics: Add ⚒ counter to token usage display
- Show tool call count alongside token metrics in TUI and console renderers
- TUI: Display at bottom with format "↑X ↓Y ⚒Z"
- Console: Show metrics after assistant messages complete
- Counter increments on each tool_call event
2025-08-09 20:10:15 +02:00