Commit graph

46 commits

Author SHA1 Message Date
Mario Zechner
0496651308 Add Anthropic prompt caching, pluggable storage, and CORS proxy support
Storage Architecture:
- New pluggable storage system with backends (LocalStorage, ChromeStorage, IndexedDB)
- SettingsRepository for app settings (proxy config, etc.)
- ProviderKeysRepository for API key management
- AppStorage with global accessors (getAppStorage, setAppStorage, initAppStorage)

Transport Refactoring:
- Renamed DirectTransport → ProviderTransport (calls LLM providers with optional CORS proxy)
- Renamed ProxyTransport → AppTransport (uses app server with user auth)
- Updated TransportMode: "direct" → "provider", "proxy" → "app"

CORS Proxy Integration:
- ProviderTransport checks proxy.enabled/proxy.url from storage
- When enabled, modifies model baseUrl to route through proxy: {proxyUrl}/?url={originalBaseUrl}
- ProviderKeyInput test function also honors proxy settings
- Settings dialog with Proxy tab (Switch toggle, URL input, explanatory description)

Anthropic Prompt Caching:
- System prompt cached with cache_control markers (both OAuth and regular API keys)
- Last user message cached to cache conversation history
- Saves 90% on input tokens for cached content (10x cost reduction)

Settings Dialog Improvements:
- Configurable tab system with SettingsTab base class
- ApiKeysTab and ProxyTab as custom elements
- Switch toggle for proxy enable (instead of Checkbox)
- Explanatory paragraphs for each tab
- ApiKeyPromptDialog reuses ProviderKeyInput component

Removed:
- Deprecated ApiKeysDialog (replaced by ProviderKeyInput in SettingsDialog)
- Old storage-adapter and key-store (replaced by new storage architecture)
2025-10-05 23:00:36 +02:00
Mario Zechner
99983af597 Fix lints. 2025-10-03 23:21:59 +02:00
Mario Zechner
51f5448a5c Remove tool calls for which there are no results in subsequent user messages. 2025-10-01 22:18:30 +02:00
Mario Zechner
f55985f633 Fix GPT-5 no-reasoning mode. Somewhat. There's no real off-switch ... 2025-09-19 01:45:00 +02:00
Mario Zechner
9e86079386 Fix block indexing in Google provider impl 2025-09-19 00:10:43 +02:00
Mario Zechner
2296dc4052 refactor(ai): improve error handling and stop reason types
- Add 'aborted' as a distinct stop reason separate from 'error'
- Change AssistantMessage.error to errorMessage for clarity
- Update error event to include reason field ('error' | 'aborted')
- Map provider-specific safety/refusal reasons to 'error' stop reason
- Reorganize utility functions into utils/ directory
- Rename agent.ts to agent-loop.ts for better clarity
- Fix error handling in all providers to properly distinguish abort from error
2025-09-18 19:57:13 +02:00
Mario Zechner
39c626b6c9 feat(ai): add partial JSON parsing for streaming tool calls
- Added partial-json package for parsing incomplete JSON during streaming
- Tool call arguments now contain partially parsed JSON during toolcall_delta events
- Enables progressive UI updates (e.g., showing file paths before content is complete)
- Arguments are always valid objects (minimum empty {}), never undefined
- Full validation still occurs at toolcall_end when arguments are complete
- Updated all providers (Anthropic, OpenAI Completions/Responses) to use parseStreamingJson
- Added comprehensive documentation and examples in README
- Added test to verify arguments are always defined during streaming
2025-09-16 12:23:34 +02:00
Mario Zechner
e8370436d7 Replace Zod with TypeBox for schema validation
- Switch from Zod to TypeBox for tool parameter schemas
- TypeBox schemas can be serialized/deserialized as JSON
- Use AJV for runtime validation instead of Zod's parse
- Add StringEnum helper for Google API compatibility (avoids anyOf/const patterns)
- Export Type and Static from main package for convenience
- Update all tests and documentation to reflect TypeBox usage
2025-09-16 01:10:40 +02:00
Mario Zechner
73d2119606 fix: Adjust max tokens for Anthropic and improve Google tools handling
- Reduce default max tokens for Anthropic to 1/3 of model max
- Fix Google provider to properly handle empty tools array
- Ensure toolConfig is undefined when no tools are present
2025-09-15 00:34:52 +02:00
Mario Zechner
433b42ac91 Fix Biome config, don't submit empty assistant messages to completions endpoint. 2025-09-09 21:47:40 +02:00
Mario Zechner
35fe8f21e9 feat(ai): Implement Zod-based tool validation and improve Agent API
- Replace JSON Schema with Zod schemas for tool parameter definitions
- Add runtime validation for all tool calls at provider level
- Create shared validation module with detailed error formatting
- Update Agent API with comprehensive event system
- Add agent tests with calculator tool for multi-turn execution
- Add abort test to verify proper handling of aborted requests
- Update documentation with detailed event flow examples
- Rename generate.ts to stream.ts for clarity
2025-09-09 14:58:54 +02:00
Mario Zechner
98a876f3a0 Fix streaming for z-ai in anthropic provider, add preliminary support for tool call streaming. Only reporting argument string deltas, not partial JSON objects 2025-09-09 04:26:56 +02:00
Mario Zechner
6c3580828d fix(ai): Ensure unique tool call IDs in Google provider
- Google provider sometimes returns duplicate or missing tool call IDs
- Added counter to ensure unique IDs for each tool call
- Check for duplicates and generate new ID when needed
- Fixes issues with multiple tool calls having the same ID
2025-09-04 12:41:58 +02:00
Mario Zechner
6679a83b32 fix(ai): Sanitize tool call IDs for Anthropic API compatibility
- Anthropic API requires tool call IDs to match pattern ^[a-zA-Z0-9_-]+$
- OpenAI Responses API generates IDs with pipe character (|) which breaks Anthropic
- Added sanitizeToolCallId() to replace invalid characters with underscores
- Fixes cross-provider handoffs from OpenAI Responses to Anthropic
- Added test to verify the fix works
2025-09-04 05:17:08 +02:00
Mario Zechner
acf0f5aee2 Clean-up 2025-09-03 00:01:32 +02:00
Mario Zechner
66cefb236e Massive refactor of API
- Switch to function based API
- Anthropic SDK style async generator
- Fully typed with escape hatches for custom models
2025-09-02 23:59:36 +02:00
Mario Zechner
004de3c9d0 feat(ai): Add new streaming generate API with AsyncIterable interface
- Implement QueuedGenerateStream class that extends AsyncIterable with finalMessage() method
- Add new types: GenerateStream, GenerateOptions, GenerateOptionsUnified, GenerateFunction
- Create generateAnthropic function-based implementation replacing class-based approach
- Add comprehensive test suite for the new generate API
- Support streaming events with text, thinking, and tool call deltas
- Map ReasoningEffort to provider-specific options
- Include apiKey in options instead of constructor parameter
2025-09-02 18:07:46 +02:00
Mario Zechner
efaa5cdb39 feat(ai): Fetch Anthropic, Google, and OpenAI models from models.dev instead of OpenRouter
- Updated generate-models.ts to fetch these providers directly from models.dev API
- OpenRouter now only used for xAI and other third-party providers
- Fixed test model IDs to match new model names from models.dev
- Removed unused import from google.ts
2025-09-02 01:18:59 +02:00
Mario Zechner
2cfd8ff3c3 fix(ai): Use API type instead of model for message compatibility checks
- Add getApi() method to all providers to identify the API type
- Add api field to AssistantMessage to track which API generated it
- Update transformMessages to check API compatibility instead of model
- Fixes issue where OpenAI Responses API failed when switching models
- Preserves thinking blocks and signatures when staying within same API
2025-09-02 00:20:06 +02:00
Mario Zechner
a62231987c fix(ai): Add anthropic-dangerous-direct-browser-access header
- Required header for browser-based access to Anthropic API
- Added to both OAuth and regular API key authentication
- Ensures full browser compatibility
2025-09-01 22:02:50 +02:00
Mario Zechner
da43e625f8 fix(ai): Add dangerouslyAllowBrowser flag for Anthropic client
- Enables browser support for Anthropic SDK
- Required for browser-based applications using the AI library
2025-09-01 21:55:52 +02:00
Mario Zechner
cf35215686 fix(ai): Fix browser compatibility for Anthropic OAuth tokens
- Check if process exists before modifying process.env
- Prevents errors in browser environments
- Maintains OAuth token functionality in Node.js
2025-09-01 21:46:22 +02:00
Mario Zechner
46b5800d36 feat(ai): Add cross-provider message handoff support
- Add transformMessages utility to handle cross-provider compatibility
- Convert thinking blocks to <thinking> tagged text when switching providers
- Preserve native thinking blocks when staying with same provider/model
- Add comprehensive handoff tests verifying all provider combinations
- Fix OpenAI Completions to return partial results on abort
- Update tool call ID format for Anthropic compatibility
- Document cross-provider handoff capabilities in README
2025-09-01 18:43:49 +02:00
Mario Zechner
bf1f410c2b refactor(ai): Update API to support partial results on abort
- Anthropic, Google, and OpenAI Responses providers now return partial results when aborted
- Restructured streaming to accumulate content blocks incrementally
- Prevents submission of thinking/toolCall blocks from aborted completions in multi-turn conversations
- Makes UI development easier by providing partial content even when requests are interrupted
2025-09-01 01:57:45 +02:00
Mario Zechner
7db3068cee fix(ai): Fix OpenAI Responses provider import order and cost calculation 2025-08-31 23:56:39 +02:00
Mario Zechner
a132b8140c feat(ai): Add start event emission to all providers
- Emit start event with model and provider info after creating stream
- Add abort signal tests for all providers
- Update README abort signal section to reflect non-throwing API
- Fix model references in README examples
2025-08-31 23:09:14 +02:00
Mario Zechner
ee4c131873 fix(ai): Fix OpenAI Responses provider multi-turn conversation support
- Collect complete output items during streaming instead of building blocks incrementally
- Handle reasoning summary parts with proper newline separation
- Support refusal content in message outputs
- Preserve full reasoning items and message IDs for multi-turn resubmission
- Emit proper streaming events for text and thinking deltas
2025-08-31 22:11:08 +02:00
Mario Zechner
a72e6d08d4 refactor(ai): Update OpenAI Completions provider to new content block API 2025-08-31 20:59:57 +02:00
Mario Zechner
7c8cdacc09 refactor(ai): Simplify Google provider with cleaner block handling 2025-08-31 20:31:08 +02:00
Mario Zechner
f29752ac82 refactor(ai): Update API to support multiple thinking and text blocks
BREAKING CHANGE: AssistantMessage now uses content array instead of separate fields
- Changed AssistantMessage.content from string to array of content blocks
- Removed separate thinking, toolCalls, and signature fields
- Content blocks can be TextContent, ThinkingContent, or ToolCall types
- Updated streaming events to include start/end events for text and thinking
- Fixed multiTurn test to handle new content structure

Note: Currently only Anthropic provider is updated to work with new API
Other providers need to be updated to match the new interface
2025-08-31 19:32:12 +02:00
Mario Zechner
cff766d3e2 fix(ai): Fix OpenAI Responses provider multi-turn conversation support
- Added contentSignature tracking for assistant messages
- Fixed message format in convertToResponsesFormat (output_text instead of input_text)
- Properly preserve message IDs for multi-turn conversations
- Added proper ResponseOutputMessage type satisfaction
- Updated tests to cover more providers and multi-turn scenarios
2025-08-30 22:55:29 +02:00
Mario Zechner
2e90f8f8bc feat(ai): Enable browser support for OpenAI providers
- Added dangerouslyAllowBrowser: true to OpenAI client initialization
- Allows usage of the library in browser environments
- Applies to both OpenAICompletionsLLM and OpenAIResponsesLLM
2025-08-30 22:29:14 +02:00
Mario Zechner
796e48b80e feat(ai): Add image input tests for vision-capable models
- Added image tests to OpenAI Completions (gpt-4o-mini)
- Added image tests to Anthropic (claude-sonnet-4-0)
- Added image tests to Google (gemini-2.5-flash)
- Tests verify models can process and describe the red circle test image
2025-08-30 18:37:17 +02:00
Mario Zechner
545d04fc5c fix(ai): Fix OpenAI completions store field type issue
- Cast params to any for store field assignment
- Maintains compatibility with providers that don't support store field
2025-08-30 01:08:08 +02:00
Mario Zechner
550da5e47c feat(ai): Add cost tracking to LLM implementations
- Track input/output token costs for all providers
- Calculate costs based on Model pricing information
- Include cost information in AssistantMessage responses
- Add Usage interface with detailed cost breakdown
- Implement calculateCost utility function for cost calculations
2025-08-30 00:45:08 +02:00
Mario Zechner
f9d688d577 refactor(ai): Update LLM implementations to use Model objects
- LLM constructors now take Model objects instead of string IDs
- Added provider field to AssistantMessage interface
- Updated getModel function with type-safe model ID autocomplete
- Fixed Anthropic model ID mapping for proper API aliases
- Added baseUrl to Model interface for provider-specific endpoints
- Updated all tests to use getModel for model instantiation
- Removed deprecated models.json in favor of generated models
2025-08-30 00:21:03 +02:00
Mario Zechner
c7618db3f7 refactor(ai): Implement unified model system with type-safe createLLM
- Add Model interface to types.ts with normalized structure
- Create type-safe generic createLLM function with provider-specific model constraints
- Generate models from OpenRouter API and models.dev data
- Strip provider prefixes for direct providers (google, openai, anthropic, xai)
- Keep full model IDs for OpenRouter-proxied models
- Clean separation: types.ts (Model interface), models.ts (factory logic), models.generated.ts (data)
- Remove old model scripts and unused dependencies
- Rename GeminiLLM to GoogleLLM for consistency
- Add tests for new providers (xAI, Groq, Cerebras, OpenRouter)
- Support 181 tool-capable models across 7 providers with full type safety
2025-08-29 23:19:47 +02:00
Mario Zechner
3f36051bc6 feat(ai): Migrate tests to Vitest and add provider test coverage
- Switch from Node.js test runner to Vitest for better DX
- Add test suites for Grok, Groq, Cerebras, and OpenRouter providers
- Add Ollama test suite with automatic server lifecycle management
- Include thinking mode and multi-turn tests for all providers
- Remove example files (consolidated into test suite)
- Add VS Code test configuration
2025-08-29 21:32:45 +02:00
Mario Zechner
4bb3a5ad02 feat(ai): Add OpenAI-compatible provider examples for multiple services
- Add examples for Cerebras, Groq, Ollama, and OpenRouter
- Update OpenAI Completions provider to handle base URL properly
- Simplify README formatting
- All examples use the same OpenAICompletionsLLM provider with different base URLs
2025-08-25 17:41:47 +02:00
Mario Zechner
7a6852081d test(ai): Add comprehensive E2E tests for all AI providers
- Add multi-turn test to verify thinking and tool calling work together
- Test thinkingSignature handling for proper multi-turn context
- Fix Gemini provider to generate base64 thinkingSignature when needed
- Handle multiple rounds of tool calls in tests (Gemini behavior)
- Make thinking tests more robust for model-dependent behavior
- All 18 tests passing across 4 providers
2025-08-25 15:54:26 +02:00
Mario Zechner
289e60ab88 fix(ai): Correct Gemini thinking config structure
- Fixed thinkingConfig to be at top level, not nested under 'config'
- Matches Gemini API documentation structure
2025-08-25 10:33:58 +02:00
Mario Zechner
3e1422d3d7 feat(ai): Add proper thinking support for Gemini 2.5 models
- Added thinkingConfig with includeThoughts and thinkingBudget support
- Use part.thought boolean flag to detect thinking content per API docs
- Capture and preserve thought signatures for multi-turn function calling
- Added supportsThinking() check for Gemini 2.5 series models
- Updated example to demonstrate thinking configuration
- Handle SDK type limitations with proper type assertions
2025-08-25 10:26:23 +02:00
Mario Zechner
a8ba19f0b4 feat(ai): Implement Gemini provider with streaming and tool support
- Added GeminiLLM provider implementation with GoogleGenerativeAI SDK
- Supports streaming with text/thinking content and completion signals
- Handles Gemini's parts-based content system (text, thought, functionCall)
- Implements tool/function calling with proper format conversion
- Maps between unified types and Gemini-specific formats (model vs assistant role)
- Added test example matching other provider patterns
- Fixed typo in AssistantMessage type (stopResaon -> stopReason) across all providers
2025-08-24 20:41:32 +02:00
Mario Zechner
cb4c32faaa refactor(ai): Add completion signal to onText/onThinking callbacks
- Update LLMOptions interface to include completion boolean parameter
- Modify all providers to signal when text/thinking blocks are complete
- Update examples to handle the completion parameter
- Move documentation files to docs/ directory
2025-08-24 20:33:26 +02:00
Mario Zechner
8364ecde4a feat(ai): Add OpenAI Completions and Responses API providers
- Implement OpenAICompletionsLLM for Chat Completions API with streaming
- Implement OpenAIResponsesLLM for Responses API with reasoning support
- Update types to use LLM/Context instead of AI/Request
- Add support for reasoning tokens, tool calls, and streaming
- Create test examples for both OpenAI providers
- Update Anthropic provider to match new interface
2025-08-24 20:18:10 +02:00
Mario Zechner
e5aedfed29 feat(ai): Implement unified AI API with Anthropic provider
- Define clean API with complete() method and callbacks for streaming
- Add comprehensive type system for messages, tools, and usage
- Implement AnthropicAI provider with full feature support:
  - Thinking/reasoning with signatures
  - Tool calling with parallel execution
  - Streaming via callbacks (onText, onThinking)
  - Proper error handling and stop reasons
  - Cache tracking for input/output tokens
- Add working test/example demonstrating tool execution flow
- Support for system prompts, temperature, max tokens
- Proper message role types: user, assistant, toolResult
2025-08-17 23:30:20 +02:00