- Collect complete output items during streaming instead of building blocks incrementally
- Handle reasoning summary parts with proper newline separation
- Support refusal content in message outputs
- Preserve full reasoning items and message IDs for multi-turn resubmission
- Emit proper streaming events for text and thinking deltas
BREAKING CHANGE: AssistantMessage now uses content array instead of separate fields
- Changed AssistantMessage.content from string to array of content blocks
- Removed separate thinking, toolCalls, and signature fields
- Content blocks can be TextContent, ThinkingContent, or ToolCall types
- Updated streaming events to include start/end events for text and thinking
- Fixed multiTurn test to handle new content structure
Note: Currently only Anthropic provider is updated to work with new API
Other providers need to be updated to match the new interface
- Added contentSignature tracking for assistant messages
- Fixed message format in convertToResponsesFormat (output_text instead of input_text)
- Properly preserve message IDs for multi-turn conversations
- Added proper ResponseOutputMessage type satisfaction
- Updated tests to cover more providers and multi-turn scenarios
- Added image tests to OpenAI Completions (gpt-4o-mini)
- Added image tests to Anthropic (claude-sonnet-4-0)
- Added image tests to Google (gemini-2.5-flash)
- Tests verify models can process and describe the red circle test image
- LLM constructors now take Model objects instead of string IDs
- Added provider field to AssistantMessage interface
- Updated getModel function with type-safe model ID autocomplete
- Fixed Anthropic model ID mapping for proper API aliases
- Added baseUrl to Model interface for provider-specific endpoints
- Updated all tests to use getModel for model instantiation
- Removed deprecated models.json in favor of generated models
- Add proper Anthropic model aliases (claude-opus-4-1, claude-sonnet-4-0, etc.)
- Deduplicate models when same ID appears in both models.dev and OpenRouter
- models.dev takes priority over OpenRouter for duplicate IDs
- Fix test to use correct claude-3-5-haiku-latest alias
- Reduces Anthropic models from 11 to 10 (removed duplicate)
- Add Model interface to types.ts with normalized structure
- Create type-safe generic createLLM function with provider-specific model constraints
- Generate models from OpenRouter API and models.dev data
- Strip provider prefixes for direct providers (google, openai, anthropic, xai)
- Keep full model IDs for OpenRouter-proxied models
- Clean separation: types.ts (Model interface), models.ts (factory logic), models.generated.ts (data)
- Remove old model scripts and unused dependencies
- Rename GeminiLLM to GoogleLLM for consistency
- Add tests for new providers (xAI, Groq, Cerebras, OpenRouter)
- Support 181 tool-capable models across 7 providers with full type safety
- Switch from Node.js test runner to Vitest for better DX
- Add test suites for Grok, Groq, Cerebras, and OpenRouter providers
- Add Ollama test suite with automatic server lifecycle management
- Include thinking mode and multi-turn tests for all providers
- Remove example files (consolidated into test suite)
- Add VS Code test configuration
- Add multi-turn test to verify thinking and tool calling work together
- Test thinkingSignature handling for proper multi-turn context
- Fix Gemini provider to generate base64 thinkingSignature when needed
- Handle multiple rounds of tool calls in tests (Gemini behavior)
- Make thinking tests more robust for model-dependent behavior
- All 18 tests passing across 4 providers