Changelog

[Unreleased]

Anthropic SDK retries disabled: Set maxRetries: 0 on the Anthropic client so that retries are handled at the application level. The SDK's built-in retries were interfering with coding-agent's retry logic. (#157)
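
With SDK retries turned off, the caller owns the retry policy. The following is a minimal sketch of what application-level handling could look like. `withRetry` and `backoffDelayMs` are hypothetical helpers written for illustration, not coding-agent's actual retry logic, and the client construction appears only as a comment so the sketch runs without the SDK installed.

```typescript
// Assumed client setup, using the option named in this entry:
//   const client = new Anthropic({ apiKey, maxRetries: 0 });

// Hypothetical helper: exponential backoff delay for a zero-based attempt index.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical helper: run fn up to maxAttempts times, sleeping between failures.
// The last error is rethrown once all attempts are used up.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelayMs(attempt, baseMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}
```

A real policy would likely also inspect the error (status code, abort signal) before deciding to retry; that part is left out here.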

Mistral provider: Added support for Mistral AI models via the OpenAI-compatible API. Includes automatic handling of Mistral-specific requirements (tool call ID format). Set the MISTRAL_API_KEY environment variable to use it.

Fixed Mistral 400 errors after aborted assistant messages by skipping empty assistant messages (no content, no tool calls) (#165)
Removed synthetic assistant bridge message after tool results for Mistral (no longer required as of Dec 2025) (#165)
Fixed a bug where the ANTHROPIC_API_KEY environment variable was deleted globally after the first OAuth token usage, causing subsequent prompts to fail (#164)
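
The Mistral empty-message fix (#165) amounts to filtering the conversation before it is sent. A minimal sketch of that filter, assuming a simplified message shape: `MistralSketchMessage` and both function names are hypothetical and do not mirror the library's real types.

```typescript
// Simplified, assumed message shape for illustration only.
interface MistralSketchMessage {
  role: "user" | "assistant" | "toolResult";
  content: string;
  toolCalls?: { id: string; name: string }[];
}

// An assistant message is "empty" when it has no text and no tool calls,
// which is what an aborted response can leave behind.
function isEmptyAssistant(m: MistralSketchMessage): boolean {
  const hasToolCalls = (m.toolCalls?.length ?? 0) > 0;
  return m.role === "assistant" && m.content.trim() === "" && !hasToolCalls;
}

// Drop empty assistant messages; everything else passes through unchanged.
function dropEmptyAssistantMessages(
  messages: MistralSketchMessage[],
): MistralSketchMessage[] {
  return messages.filter((m) => !isEmptyAssistant(m));
}
```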

agentLoopContinue function: Continues an agent loop from existing context without adding a new user message. Validates that the last message is a user or toolResult message. Useful for retrying after context overflow or resuming from manually added tool results.
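
The precondition described above can be sketched as a small check. `canContinueFrom` and the `ContinueSketchRole` type are hypothetical names for illustration, and rejecting an empty context is an assumption rather than documented behavior.

```typescript
// Assumed set of message roles, for illustration only.
type ContinueSketchRole = "user" | "assistant" | "toolResult";

// Returns true when the context ends in a message the model can respond to.
// An empty context is treated as not continuable (assumption).
function canContinueFrom(messages: { role: ContinueSketchRole }[]): boolean {
  if (messages.length === 0) return false;
  const last = messages[messages.length - 1];
  return last.role === "user" || last.role === "toolResult";
}
```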

Removed provider-level tool argument validation. Validation now happens in agentLoop via executeToolCalls, allowing models to retry on validation errors. For manual tool execution, use validateToolCall(tools, toolCall) or validateToolArguments(tool, toolCall).
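
Moving validation into the loop means a bad argument set can be reported back to the model instead of surfacing as a thrown provider error. The sketch below illustrates that idea with stand-in code. `SketchTool`, `SketchToolCall`, `validateSketch`, and the required-keys check are hypothetical simplifications, not the library's validateToolCall implementation or its schema format.

```typescript
// Stand-in types; a real tool would carry a full parameter schema.
interface SketchTool {
  name: string;
  required: string[];
}
interface SketchToolCall {
  name: string;
  arguments: Record<string, unknown>;
}
interface SketchValidation {
  ok: boolean;
  error?: string;
}

// Find the tool by name, then check its arguments. Failures are returned
// as data so the caller can hand the message back to the model for a retry.
function validateSketch(tools: SketchTool[], call: SketchToolCall): SketchValidation {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) {
    return { ok: false, error: `Unknown tool: ${call.name}` };
  }
  const missing = tool.required.filter((key) => !(key in call.arguments));
  if (missing.length > 0) {
    return { ok: false, error: `Missing arguments: ${missing.join(", ")}` };
  }
  return { ok: true };
}
```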

Added validateToolCall(tools, toolCall) helper that finds the tool by name and validates arguments.
OpenAI compatibility overrides: Added compat field to Model for openai-completions API, allowing explicit configuration of provider quirks (supportsStore, supportsDeveloperRole, supportsReasoningEffort, maxTokensField). Falls back to URL-based detection if not set. Useful for LiteLLM, custom proxies, and other non-standard endpoints. (#133, thanks @fink-andreas for the initial idea and PR)
xhigh reasoning level: Added xhigh to ReasoningEffort type for OpenAI codex-max models. For non-OpenAI providers (Anthropic, Google), xhigh is automatically mapped to high. (#143)
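
The xhigh fallback can be pictured as a one-line mapping. In this sketch the `SketchEffort` members other than high and xhigh, the provider identifiers, and the function name `effectiveEffort` are assumptions for illustration.

```typescript
// Assumed effort levels and provider identifiers, for illustration only.
type SketchEffort = "low" | "medium" | "high" | "xhigh";
type SketchProvider = "openai" | "anthropic" | "google";

// xhigh is kept for OpenAI and downgraded to high everywhere else.
function effectiveEffort(provider: SketchProvider, effort: SketchEffort): SketchEffort {
  if (effort === "xhigh" && provider !== "openai") {
    return "high";
  }
  return effort;
}
```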

Updated SDK versions: OpenAI SDK 5.21.0 → 6.10.0, Anthropic SDK 0.61.0 → 0.71.2, Google GenAI SDK 1.30.0 → 1.31.0

Added totalTokens field to Usage type: All code that constructs Usage objects must now include the totalTokens field. This field represents the total tokens processed by the LLM (input + output + cache). For OpenAI and Google, this uses native API values (total_tokens, totalTokenCount). For Anthropic, it's computed as input + output + cacheRead + cacheWrite.
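
For code that builds Usage objects by hand, the Anthropic-style computation described above could look like this. `SketchUsage` is a reduced stand-in containing only the fields named in this entry (the real Usage type may carry more), and `withComputedTotal` is a hypothetical helper.

```typescript
// Reduced stand-in for the Usage type: only the fields this entry names.
interface SketchUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  totalTokens: number;
}

// Fill in totalTokens for providers that do not report a native total.
function withComputedTotal(u: Omit<SketchUsage, "totalTokens">): SketchUsage {
  return {
    ...u,
    totalTokens: u.input + u.output + u.cacheRead + u.cacheWrite,
  };
}
```

For providers that report a native total (total_tokens, totalTokenCount), that value would be used directly instead of the sum.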

OpenAI Token Counting: Fixed usage.input to exclude cached tokens for OpenAI providers. Previously, input included cached tokens, causing double-counting when calculating total context size via input + cacheRead. Now input represents non-cached input tokens across all providers, making input + output + cacheRead + cacheWrite the correct formula for total context size.
Fixed Claude Opus 4.5 cache pricing (was 3x too expensive)
- Corrected cache_read: $1.50 → $0.50 per MTok
- Corrected cache_write: $18.75 → $6.25 per MTok
- Added manual override in scripts/generate-models.ts until upstream fix is merged
- Submitted PR to models.dev: https://github.com/sst/models.dev/pull/439
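
The effect of the pricing correction can be checked with simple per-MTok arithmetic. The prices are the ones listed above; the helper name `cacheCostUsd` and the token counts in the usage note are made up for illustration.

```typescript
// Corrected and previous Claude Opus 4.5 cache prices, in USD per million tokens.
const CORRECTED_PRICE = { cacheRead: 0.5, cacheWrite: 6.25 };
const PREVIOUS_PRICE = { cacheRead: 1.5, cacheWrite: 18.75 };

// Cost in USD of the cache portion of a request at the given per-MTok prices.
function cacheCostUsd(
  cacheReadTokens: number,
  cacheWriteTokens: number,
  price: { cacheRead: number; cacheWrite: number } = CORRECTED_PRICE,
): number {
  return (cacheReadTokens * price.cacheRead + cacheWriteTokens * price.cacheWrite) / 1e6;
}
```

For example, 2,000,000 cache-read tokens cost $1.00 at the corrected price and $3.00 at the previous one, which is the 3x difference noted in the entry.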

Initial release with multi-provider LLM support.