155 commits

Author SHA1 Message Date
Mario Zechner
09d409cc92 Fix z.ai thinking/reasoning params, fixes #688
Z.ai uses thinking: { type: "enabled" | "disabled" } instead of
OpenAI's reasoning_effort. Added thinkingFormat compat flag to handle
this. Thinking is now explicitly enabled/disabled based on user setting.
2026-01-13 18:34:07 +01:00
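The two request shapes described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `thinkingFormat` flag name and the `thinking: { type: ... }` shape come from the commit message, while `CompatFlags`, `ReasoningEffort`, `buildThinkingParams`, and the flag values `"openai"` and `"zai"` are hypothetical.

```typescript
// Hypothetical types and names, for illustration only.
type ThinkingFormat = "openai" | "zai";
type ReasoningEffort = "low" | "medium" | "high";

interface CompatFlags {
  // Assumed to fall back to OpenAI-style behaviour when unset.
  thinkingFormat?: ThinkingFormat;
}

function buildThinkingParams(
  compat: CompatFlags,
  effort: ReasoningEffort | undefined,
): Record<string, unknown> {
  if (compat.thinkingFormat === "zai") {
    // Z.ai style: always state explicitly whether thinking is on or off.
    return { thinking: { type: effort ? "enabled" : "disabled" } };
  }
  // OpenAI style: send reasoning_effort only when reasoning was requested.
  return effort ? { reasoning_effort: effort } : {};
}
```

With this sketch, a Z.ai model with thinking switched off still gets an explicit `thinking: { type: "disabled" }` field, whereas the OpenAI-style branch sends nothing.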
Markus Ylisiurunen
00ba005e50
set the prompt cache key to session id (#698) 2026-01-13 18:29:36 +01:00
Mario Zechner
28072cb31f Add more models to stream.test.ts for Vercel, set infinite timeout on OpenAI responses, closes #690 2026-01-13 17:08:56 +01:00
Mario Zechner
3c60ffa677 Fix tool call ID normalization for cross-provider switches to Anthropic/GitHub Copilot 2026-01-13 04:07:10 +01:00
Mario Zechner
8af8d0d672 Add MiniMax provider support (#656 by @dannote)
- Add minimax to KnownProvider and Api types
- Add MINIMAX_API_KEY to getEnvApiKey()
- Generate MiniMax-M2 and MiniMax-M2.1 models
- Add context overflow detection pattern
- Add tests to all required test files
- Update README and CHANGELOG with attribution

Also fixes:
- Bedrock duplicate toolResult ID when content has multiple blocks
- Sandbox extension unused parameter lint warning
2026-01-13 02:27:09 +01:00
Ahmed Kamal
ff15414258
Improve Gemini CLI provider retries and headers (#670)

- Add Antigravity endpoint fallback (tries daily sandbox then prod when baseUrl is unset)
- Parse retry delays from headers (Retry-After, x-ratelimit-reset, x-ratelimit-reset-after) before body parsing
- Derive stable sessionId from first user message for cache affinity
- Retry empty SSE streams with backoff without duplicate start/done events
- Add anthropic-beta header for Claude thinking models only
2026-01-13 01:04:53 +01:00
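The header-based retry delay parsing mentioned above could look something like the sketch below. The header names are taken from the commit message; everything else is an assumption, including the helper names, the lookup order, and the reading of `x-ratelimit-reset` as epoch seconds and `x-ratelimit-reset-after` as a number of seconds. `Retry-After` is handled as either delay seconds or an HTTP date, as the HTTP specification allows.

```typescript
// Minimal structural type so the sketch works with fetch Headers or a stub.
type HeaderSource = { get(name: string): string | null };

// Hypothetical helper: parse a header value as a finite number, if possible.
function parseNumber(value: string | null): number | undefined {
  if (value === null || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

// Hypothetical helper: returns a delay in milliseconds, or undefined when no
// header gives a usable hint and the caller should fall back to body parsing.
function parseRetryDelayMs(
  headers: HeaderSource,
  now: number = Date.now(),
): number | undefined {
  const retryAfter = headers.get("retry-after");
  const retryAfterSeconds = parseNumber(retryAfter);
  if (retryAfterSeconds !== undefined) return Math.max(0, retryAfterSeconds * 1000);
  if (retryAfter !== null) {
    // Retry-After may also carry an HTTP date instead of a number.
    const date = Date.parse(retryAfter);
    if (!Number.isNaN(date)) return Math.max(0, date - now);
  }

  // Assumption: x-ratelimit-reset is an absolute time in epoch seconds.
  const resetAt = parseNumber(headers.get("x-ratelimit-reset"));
  if (resetAt !== undefined) return Math.max(0, resetAt * 1000 - now);

  // Assumption: x-ratelimit-reset-after is a relative delay in seconds.
  const resetAfter = parseNumber(headers.get("x-ratelimit-reset-after"));
  if (resetAfter !== undefined) return Math.max(0, resetAfter * 1000);

  return undefined;
}
```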
Danila Poyarkov
9e4ae98358
Improve Google Cloud Code Assist error handling (#665)
* Improve Cloud Code Assist error messages

- Extract just the message from verbose JSON error responses
- Extract cause from generic 'fetch failed' errors for better diagnostics

* Make 'other side closed' network error retryable
2026-01-13 00:41:20 +01:00
Mario Zechner
d442bbcc19 feat(ai): Add prompt caching for Claude models on Bedrock
Adds cache points to system prompt and last user message for:
- Claude 3.5 Haiku
- Claude 3.7 Sonnet
- Claude 4.x models (Opus, Sonnet, Haiku)

Uses Bedrock's cachePoint blocks with 5-minute TTL.
2026-01-13 00:38:12 +01:00
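As a rough sketch of the cache point placement described above: the types below are simplified stand-ins for Converse API content blocks, and `addCachePoints` is a hypothetical name. The check for whether a given model supports caching is omitted here.

```typescript
// Simplified stand-ins for Bedrock Converse content blocks.
type ContentBlock = { text: string } | { cachePoint: { type: "default" } };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

const CACHE_POINT: ContentBlock = { cachePoint: { type: "default" } };

// Hypothetical helper: appends a cache point after the system prompt and
// after the content of the last user message, without mutating the inputs.
function addCachePoints(
  system: ContentBlock[],
  messages: Message[],
): { system: ContentBlock[]; messages: Message[] } {
  const cachedSystem = system.length > 0 ? [...system, CACHE_POINT] : system;

  let lastUser = -1;
  messages.forEach((m, i) => {
    if (m.role === "user") lastUser = i;
  });

  const cachedMessages = messages.map((m, i) =>
    i === lastUser ? { ...m, content: [...m.content, CACHE_POINT] } : m,
  );

  return { system: cachedSystem, messages: cachedMessages };
}
```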
Mario Zechner
fd268479a4 feat(ai): Add Amazon Bedrock provider (#494)
Adds support for Amazon Bedrock with Claude models including:
- Full streaming support via Converse API
- Reasoning/thinking support for Claude models
- Cross-region inference model ID handling
- Multiple AWS credential sources (profile, IAM keys, API keys)
- Image support in messages and tool results
- Unicode surrogate sanitization

Also adds 'Adding a New Provider' documentation to AGENTS.md and README.

Co-authored-by: nickchan2 <nickchan2@users.noreply.github.com>
2026-01-13 00:32:59 +01:00
Markus Ylisiurunen
4f216d318f
Apply service tier pricing (#675) 2026-01-12 23:56:51 +01:00
nathyong
7b2c627079
Insert cache point on openrouter+anthropic completions (#584)
Co-authored-by: nathyong <nathyong@noreply.github.com>
2026-01-12 23:29:33 +01:00
Markus Ylisiurunen
7b79e8ec51
Add service tier option for OpenAI Responses API (#672)
* add service tier option for OpenAI responses

* add serviceTier option for OpenAI Responses requests
2026-01-12 23:20:18 +01:00
Mario Zechner
0138eee6f7 Fix tool mapping 2026-01-12 17:56:13 +01:00
Danila Poyarkov
7a41975e9e
Fix Claude via Google APIs requiring tool call IDs (#653)
Claude models accessed through Google Cloud Code Assist API require
explicit id fields in both functionCall and functionResponse parts.
Without these IDs, the API returns 'tool_use.id: Field required' error.

Add requiresToolCallId() helper to centralize the Claude model detection
and include IDs in both tool call and tool result message conversions.
2026-01-12 16:40:07 +01:00
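One way to picture this change is the sketch below. Only the helper name `requiresToolCallId` comes from the commit message; its body (a simple prefix check on the model ID), the part type, and `toFunctionCallPart` are assumptions made for illustration.

```typescript
// The helper name is from the commit; the detection rule here is a guess.
function requiresToolCallId(modelId: string): boolean {
  return modelId.toLowerCase().startsWith("claude-");
}

// Simplified stand-in for a Google-style functionCall part.
interface FunctionCallPart {
  functionCall: { name: string; args: Record<string, unknown>; id?: string };
}

// Hypothetical converter: includes the id only for models that require it.
function toFunctionCallPart(
  modelId: string,
  call: { id: string; name: string; args: Record<string, unknown> },
): FunctionCallPart {
  return {
    functionCall: {
      name: call.name,
      args: call.args,
      ...(requiresToolCallId(modelId) ? { id: call.id } : {}),
    },
  };
}
```

Centralizing the check in one helper means the tool call and tool result conversions cannot disagree about which models need IDs.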
Danila Poyarkov
934e7e470b
Avoid cross-provider thought signatures (#654)
* Avoid cross-provider thought signatures

* Fix Google thought signature replay

Filter thought signatures to same provider with base64 validation and rename the transform helper for clarity.
2026-01-12 16:38:53 +01:00
theBucky
a315cfe813 fix(ai): complete textSignature round-trip for Google providers
- Store thoughtSignature on text blocks during streaming (all 3 providers)
- Replay textSignature as thoughtSignature in convertMessages
- Remove redundant conditional since retainThoughtSignature handles undefined

Per Google docs, text part signatures are optional but recommended for
high-quality reasoning in multi-turn conversations.
2026-01-11 19:25:38 +01:00
theBucky
4f757fbe23 fix(ai): correct Google thinking detection and remove unsupported id fields
- isThinkingPart now only checks thought === true, not thoughtSignature
- thoughtSignature is for context replay and can appear on any part type
- Store thoughtSignature on text blocks as textSignature for proper replay
- Remove id from functionCall/functionResponse (unsupported by Vertex/Cloud Code Assist)

Refs: https://ai.google.dev/gemini-api/docs/thought-signatures
Co-authored-by: Amp <amp@ampcode.com>
2026-01-11 19:25:38 +01:00
Mario Zechner
ec83d91473 fix(ai): resolve OAuth tool names via context 2026-01-10 13:45:08 +01:00
Mario Zechner
6dcb64565a Prepare for alternative Codex harness certification 2026-01-10 13:22:10 +01:00
Mario Zechner
14be8efba8 Merge PR #596: Add supportsUsageInStreaming compat flag 2026-01-10 00:34:29 +01:00
Mario Zechner
52ce113754 Add supportsUsageInStreaming compat flag for OpenAI-compatible providers
Renamed from supportsStreamOptions to clarify this controls stream_options: { include_usage: true }.
Defaults to true (no behavioral change for existing providers).
Providers like gatewayz.ai that reject this parameter can set supportsUsageInStreaming: false in model config.

Based on #596 by @XesGaDeus
2026-01-10 00:34:06 +01:00
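A minimal sketch of how such a flag might gate the request parameter follows. The flag name, its default of true, and `stream_options: { include_usage: true }` are from the commit message; `OpenAICompat` and `buildStreamParams` are hypothetical names.

```typescript
interface OpenAICompat {
  // Defaults to true when unset, per the commit message.
  supportsUsageInStreaming?: boolean;
}

// Hypothetical helper: builds the streaming-related request parameters.
function buildStreamParams(compat: OpenAICompat = {}): Record<string, unknown> {
  const params: Record<string, unknown> = { stream: true };
  if (compat.supportsUsageInStreaming ?? true) {
    params.stream_options = { include_usage: true };
  }
  return params;
}
```

A provider that rejects the parameter would then be configured with `supportsUsageInStreaming: false`, and the `stream_options` key is left out of the request entirely.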
Mario Zechner
a613306e11 fix(ai): disable strict mode for OpenAI completions tool schemas
OpenRouter with models like openai/gpt-5.2 enforces strict mode which
requires all properties in the required array. Setting strict: false
allows optional parameters without null unions, matching the approach
already used in openai-responses.ts.
2026-01-10 00:19:02 +01:00
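For illustration, a tool definition with strict mode disabled might be built along these lines. The `ToolDef` type and `toOpenAITool` function are hypothetical; the placement of `strict` inside the `function` object follows the Chat Completions tool format.

```typescript
// Hypothetical internal tool description.
interface ToolDef {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
}

// Hypothetical converter to a Chat Completions tool entry.
function toOpenAITool(tool: ToolDef) {
  return {
    type: "function" as const,
    function: {
      name: tool.name,
      description: tool.description,
      parameters: tool.parameters,
      // Explicit false: optional properties may stay out of `required`.
      strict: false,
    },
  };
}
```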
Mario Zechner
fe98895706 Better error messages on OpenRouter via openai-completions 2026-01-10 00:09:51 +01:00
Mario Zechner
35690f6d1a Merge branch 'fix/lazy-homedir-env-first' 2026-01-09 22:09:27 +01:00
gnattu
58b903690b
Set strict parameter to false in OpenAI response mapping (#598)
The OpenAI-like API endpoint hosted by LM Studio requires this parameter to be either a defined boolean or omitted entirely. A null value fails API validation.
2026-01-09 20:32:58 +01:00
Mario Zechner
60f5a03576 Add [Unreleased] section for next cycle 2026-01-09 20:24:50 +01:00
Helmut Januschka
b4351040a7
pi pi pi pew (#594) 2026-01-09 12:43:00 +01:00
xes garcia
732d46123b fix for gatewayz provider 2026-01-09 10:58:05 +01:00
jhyang
d2882c2643 Resolve os.homedir() lazily instead of at module load time
- Move homedir() calls into functions for lazy evaluation
- Add GOOGLE_APPLICATION_CREDENTIALS support for Vertex AI
2026-01-09 16:09:54 +08:00
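The difference between resolving the home directory at module load and resolving it lazily can be sketched as below. The environment variable `EXAMPLE_AGENT_DIR`, the directory name `.example-agent`, and `getConfigDir` are all hypothetical; checking the environment first is an assumption suggested by the branch name `fix/lazy-homedir-env-first`.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical environment variable, for illustration only.
const CONFIG_DIR_ENV = "EXAMPLE_AGENT_DIR";

// Eager version (the pattern being avoided): this would run at import time,
// even when an override is set or the home directory is not available.
// const CONFIG_DIR = join(homedir(), ".example-agent");

// Lazy version: homedir() is only called when the path is actually needed,
// and only if no environment override is present.
function getConfigDir(): string {
  const fromEnv = process.env[CONFIG_DIR_ENV];
  if (fromEnv) return fromEnv;
  return join(homedir(), ".example-agent");
}
```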
Mario Zechner
f745321169 Clean-up. 2026-01-09 05:23:08 +01:00
Mario Zechner
f5e6bcac1b Remove Anthropic OAuth support 2026-01-09 05:10:33 +01:00
Mario Zechner
ef7c52ffa1 chore: fix template literal lint, update AGENTS.md to require fixing all check output 2026-01-08 23:44:26 +01:00
Mario Zechner
16e142ef7d fix(ai): remove <thinking> tag wrapping, convert to plain text on cross-model handoff
- Remove <thinking> tag generation from google-shared.ts, transorm-messages.ts, openai-completions.ts
- Thinking blocks now convert to plain text when switching models (prevents models mimicking tags)
- Skip empty thinking blocks to avoid API errors
- Keep thinking blocks only when same provider AND same model

fixes #561
2026-01-08 21:19:16 +01:00
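The handoff rules listed in this commit can be sketched roughly like this. The block and message types are simplified stand-ins, and `transformForTarget` is a hypothetical name, not the repository's function.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string };

interface AssistantMessage {
  provider: string;
  model: string;
  content: Block[];
}

// Hypothetical helper: prepares a stored assistant message for replay.
function transformForTarget(
  msg: AssistantMessage,
  target: { provider: string; model: string },
): Block[] {
  const sameModel = msg.provider === target.provider && msg.model === target.model;
  const out: Block[] = [];
  for (const block of msg.content) {
    if (block.type !== "thinking") {
      out.push(block);
      continue;
    }
    // Skip empty thinking blocks, which can trigger API errors.
    if (block.thinking.trim() === "") continue;
    // Keep thinking only for the same provider and model; otherwise emit
    // plain text with no <thinking> tags for the next model to mimic.
    out.push(sameModel ? block : { type: "text", text: block.thinking });
  }
  return out;
}
```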
Mario Zechner
aa89080ea0 fix(ai): add bridge prompt to override Antigravity behavior with Pi defaults 2026-01-08 20:20:24 +01:00
Mario Zechner
31f155d7db
Merge pull request #571 from ben-vargas/fix-antigravity-patch
fix(ai): align antigravity request payload
2026-01-08 20:00:40 +01:00
Ben Vargas
74476be61d fix(ai): align antigravity request payload 2026-01-08 10:00:44 -07:00
Thomas Mustier
a65da1c14b fix: ESC key not interrupting during Working... state
Three related fixes:

1. google-gemini-cli: Handle abort signal in stream reading loop
   - Add abort event listener to cancel reader immediately when signal fires
   - Fix AbortError detection in retry catch block (fetch throws AbortError,
     not our custom message)
   - Swallow reader.cancel() rejection to avoid unhandled promise

2. agent-session: Fix retry attempt counter showing 0 on cancel
   - abortRetry() was resetting _retryAttempt before the catch block could
     read it for the error message

3. interactive-mode: Restore main escape handler on agent_start
   - When auto-retry starts, onEscape is replaced with retry-specific handler
   - auto_retry_end (which restores it) fires on turn_end, after streaming begins
   - Now restore immediately on agent_start if retry handler is still active

Amended: suppress reader.cancel() rejection on abort.
2026-01-08 12:35:34 +00:00
Thomas Mustier
6052453f4f fix(ai): improve codex stream error details 2026-01-07 22:44:22 +00:00
Zhou Rui
d893ba7f20
fix(ai): clean up openai-codex models and token limits 2026-01-07 23:09:20 +08:00
Mario Zechner
03e3f0d801
Merge pull request #510 from mitsuhiko/annotate-bridge-prompt
Annotate bridge prompt
2026-01-06 23:47:02 +01:00
Ahmed Kamal
e42e9e6305 fix(ai): classify Google thoughtSignature as thinking
Google streaming may emit thoughtSignature without thought=true (including empty-text signature-only parts). Treat non-empty thoughtSignature as thinking to avoid leaking reasoning into normal text and retain signature across streaming deltas. Add unit test coverage.
2026-01-06 20:47:19 +02:00
Armin Ronacher
6a5f04ce1f Add the codex bridge prompt in the html export 2026-01-06 14:21:34 +01:00
Mario Zechner
edb0da9611 feat(ai,agent,coding-agent): add sessionId for provider session-based caching
- Add sessionId to StreamOptions for providers that support session-based caching
- OpenAI Codex provider uses sessionId for prompt_cache_key and routing headers
- Agent class now accepts and forwards sessionId to stream functions
- coding-agent passes session ID from SessionManager and updates on session changes
- Update ai package README with table of contents, OpenAI Codex OAuth docs, and env vars table
- Increase Codex instructions cache TTL from 15 minutes to 24 hours
- Add tests for sessionId forwarding in ai and agent packages
2026-01-06 11:08:42 +01:00
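A minimal sketch of forwarding a session ID into a request body is shown below. `sessionId` on the stream options and the `prompt_cache_key` field are named in the commit message; the `StreamOptions` shape shown here and `withSessionCaching` are simplified, hypothetical versions, and the routing headers are not covered.

```typescript
// Simplified stand-in for the stream options described in the commit.
interface StreamOptions {
  sessionId?: string;
}

// Hypothetical helper: attaches the session ID as the prompt cache key so
// that requests from one session can share a provider-side cache.
function withSessionCaching(
  body: Record<string, unknown>,
  options: StreamOptions,
): Record<string, unknown> {
  if (!options.sessionId) return body;
  return { ...body, prompt_cache_key: options.sessionId };
}
```

When no session ID is provided, the body is returned unchanged, so providers without session-based caching see no difference.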
Mario Zechner
858c6bae8a refactor(ai): streamline codex prompt handling 2026-01-06 10:27:51 +01:00
Ahmed Kamal
47402ddaf7
fix(ai): always include reasoning.encrypted_content for codex (#484) 2026-01-06 00:50:58 +01:00
Ben Vargas
02b72b49d5 fix: codex thinking handling 2026-01-05 21:55:47 +01:00
Mario Zechner
bb50738f7e fix(ai): append system prompt to codex bridge message instead of converting to input
Previously the system prompt was converted to an input message in convertMessages,
then stripped out by filterPiSystemPrompts. Now the system prompt is passed directly
to transformRequestBody and appended after CODEX_PI_BRIDGE in the bridge message.
2026-01-05 06:03:07 +01:00
Mario Zechner
9a147559c0 Merge branch 'openai-codex' 2026-01-05 05:33:48 +01:00
Ahmed Kamal
1650041a63 feat(ai): add OpenAI Codex OAuth + responses provider 2026-01-04 21:11:19 +02:00
butelo
36e774282d
fix duplicated thinking tokens in chutes (#443)
Co-authored-by: xes garcia <xes.garcia@deus.ai>
2026-01-04 18:12:09 +01:00