- Map redacted_thinking to ThinkingContent with redacted: true instead of
adding a new content type. The opaque payload goes in thinkingSignature,
thinking text is set to "[Reasoning redacted]" so it renders naturally
everywhere. Cross-model transform drops redacted blocks.
- Skip interleaved-thinking-2025-05-14 beta header for Opus 4.6 / Sonnet 4.6
where adaptive thinking makes it deprecated/redundant.
- Do not send temperature when thinkingEnabled is true (incompatible with
both adaptive and budget-based thinking).
Based on #1665 by @tctev
Z.ai uses the same enable_thinking: boolean parameter as Qwen to control reasoning, not thinking: { type: "enabled" | "disabled" }.
The wrong parameter name means Z.ai ignores the disable request and always runs with thinking enabled, wasting tokens and adding latency.
Merge the Z.ai and Qwen branches since they use the same format.
PR by @okuyam2y
`hasVertexAdcCredentials()` uses dynamic imports to load `node:fs`,
`node:os`, and `node:path` to avoid breaking browser/Vite builds. These
imports are fired eagerly but resolve asynchronously. If the function is
called during gateway startup before those promises resolve, `_existsSync`,
`_homedir`, and `_join` are still null — causing the function to cache
`false` permanently and never re-evaluate.
This means users with valid `GOOGLE_APPLICATION_CREDENTIALS`,
`GOOGLE_CLOUD_PROJECT`, and `GOOGLE_CLOUD_LOCATION` configured are silently
treated as unauthenticated for Vertex AI. Calls fall back to the AI Studio
endpoint (generativelanguage.googleapis.com) which has much stricter rate
limits, causing unexpected 429 errors even though Vertex credentials are
correctly configured.
Fix: in Node.js/Bun environments, return false without caching when the
async modules aren't loaded yet, so the next call retries. Only cache false
permanently in browser environments where `fs` is genuinely unavailable.
Co-authored-by: Jeremiah Gaylord <jeremiahgaylord-web@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Gemini 3.1 Pro Preview model to the Cloud Code Assist (google-gemini-cli)
provider for parity with the google and google-vertex providers that already
include this model.
Tested and confirmed working via the Cloud Code Assist API endpoint.
Add metadata?: Record<string, unknown> to StreamOptions so providers
can extract fields they understand. Anthropic provider extracts user_id
for abuse tracking and rate limiting. Other providers ignore it.
Based on #1384 by @7Sageer, reworked to use a generic type instead of
Anthropic-specific typing on the base interface.
* extend interleaved thinking test to Anthropic first-party provider
* switch back to global Bedrock model identifier
* set retry to 3 for both
* enable bedrock claude interleaved thinking by default and use completeSimple in test
- Replace verbose ANTIGRAVITY_SYSTEM_INSTRUCTION with compact version from CLIProxyAPI
- Replace bridgePrompt override with [ignore] wrapper pattern
- Switch Antigravity Gemini test model from gemini-3-flash to gemini-3-pro-high
- Rename calculator tool to math_operation (gemini-3-pro ignores schema for 'calculator')
closes#1415
- Use parametersJsonSchema instead of parameters for Gemini tool declarations
to support full JSON Schema (anyOf, oneOf, const, etc.)
- Keep legacy parameters field for Claude models on Cloud Code Assist, where
the API translates parameters into Anthropic's input_schema
- Revert claude-opus-4-6-thinking back to claude-opus-4-5-thinking (model
doesn't exist on the Antigravity endpoint)
fixes#1398
* feat(antigravity): update model costs and tokens, migrate compaction tests to Claude 4.6
- Update model costs for various models in `models.generated.ts` to reflect the latest pricing.
- Update maxTokens for certain models.
- Migrate compaction tests from `claude-opus-4-5-thinking` to `claude-opus-4-6-thinking` in `compaction-thinking-model.test.ts`.
* fix: remove unnecessary peer dependencies in package-lock.json
This commit removes the `peer: true` flag from several dependencies in `package-lock.json`. These dependencies (lit, tailwind-merge, tailwindcss, picomatch, @tailwindcss/typescript, vite, picomatch) do not need to be explicitly marked as peer dependencies in this context, as they are already managed as regular dependencies. Removing the `peer: true` flag simplifies the dependency graph and avoids potential conflicts during installation.
Add a manually inserted "auto" model entry for OpenRouter alongside
the existing "openrouter/auto" entry, allowing users to select the
auto-routing model with a shorter identifier.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Claude Opus 4.5 has been replaced by Claude Opus 4.6 on the
Antigravity (Google Cloud Code Assist) platform.
- Update model definition in generate-models.ts
- Update generated models output