Koffi is only used on Windows for VT input support and fails to build
on Termux/Android and Linux systems without build tools. Moving it to
optionalDependencies allows installation to succeed on all platforms
while maintaining Windows functionality.
strip-ansi is imported in bash-executor.ts and two interactive mode
components but is not declared in package.json dependencies. This
causes ERR_MODULE_NOT_FOUND at runtime in strict package managers
like pnpm that do not hoist undeclared dependencies.
Co-authored-by: Graadient <graadient@users.noreply.github.com>
createBranchedSession() wrote the file and set flushed=true even when the
branched path had no assistant message. The next _persist() call saw no
assistant, reset flushed=false, and the subsequent flush appended all
in-memory entries to the already-populated file, duplicating the header
and entries.
Fix: defer file creation when the branched path has no assistant message,
matching the newSession() contract. _persist() creates the file on the
first assistant response.
closes#1672
- Map redacted_thinking to ThinkingContent with redacted: true instead of
adding a new content type. The opaque payload goes in thinkingSignature,
thinking text is set to "[Reasoning redacted]" so it renders naturally
everywhere. Cross-model transform drops redacted blocks.
- Skip interleaved-thinking-2025-05-14 beta header for Opus 4.6 / Sonnet 4.6
where adaptive thinking makes it deprecated/redundant.
- Do not send temperature when thinkingEnabled is true (incompatible with
both adaptive and budget-based thinking).
Based on #1665 by @tctev
Z.ai uses the same enable_thinking: boolean parameter as Qwen to control reasoning, not thinking: { type: "enabled" | "disabled" }.
The wrong parameter name means Z.ai ignores the disable request and always runs with thinking enabled, wasting tokens and adding latency.
Merge the Z.ai and Qwen branches since they use the same format.
PR by @okuyam2y
`hasVertexAdcCredentials()` uses dynamic imports to load `node:fs`,
`node:os`, and `node:path` to avoid breaking browser/Vite builds. These
imports are fired eagerly but resolve asynchronously. If the function is
called during gateway startup before those promises resolve, `_existsSync`,
`_homedir`, and `_join` are still null — causing the function to cache
`false` permanently and never re-evaluate.
This means users with valid `GOOGLE_APPLICATION_CREDENTIALS`,
`GOOGLE_CLOUD_PROJECT`, and `GOOGLE_CLOUD_LOCATION` configured are silently
treated as unauthenticated for Vertex AI. Calls fall back to the AI Studio
endpoint (generativelanguage.googleapis.com) which has much stricter rate
limits, causing unexpected 429 errors even though Vertex credentials are
correctly configured.
Fix: in Node.js/Bun environments, return false without caching when the
async modules aren't loaded yet, so the next call retries. Only cache false
permanently in browser environments where `fs` is genuinely unavailable.
Co-authored-by: Jeremiah Gaylord <jeremiahgaylord-web@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Add Gemini 3.1 Pro Preview model to the Cloud Code Assist (google-gemini-cli)
provider for parity with the google and google-vertex providers that already
include this model.
Tested and confirmed working via the Cloud Code Assist API endpoint.