- Remove <thinking> tag generation from google-shared.ts, transform-messages.ts, openai-completions.ts
- Thinking blocks now convert to plain text when switching models (prevents models mimicking tags)
- Skip empty thinking blocks to avoid API errors
- Keep thinking blocks only when same provider AND same model
Fixes #561
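A minimal sketch of the rule described above, assuming simplified message shapes. The `Block` and `AssistantMessage` types and the `transformThinking` name are hypothetical and are not taken from the actual transform-messages.ts:

```typescript
// Hypothetical, simplified shapes; the real types in packages/ai may differ.
type Block =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string };

interface AssistantMessage {
  provider: string;
  model: string;
  content: Block[];
}

// Keep thinking blocks only when provider AND model both match the target.
// Otherwise downgrade them to plain text (no <thinking> tags are emitted).
// Empty thinking blocks are dropped in every case.
function transformThinking(
  msg: AssistantMessage,
  target: { provider: string; model: string },
): Block[] {
  const sameModel = msg.provider === target.provider && msg.model === target.model;
  const out: Block[] = [];
  for (const block of msg.content) {
    if (block.type !== "thinking") {
      out.push(block);
      continue;
    }
    if (block.thinking.trim() === "") continue; // skip empty thinking blocks
    out.push(sameModel ? block : { type: "text", text: block.thinking });
  }
  return out;
}
```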
Three related fixes:
1. google-gemini-cli: Handle abort signal in stream reading loop
   - Add abort event listener to cancel reader immediately when signal fires
   - Fix AbortError detection in retry catch block (fetch throws AbortError,
     not our custom message)
   - Swallow reader.cancel() rejection to avoid unhandled promise
2. agent-session: Fix retry attempt counter showing 0 on cancel
   - abortRetry() was resetting _retryAttempt before the catch block could
     read it for the error message
3. interactive-mode: Restore main escape handler on agent_start
   - When auto-retry starts, onEscape is replaced with retry-specific handler
   - auto_retry_end (which restores it) fires on turn_end, after streaming begins
   - Now restore immediately on agent_start if retry handler is still active
Amended: suppress reader.cancel() rejection on abort.
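The abort handling from fix 1 might look roughly like the sketch below. It assumes a standard web `ReadableStream` and `AbortSignal`; `readWithAbort`, `isAbortError`, and the "Request was aborted" message are illustrative names, not the provider's actual code:

```typescript
// Read a stream to completion, cancelling the reader as soon as `signal` fires.
async function readWithAbort(
  stream: ReadableStream<Uint8Array>,
  signal: AbortSignal,
  onChunk: (chunk: Uint8Array) => void,
): Promise<void> {
  if (signal.aborted) throw new Error("Request was aborted");
  const reader = stream.getReader();
  const onAbort = () => {
    // Swallow the rejection so it cannot surface as an unhandled promise rejection.
    reader.cancel().catch(() => {});
  };
  signal.addEventListener("abort", onAbort, { once: true });
  try {
    while (true) {
      const result = await reader.read();
      // A cancelled reader resolves with done=true, so check the signal first.
      if (signal.aborted) throw new Error("Request was aborted");
      if (result.done) break;
      onChunk(result.value);
    }
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
}

// Hypothetical helper for the retry catch block: fetch rejects with an error
// named "AbortError", which differs from the custom message thrown above.
function isAbortError(err: unknown): boolean {
  return (
    err instanceof Error &&
    (err.name === "AbortError" || err.message === "Request was aborted")
  );
}
```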
Google streaming may emit thoughtSignature without thought=true (including empty-text signature-only parts). Treat non-empty thoughtSignature as thinking to avoid leaking reasoning into normal text and retain signature across streaming deltas. Add unit test coverage.
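As a rough illustration of the classification rule, assuming a reduced part shape (`StreamPart`, `isThinkingPart`, and `retainSignature` are hypothetical names):

```typescript
// Hypothetical subset of a streamed Google part; only the fields used here.
interface StreamPart {
  text?: string;
  thought?: boolean;
  thoughtSignature?: string;
}

// A part counts as thinking if it is flagged thought=true OR carries a
// non-empty thoughtSignature, even when its text is empty.
function isThinkingPart(part: StreamPart): boolean {
  return (
    part.thought === true ||
    (typeof part.thoughtSignature === "string" && part.thoughtSignature.length > 0)
  );
}

// Retain the most recent non-empty signature across streaming deltas.
function retainSignature(
  current: string | undefined,
  incoming: string | undefined,
): string | undefined {
  return incoming !== undefined && incoming.length > 0 ? incoming : current;
}
```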
- Add sessionId to StreamOptions for providers that support session-based caching
- OpenAI Codex provider uses sessionId for prompt_cache_key and routing headers
- Agent class now accepts and forwards sessionId to stream functions
- coding-agent passes session ID from SessionManager and updates on session changes
- Update ai package README with table of contents, OpenAI Codex OAuth docs, and env vars table
- Increase Codex instructions cache TTL from 15 minutes to 24 hours
- Add tests for sessionId forwarding in ai and agent packages
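A small sketch of how a provider could map sessionId onto the request body. The `withPromptCacheKey` helper is hypothetical; routing headers are left out because their names are not given here:

```typescript
// Only the field relevant to this change; the real StreamOptions has more.
interface StreamOptions {
  sessionId?: string;
}

// Hypothetical helper: set prompt_cache_key only when a sessionId is present,
// so providers without session-based caching see an unchanged body.
function withPromptCacheKey(
  body: Record<string, unknown>,
  options: StreamOptions,
): Record<string, unknown> {
  return options.sessionId ? { ...body, prompt_cache_key: options.sessionId } : body;
}
```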
Previously the system prompt was converted to an input message in convertMessages,
then stripped out by filterPiSystemPrompts. Now the system prompt is passed directly
to transformRequestBody and appended after CODEX_PI_BRIDGE in the bridge message.
- Implement google-vertex provider in packages/ai
- Support ADC (Application Default Credentials) via @google/generative-ai
- Add Gemini model catalog for Vertex AI
- Update packages/coding-agent to handle google-vertex provider
- Add GoogleThinkingLevel type mirroring Google's ThinkingLevel enum
- Update GoogleGeminiCliOptions and GoogleOptions to use our type
- Cast to any when assigning to Google SDK's ThinkingConfig
- Migrate glm-4.5, glm-4.5-air, glm-4.5-flash, glm-4.6, glm-4.7 from anthropic-messages to openai-completions API
- Updated baseUrl from https://api.z.ai/api/anthropic to https://api.z.ai/api/coding/paas/v4
- Added compat setting to disable developer role for zai models
- Filter empty text blocks in openai-completions to avoid zai API validation errors
- Fixed zai provider tests to use OpenAI-style options (reasoningEffort)
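The two compatibility tweaks could be sketched as follows. The `Compat` flag name, `systemPromptRole`, and `filterEmptyText` are assumptions for illustration and may not match the actual compat setting:

```typescript
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "image"; data: string };

// Hypothetical compat flag; the real setting name may differ.
interface Compat {
  supportsDeveloperRole?: boolean;
}

// Fall back to the "system" role when the endpoint rejects "developer".
function systemPromptRole(compat: Compat): "developer" | "system" {
  return compat.supportsDeveloperRole === false ? "system" : "developer";
}

// Drop text blocks that are empty or whitespace-only; keep everything else.
function filterEmptyText(blocks: ContentBlock[]): ContentBlock[] {
  return blocks.filter((b) => b.type !== "text" || b.text.trim().length > 0);
}
```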
When a user interrupts a tool call flow (sends a message without providing
tool results), APIs like OpenAI Responses and Anthropic fail because:
- OpenAI requires tool outputs for function calls
- OpenAI requires reasoning items to have their following items
- Anthropic requires non-empty content for error tool results
Instead of filtering out orphaned tool calls (which breaks thinking signatures),
we now insert synthetic empty tool results with isError: true and content
'No result provided'. This preserves the conversation structure and satisfies
all API requirements.
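The approach can be sketched as below, assuming a simplified message model (`Message` and `insertSyntheticToolResults` are hypothetical names). Tool calls at the very end of the history are left untouched in this sketch, since their results may still be pending:

```typescript
// Hypothetical, simplified message shapes.
type Message =
  | { role: "user"; content: string }
  | { role: "assistant"; toolCalls: { id: string; name: string }[] }
  | { role: "toolResult"; toolCallId: string; content: string; isError: boolean };

// Insert a synthetic error result for every tool call that is followed by a
// non-tool-result message before its real result arrived.
function insertSyntheticToolResults(messages: Message[]): Message[] {
  const out: Message[] = [];
  let pending: string[] = []; // ids of tool calls still waiting for a result

  for (const msg of messages) {
    if (msg.role === "toolResult") {
      const answered = msg.toolCallId;
      pending = pending.filter((id) => id !== answered);
      out.push(msg);
      continue;
    }
    // A user or assistant message interrupts the flow: fill in orphaned calls.
    for (const id of pending) {
      out.push({ role: "toolResult", toolCallId: id, content: "No result provided", isError: true });
    }
    pending = [];
    out.push(msg);
    if (msg.role === "assistant") pending = msg.toolCalls.map((c) => c.id);
  }
  return out;
}
```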
- Add OAuth handler with PKCE flow and local callback server
- Automatic project discovery via loadCodeAssist/onboardUser endpoints
- Store credentials with projectId for API calls
- Encode token+projectId as JSON for provider to decode
- Register as 'google-cloud-code-assist' OAuth provider
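For reference, the PKCE pieces and the JSON credential encoding could look like this sketch. It uses Node's built-in crypto module; the function names and the `{ token, projectId }` field names are assumptions, and the callback server and project discovery are omitted:

```typescript
import { createHash, randomBytes } from "node:crypto";

// S256 code challenge: base64url(SHA-256(verifier)), as defined by RFC 7636.
function challengeFor(verifier: string): string {
  return createHash("sha256").update(verifier).digest("base64url");
}

// 32 random bytes give a 43-character base64url verifier.
function generatePkce(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  return { verifier, challenge: challengeFor(verifier) };
}

// Hypothetical encoding: pack token and projectId into one string so the
// provider can receive both through a single API-key-style value.
function encodeCredentials(token: string, projectId: string): string {
  return JSON.stringify({ token, projectId });
}

function decodeCredentials(encoded: string): { token: string; projectId: string } {
  const parsed = JSON.parse(encoded) as { token: string; projectId: string };
  return { token: parsed.token, projectId: parsed.projectId };
}
```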
- Add new API type 'google-cloud-code-assist' for Gemini CLI / Antigravity auth
- Extract shared Google utilities to google-shared.ts
- Implement streaming provider for Cloud Code Assist endpoint
- Add 7 models: gemini-3-pro-high/low, gemini-3-flash, claude-sonnet/opus, gpt-oss
Models use OAuth authentication and have $0 cost (uses Google account quota).
OAuth flow will be implemented in coding-agent in a follow-up.
Previously, when using 'google-generative-ai' API with a custom baseUrl
in models.json, the baseUrl was ignored and requests always went to the
default Google endpoint.
Now the provider correctly passes model.baseUrl to the SDK's
httpOptions.baseUrl, enabling use of custom endpoints or API proxies.
Fixes #216
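A sketch of the option mapping, assuming a reduced model config. `ModelConfig`, `ClientOptions`, and `buildClientOptions` are illustrative; only the `httpOptions.baseUrl` target comes from the description above:

```typescript
// Hypothetical subset of a models.json entry.
interface ModelConfig {
  id: string;
  baseUrl?: string;
}

// Shape of the options object handed to the SDK client constructor.
interface ClientOptions {
  apiKey: string;
  httpOptions?: { baseUrl: string };
}

// Forward model.baseUrl when set; otherwise leave httpOptions out so the
// SDK falls back to its default endpoint.
function buildClientOptions(apiKey: string, model: ModelConfig): ClientOptions {
  const options: ClientOptions = { apiKey };
  if (model.baseUrl) {
    options.httpOptions = { baseUrl: model.baseUrl };
  }
  return options;
}
```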
- Fix tool result format for Gemini 3 Flash Preview compatibility
  - Use 'output' key for successful results (not 'result')
  - Use 'error' key for error results (not 'isError')
  - Per Google SDK documentation for FunctionResponse.response
- Improve type safety in google.ts provider
  - Add ImageContent import and use proper type guards
  - Replace 'as any' casts with proper typing
  - Import and use Schema type for tool parameters
  - Add proper typing for index deletion in error handler
- Add comprehensive test for Gemini 3 Flash tool calling
  - Tests successful tool call and result handling
  - Tests error tool result handling
  - Verifies fix for issue #213
Fixes #213
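The corrected result shape can be illustrated as follows. `toFunctionResponse` is a hypothetical helper, not the provider's actual function; only the 'output' and 'error' keys come from the description above:

```typescript
// Shape of a tool result part sent back to the model (reduced to what matters here).
interface FunctionResponsePart {
  functionResponse: {
    name: string;
    response: { output: string } | { error: string };
  };
}

// Successful results go under 'output', failures under 'error'.
function toFunctionResponse(
  name: string,
  text: string,
  isError: boolean,
): FunctionResponsePart {
  return {
    functionResponse: {
      name,
      response: isError ? { error: text } : { output: text },
    },
  };
}
```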
* use the correct Gemini 3 Flash Preview thinking levels
* fix a build error
* add changelog entry
* regenerate models
* make fewer assumptions about future models