65 commits

Author SHA1 Message Date
Mario Zechner
c07277b9ac fix(ai): set opus 4.6 context window to 200k 2026-02-05 22:25:26 +01:00
Mario Zechner
712d0c6ada fix(ai,coding-agent): fix Bedrock Opus 4.6 model IDs, cache pricing, and add EU profile
- Remove :0 suffix from Opus 4.6 Bedrock model IDs (not valid for this model)
- Fix us/eu Opus 4.6 cache pricing (0.5/6.25 instead of 1.5/18.75)
- Add missing eu.anthropic.claude-opus-4-6-v1 inference profile
- Fix coding-agent default Bedrock model ID to match catalog
2026-02-05 22:21:22 +01:00
Mario Zechner
b94c17885d feat(ai): add Claude Opus 4.6 and GPT-5.3 Codex models 2026-02-05 20:34:56 +01:00
Mario Zechner
91e09765e7 chore(ai): add claude opus 4.6 model 2026-02-05 18:51:27 +01:00
Burak Varlı
be1d5a0299 chore(ai): clean up Bedrock-specific workarounds from generate-models.ts
We initially had some workarounds in `generate-models.ts`, mainly to make cross-region inference work
for the Amazon Bedrock provider. These are now upstreamed into models.dev, so we no longer need them.
2026-02-03 22:04:49 +00:00
Mario Zechner
87ab5c5c3b feat(ai): add Kimi For Coding provider support
- Add kimi-coding provider using Anthropic Messages API
- API endpoint: https://api.kimi.com/coding/v1
- Environment variable: KIMI_API_KEY
- Models: kimi-k2-thinking (text), k2p5 (text + image)
- Add context overflow detection pattern for Kimi errors
- Add tests for all standard test suites
2026-01-29 04:12:28 +01:00
Mario Zechner
c808de605a feat(ai): add Hugging Face provider support
- Add huggingface to KnownProvider type
- Add HF_TOKEN env var mapping
- Process huggingface models from models.dev (14 models)
- Use openai-completions API with compat settings
- Add tests for all provider test suites
- Update documentation

fixes #994
2026-01-29 02:40:14 +01:00
Daniel Tatarkin
9f3eef65f8 fix(ai): filter deprecated OpenCode models from generation (#970)
Add status === 'deprecated' check for OpenCode Zen models, matching
the existing pattern used for GitHub Copilot models. This removes
deprecated models like glm-4.7-free and minimax-m2.1-free from the
generated model catalog.
2026-01-26 23:56:13 +01:00
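The deprecation filter described in 9f3eef65f8 can be sketched as follows. This is a minimal illustration, not the code in generate-models.ts: the `CatalogEntry` shape, the `filterDeprecated` helper, and the `example-active-model` ID are hypothetical; only the `status === 'deprecated'` check and the `glm-4.7-free` ID come from the commit message.

```typescript
// Hypothetical shape of a fetched catalog entry; only `status` is named in the commit message.
interface CatalogEntry {
  id: string;
  status?: string;
}

// Keep every entry whose status is not exactly "deprecated".
// Entries without a status field are treated as active.
function filterDeprecated(models: CatalogEntry[]): CatalogEntry[] {
  return models.filter((m) => m.status !== "deprecated");
}

const sample: CatalogEntry[] = [
  { id: "glm-4.7-free", status: "deprecated" },
  { id: "example-active-model" }, // hypothetical ID, for illustration only
];

const keptIds = filterDeprecated(sample).map((m) => m.id);
console.log(keptIds);
```

Filtering on an explicit status value, rather than maintaining a blacklist of IDs, means newly deprecated models drop out of the generated catalog without further code changes.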
Markus Ylisiurunen
856012296b add Azure OpenAI Responses provider with deployment-aware model mapping 2026-01-24 12:04:34 +01:00
Mario Zechner
0a7537bf86 Revert "feat(ai): add gpt-5.2-codex to OpenAI provider (#730)"
This reverts commit 5a795b9857.
2026-01-14 22:22:55 +01:00
Anton
5a795b9857 feat(ai): add gpt-5.2-codex to OpenAI provider (#730)
* feat(ai): add gpt-5.2-codex to OpenAI provider

* fix(ai): avoid build break when model generation misses providers
2026-01-14 22:21:01 +01:00
Jian Zhang
558a77b45f feat(ai): add support for MiniMax China (minimax-cn) provider (#725)
Co-authored-by: Jian Zhang <jzhang@yanhuangdata.com>
2026-01-14 15:41:47 +01:00
Mario Zechner
09d409cc92 Fix z.ai thinking/reasoning params, fixes #688
Z.ai uses thinking: { type: "enabled" | "disabled" } instead of
OpenAI's reasoning_effort. Added thinkingFormat compat flag to handle
this. Thinking is now explicitly enabled/disabled based on user setting.
2026-01-13 18:34:07 +01:00
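The difference between the two request shapes in 09d409cc92 can be sketched as below. This is an assumption-laden illustration: the commit names a `thinkingFormat` compat flag, but the flag values (`"openai"`, `"zai"`), the `buildThinkingParams` helper, and the effort levels are hypothetical.

```typescript
// Hypothetical values for the thinkingFormat compat flag named in the commit.
type ThinkingFormat = "openai" | "zai";
type Effort = "low" | "medium" | "high";

// Build the provider-specific request fragment for thinking/reasoning.
// `effort` is undefined when the user has turned thinking off.
function buildThinkingParams(
  format: ThinkingFormat,
  effort: Effort | undefined,
): Record<string, unknown> {
  if (format === "zai") {
    // Z.ai style: thinking is always sent, explicitly enabled or disabled.
    return { thinking: { type: effort ? "enabled" : "disabled" } };
  }
  // OpenAI style: send reasoning_effort only when thinking is on.
  return effort ? { reasoning_effort: effort } : {};
}

console.log(buildThinkingParams("zai", "high"));
console.log(buildThinkingParams("zai", undefined));
console.log(buildThinkingParams("openai", "high"));
```

Sending an explicit "disabled" value in the Z.ai branch mirrors the commit's note that thinking is now explicitly enabled or disabled based on the user setting.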
Timo Lins
65eb738c90 Rename to vercel-ai-gateway for clarity 2026-01-13 16:42:34 +01:00
Timo Lins
9860ee86f3 Change to Anthropic compatible API
It seemed as if the OpenAI message spec tried to send non-compliant messages with { text: "" } instead of { content: "" }, which the AI Gateway did not accept.
2026-01-13 16:42:34 +01:00
Timo Lins
164a69a601 Add Vercel AI Gateway support 2026-01-13 16:42:34 +01:00
Markus Ylisiurunen
922b0a4668 add eu cross-region inference model ids for anthropic models (#685) 2026-01-13 13:02:27 +01:00
Mario Zechner
8af8d0d672 Add MiniMax provider support (#656 by @dannote)
- Add minimax to KnownProvider and Api types
- Add MINIMAX_API_KEY to getEnvApiKey()
- Generate MiniMax-M2 and MiniMax-M2.1 models
- Add context overflow detection pattern
- Add tests to all required test files
- Update README and CHANGELOG with attribution

Also fixes:
- Bedrock duplicate toolResult ID when content has multiple blocks
- Sandbox extension unused parameter lint warning
2026-01-13 02:27:09 +01:00
Mario Zechner
fd268479a4 feat(ai): Add Amazon Bedrock provider (#494)
Adds support for Amazon Bedrock with Claude models including:
- Full streaming support via Converse API
- Reasoning/thinking support for Claude models
- Cross-region inference model ID handling
- Multiple AWS credential sources (profile, IAM keys, API keys)
- Image support in messages and tool results
- Unicode surrogate sanitization

Also adds 'Adding a New Provider' documentation to AGENTS.md and README.

Co-authored-by: nickchan2 <nickchan2@users.noreply.github.com>
2026-01-13 00:32:59 +01:00
Mario Zechner
232749ee52 Remove copy-assets.js build step (prompt now bundled in .ts) 2026-01-10 13:26:46 +01:00
Aadish Verma
92eb6665fe feat: add API pricing for antigravity models (#588) 2026-01-09 22:20:51 +01:00
Mario Zechner
97d0189eae Add OpenCode Zen provider support 2026-01-09 06:58:20 +01:00
Mario Zechner
39fa25eb67 fix(ai): clean up openai-codex models and token limits
- Remove model aliases (gpt-5, gpt-5-mini, gpt-5-nano, codex-mini-latest, gpt-5-codex, gpt-5.1-codex, gpt-5.1-chat-latest)
- Fix context window from 400k to 272k tokens to match Codex CLI defaults
- Keep maxTokens at 128k (original value)
- Simplify reasoning effort clamping

closes #536
2026-01-07 20:39:46 +01:00
Zhou Rui
d893ba7f20 fix(ai): clean up openai-codex models and token limits 2026-01-07 23:09:20 +08:00
Ben Vargas
e80a924292 fix: add accurate pricing for openai-codex OAuth models (#501)
Previously all openai-codex models had pricing set to 0, causing the
TUI to always show $0.00 for cost tracking.

Updated pricing based on OpenAI Standard tier rates:
- gpt-5.2/gpt-5.2-codex: $1.75/$14.00 per 1M tokens
- gpt-5.1/gpt-5.1-codex/gpt-5.1-codex-max: $1.25/$10.00 per 1M tokens
- gpt-5/gpt-5-codex: $1.25/$10.00 per 1M tokens
- codex-mini-latest: $1.50/$6.00 per 1M tokens
- gpt-5-mini/gpt-5.1-codex-mini/gpt-5-codex-mini: $0.25/$2.00 per 1M tokens
- gpt-5-nano: $0.05/$0.40 per 1M tokens

Source: https://platform.openai.com/docs/pricing
2026-01-06 17:45:09 +01:00
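To show why zero pricing made the TUI always report $0.00, here is a minimal sketch of how per-1M-token rates turn into a run cost. The `Pricing` shape and `costUsd` helper are hypothetical, and reading "$1.75/$14.00" as input/output rates is an assumption; the figures themselves are the gpt-5.2 rates listed in e80a924292.

```typescript
// Hypothetical pricing shape: USD per 1M tokens.
interface Pricing {
  input: number;
  output: number;
}

// gpt-5.2 rates from the commit message, assumed to be input/output.
const gpt52: Pricing = { input: 1.75, output: 14.0 };

// With all rates set to 0, as before the fix, this returns 0 for any usage.
const unpriced: Pricing = { input: 0, output: 0 };

// Cost of a run: tokens times the per-1M rate, scaled back to single tokens.
function costUsd(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

console.log(costUsd(gpt52, 1_000_000, 100_000).toFixed(2));
console.log(costUsd(unpriced, 1_000_000, 100_000).toFixed(2));
```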
Mario Zechner
0b9e3ada0c fix: clean up Codex thinking level handling
- Remove per-thinking-level model variants (gpt-5.2-codex-high, etc.)
- Remove thinkingLevels from Model type
- Provider clamps reasoning effort internally
- Omit reasoning field when thinking is off

fixes #472
2026-01-05 21:58:26 +01:00
Ben Vargas
02b72b49d5 fix: codex thinking handling 2026-01-05 21:55:47 +01:00
Ahmed Kamal
1650041a63 feat(ai): add OpenAI Codex OAuth + responses provider 2026-01-04 21:11:19 +02:00
Anton Kuzmenko
6467e70995 fix: update cost values for input, output, and cacheRead in model configs 2026-01-03 01:11:03 +01:00
Anton Kuzmenko
214e7dae15 Add Vertex AI provider with ADC support
- Implement google-vertex provider in packages/ai
- Support ADC (Application Default Credentials) via @google/generative-ai
- Add Gemini model catalog for Vertex AI
- Update packages/coding-agent to handle google-vertex provider
2026-01-03 01:11:03 +01:00
Mario Zechner
9b2d22d26d Update ai package CHANGELOG.md for v0.30.2+ changes
Part of #378
2025-12-30 23:09:13 +01:00
Anton Kuzmenko
0250b7ac03 Migrate zai provider from Anthropic to OpenAI-compatible API
- Migrate glm-4.5, glm-4.5-air, glm-4.5-flash, glm-4.6, glm-4.7 from anthropic-messages to openai-completions API
- Updated baseUrl from https://api.z.ai/api/anthropic to https://api.z.ai/api/coding/paas/v4
- Added compat setting to disable developer role for zai models
- Filter empty text blocks in openai-completions to avoid zai API validation errors
- Fixed zai provider tests to use OpenAI-style options (reasoningEffort)
2025-12-29 11:54:10 -08:00
Anton Kuzmenko
31cbbd211c fix: update zAI models to use anthropic API and filter empty thinking blocks in messages 2025-12-28 16:31:32 -08:00
Anton Kuzmenko
93ea8298ab fix: update zai model API and baseUrl in generate-models script 2025-12-28 13:29:36 -08:00
Duncan Ogilvie
bf6da8c72f Make model generation deterministic by sorting providers and models 2025-12-27 13:58:51 +01:00
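One way to make generated output deterministic, as bf6da8c72f describes, is to sort provider and model keys before emitting them. The sketch below is an illustration only: the `Catalog` type and `sortCatalog` helper are hypothetical and not taken from generate-models.ts.

```typescript
// Hypothetical catalog shape: provider name -> model ID -> model data.
type Catalog = Record<string, Record<string, unknown>>;

// Plain code-unit comparison; avoids locale-dependent ordering.
const byKey = ([a]: [string, unknown], [b]: [string, unknown]): number =>
  a < b ? -1 : a > b ? 1 : 0;

// Rebuild the catalog with providers and models inserted in sorted order,
// so serializing it yields the same text regardless of fetch order.
function sortCatalog(catalog: Catalog): Catalog {
  const sorted: Catalog = {};
  for (const [provider, models] of Object.entries(catalog).sort(byKey)) {
    const sortedModels: Record<string, unknown> = {};
    for (const [id, model] of Object.entries(models).sort(byKey)) {
      sortedModels[id] = model;
    }
    sorted[provider] = sortedModels;
  }
  return sorted;
}

const a: Catalog = { zai: { "glm-4.6": 1, "glm-4.5": 2 }, mistral: { m: 3 } };
const b: Catalog = { mistral: { m: 3 }, zai: { "glm-4.5": 2, "glm-4.6": 1 } };
console.log(JSON.stringify(sortCatalog(a)) === JSON.stringify(sortCatalog(b)));
```

Sorted output keeps diffs of the generated file limited to real catalog changes rather than reordering noise.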
Luke Foster
ee9b498380 Add Gemini 3 preview models to google-gemini-cli provider
- Add gemini-3-pro-preview and gemini-3-flash-preview to Cloud Code Assist
- Handle thinkingLevel config for Gemini 3 (vs thinkingBudget for Gemini 2.x)
- Gemini 3 Pro: LOW/HIGH levels only
- Gemini 3 Flash: all four levels (MINIMAL/LOW/MEDIUM/HIGH)
2025-12-20 22:10:47 -06:00
Mario Zechner
c359023c3f Add Google Gemini CLI and Antigravity OAuth providers
- Add google-gemini-cli provider: free Gemini 2.0/2.5 via Cloud Code Assist
- Add google-antigravity provider: free Gemini 3, Claude, GPT-OSS via sandbox
- Move OAuth infrastructure from coding-agent to ai package
- Fix thinking signature handling for cross-model handoff
- Fix OpenAI message ID length limit (max 64 chars)
- Add GitHub Copilot overflow pattern detection
- Add OAuth provider tests for context overflow and streaming
2025-12-20 21:34:18 +01:00
Mario Zechner
36e17933d5 feat(ai): add Google Cloud Code Assist provider
- Add new API type 'google-cloud-code-assist' for Gemini CLI / Antigravity auth
- Extract shared Google utilities to google-shared.ts
- Implement streaming provider for Cloud Code Assist endpoint
- Add 7 models: gemini-3-pro-high/low, gemini-3-flash, claude-sonnet/opus, gpt-oss

Models use OAuth authentication and have $0 cost (uses Google account quota).
OAuth flow will be implemented in coding-agent in a follow-up.
2025-12-20 10:20:30 +01:00
Aadish Verma
314ef34ebc feat: implement thinking for some more copilot models (#234)
Signed-off-by: StarLight842 <mail@aadishv.dev>
2025-12-19 04:42:23 +01:00
Mario Zechner
c5543f7586 GitHub Copilot: auto-enable models, fix gpt-5 API, normalize tool call IDs
- Auto-enable all models after /login via POST /models/{model}/policy
- Use openai-responses API for gpt-5/o3/o4 models (not accessible via completions)
- Normalize tool call IDs when switching between github-copilot models with different APIs
  (fixes #198: openai-responses generates 450+ char IDs with special chars that break other models)
- Update README with streamlined GitHub Copilot docs
2025-12-15 20:06:11 +01:00
Mario Zechner
b66157c649 Add GitHub Copilot support (#191)
- OAuth login for GitHub Copilot via /login command
- Support for github.com and GitHub Enterprise
- Models sourced from models.dev (Claude, GPT, Gemini, Grok, etc.)
- Dynamic base URL from token's proxy-ep field
- Use vscode-chat integration ID for API compatibility
- Documentation for model enablement at github.com/settings/copilot/features

Co-authored-by: cau1k <cau1k@users.noreply.github.com>
2025-12-15 19:05:17 +01:00
cau1k
1871962e2e fix: model context windows 2025-12-15 00:17:59 -05:00
cau1k
7d4cdd09c3 feat: added filter for generate-models and regenerated
- blacklisted gpt-4o-2024-08-06, gpt-4o-2024-11-20, gpt-3.5-turbo-0613,
gpt-4, gpt-4-0613
2025-12-14 22:26:43 -05:00
cau1k
5f590b7c53 feat: generate models based on copilot /models endpoint, requires GH token 2025-12-14 18:42:57 -05:00
cau1k
17ebb9a19d feat: models.dev in generate models - too many deprecated models
could have opted for a whitelist but we'll just fetch from the copilot
/models endpoint
2025-12-14 17:47:42 -05:00
cau1k
ccae7a4e0e feat: initial impl
- add GitHub Copilot model discovery (env token fallback, headers,
compat) plus fallback list and quoted provider keys in generated map
- surface Copilot provider end-to-end (KnownProvider/default, env+OAuth
token refresh/save, enterprise base URL swap, available only when
creds/env exist)
- tweak interactive OAuth UI to render instruction text and prompt
placeholders

gpt-5.2-high took about 35 minutes. It had a lot of trouble with `npm
check` and went off on a "let's adjust every tsconfig" side quest.
Device code flow works, but the ai/scripts/generate-models.ts impl is
wrong as models from months ago are missing and only those deprecated
are accessible in the /models picker.
2025-12-14 17:18:13 -05:00
Mario Zechner
99b4b1aca0 Add Mistral as AI provider
- Add Mistral to KnownProvider type and model generation
- Implement Mistral-specific compat handling in openai-completions:
  - requiresToolResultName: tool results need name field
  - requiresAssistantAfterToolResult: synthetic assistant message between tool/user
  - requiresThinkingAsText: thinking blocks as <thinking> text
  - requiresMistralToolIds: tool IDs must be exactly 9 alphanumeric chars
- Add MISTRAL_API_KEY environment variable support
- Add Mistral tests across all test files
- Update documentation (README, CHANGELOG) for both ai and coding-agent packages
- Remove client IDs from gemini.md, reference upstream source instead

Closes #165
2025-12-10 20:36:19 +01:00
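The requiresMistralToolIds constraint in 99b4b1aca0 (tool IDs must be exactly 9 alphanumeric characters) could be met along these lines. This is a sketch under stated assumptions, not the mapping the ai package uses: the `toMistralToolId` helper and the choice of an FNV-1a hash are hypothetical.

```typescript
const MISTRAL_ID = /^[a-zA-Z0-9]{9}$/;

// Map an arbitrary tool call ID to exactly 9 alphanumeric characters.
// IDs that already satisfy the constraint pass through unchanged;
// others are replaced by a deterministic hash so that a tool call and
// its matching tool result still end up with the same ID.
function toMistralToolId(id: string): string {
  if (MISTRAL_ID.test(id)) return id;
  // 32-bit FNV-1a hash over UTF-16 code units (illustrative choice).
  let h = 0x811c9dc5;
  for (let i = 0; i < id.length; i++) {
    h ^= id.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  // A 32-bit value has at most 7 base-36 digits; left-pad to 9.
  return h.toString(36).padStart(9, "0");
}

console.log(toMistralToolId("abc123XYZ"));
console.log(toMistralToolId("call_01|some-long/id+with=special:chars"));
```

A 32-bit hash can collide, so a production mapping might instead track assigned IDs per conversation; the sketch only shows the shape of the constraint.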
Mario Zechner
c7585e37c9 Release v0.12.10 2025-12-04 20:51:57 +01:00
Mario Zechner
213bc4df1c mom: add centralized logging, usage tracking, and improve prompt caching
Major improvements to mom's logging and cost reporting:

Centralized Logging System:
- Add src/log.ts with type-safe logging functions
- Colored console output (green=user, yellow=mom, dim=details)
- Consistent format: [HH:MM:SS] [context] message
- Replace scattered console.log/error calls throughout codebase

Usage Tracking & Cost Reporting:
- Track tokens (input, output, cache read/write) and costs per run
- Display summary at end of each run in console and Slack thread
- Example: 💰 Usage: 12,543 in + 847 out (5,234 cache read) = $0.0234

Prompt Caching Optimization:
- Move recent messages from system prompt to user message
- System prompt now mostly static (only changes with memory files)
- Enables effective use of Anthropic's prompt caching
- Significantly reduces costs on subsequent requests

Model & Cost Improvements:
- Switch from Claude Opus 4.5 to Sonnet 4.5 (~40% cost reduction)
- Fix Claude Opus 4.5 cache pricing in ai package (was 3x too expensive)
- Add manual override in generate-models.ts until upstream fix merges
- Submitted PR to models.dev: https://github.com/sst/models.dev/pull/439

UI/UX Improvements:
- Extract actual text from tool results instead of JSON wrapper
- Cleaner Slack thread formatting with duration and labels
- Tool args formatting shows paths with offset:limit notation
- Add chalk for colored terminal output

Dependencies:
- Add chalk package for terminal colors
2025-11-26 18:04:16 +01:00
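The usage summary line shown in 213bc4df1c can be reproduced with a small formatter. This is a sketch only: the `RunUsage` shape and `formatUsage` helper are hypothetical; the output format follows the example line in the commit message.

```typescript
// Hypothetical per-run usage record.
interface RunUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
  costUsd: number;
}

// Format token counts with thousands separators and cost with four decimals.
function formatUsage(u: RunUsage): string {
  const n = new Intl.NumberFormat("en-US");
  return (
    `💰 Usage: ${n.format(u.inputTokens)} in + ${n.format(u.outputTokens)} out ` +
    `(${n.format(u.cacheReadTokens)} cache read) = $${u.costUsd.toFixed(4)}`
  );
}

const summary = formatUsage({
  inputTokens: 12543,
  outputTokens: 847,
  cacheReadTokens: 5234,
  costUsd: 0.0234,
});
console.log(summary);
```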
Mario Zechner
38ac29acfb Add ANSI-aware word wrapping to TUI components
- Created shared wrapTextWithAnsi() function in utils.ts
- Handles word-based wrapping while preserving ANSI escape codes
- Properly tracks active ANSI codes across wrapped lines
- Supports multi-byte characters (emoji, surrogate pairs)
- Updated Markdown and Text components to use shared wrapping
- Removed duplicate wrapping logic (158 lines total)
2025-11-18 22:26:24 +01:00