mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-21 03:04:28 +00:00
feat(agent): Comprehensive reasoning token support across providers
Added provider-specific reasoning/thinking token support for:

- OpenAI (o1, o3, gpt-5): full reasoning events via the Responses API, token counts via Chat Completions
- Groq: `reasoning_format: "parsed"` for Chat Completions; no summary support for the Responses API
- Gemini 2.5: `extra_body.google.thinking_config` with `<thought>` tag extraction
- OpenRouter: unified `reasoning` parameter with the `message.reasoning` field
- Anthropic: limited support via the OpenAI compatibility layer

Key improvements:

- Centralized provider detection based on baseURL
- `parseReasoningFromMessage()` extracts provider-specific reasoning content
- `adjustRequestForProvider()` handles provider-specific request modifications
- Smart reasoning support detection with caching per API type
- Comprehensive README documentation with a provider support matrix

Fixes reasoning tokens not appearing for GPT-5 and other reasoning models.
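The two centralized helpers named above could be sketched roughly as follows. This is a hypothetical sketch, not the repo's code: the provider list, host matching, and parameter shapes are assumptions based only on the notes in this commit message.

```typescript
// Hypothetical sketch of baseURL-based provider detection and request
// adjustment; names mirror the commit message, details are assumed.
type Provider = "openai" | "groq" | "gemini" | "openrouter" | "anthropic" | "unknown";
type ApiKind = "chat" | "responses";

function detectProvider(baseURL: string): Provider {
  const host = new URL(baseURL).host;
  if (host.endsWith("api.openai.com")) return "openai";
  if (host.endsWith("groq.com")) return "groq";
  if (host.endsWith("generativelanguage.googleapis.com")) return "gemini";
  if (host.endsWith("openrouter.ai")) return "openrouter";
  if (host.endsWith("anthropic.com")) return "anthropic";
  return "unknown";
}

function adjustRequestForProvider(
  req: Record<string, any>,
  provider: Provider,
  api: ApiKind,
): Record<string, any> {
  const out = { ...req };
  if (provider === "gemini") {
    // Gemini 2.5 OpenAI-compat endpoint: thinking config rides in extra_body
    out.extra_body = { google: { thinking_config: { include_thoughts: true } } };
  } else if (provider === "groq" && api === "chat") {
    out.reasoning_format = "parsed"; // per the Groq note above
  } else if (provider === "groq" && api === "responses" && out.reasoning) {
    // Groq's Responses API does not accept reasoning.summary
    const { summary: _omit, ...rest } = out.reasoning;
    out.reasoning = rest;
  }
  return out;
}
```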
This commit is contained in:
parent
62d9eefc2a
commit
99ce76d66e
5 changed files with 345 additions and 58 deletions

@@ -1,6 +1,6 @@
# Fix Missing Thinking Tokens for GPT-5 and Anthropic Models
**Status:** AwaitingCommit

**Agent PID:** 41002
## Original Todo
agent: we do not get thinking tokens for gpt-5. possibly also not for anthropic models?

@@ -25,6 +25,18 @@ The agent doesn't extract or report reasoning/thinking tokens from OpenAI's reas
- [x] Fix: Add reasoning support detection for Chat Completions API
- [x] Fix: Add correct summary parameter value and increase max_output_tokens for preflight check
- [x] Investigate: Chat Completions API has reasoning tokens but no thinking events
- [x] Debug: Add logging to understand gpt-5 response structure in responses API
- [x] Fix: Change reasoning summary from "auto" to "always" to ensure reasoning text is always returned
- [x] Fix: Set correct effort levels - "minimal" for responses API, "low" for completions API
- [x] Add note to README about Chat Completions API not returning thinking content
- [x] Add Gemini API example to README
- [x] Verify Gemini thinking token support and update README accordingly
- [x] Add special case for Gemini to include extra_body with thinking_config
- [x] Add special case for Groq responses API (doesn't support reasoning.summary)
- [x] Refactor: Create centralized provider-specific request adjustment function
- [x] Refactor: Extract message content parsing into parseReasoningFromMessage() function
- [x] Test: Verify Groq reasoning extraction works with refactored code
- [x] Test: Verify Gemini thinking extraction works with refactored code
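The `parseReasoningFromMessage()` refactor item above might look like this sketch. Field names (`reasoning`, `reasoning_content`, `<thought>` tags) come from the provider notes in this task file; everything else, including the exact shapes, is assumed for illustration.

```typescript
// Hypothetical sketch: pull reasoning text out of a chat message depending
// on provider conventions described in this task file.
interface ChatMessage {
  content?: string | null;
  reasoning?: string;          // OpenRouter: unified reasoning field
  reasoning_content?: string;  // Groq "parsed" reasoning (field name assumed)
}

function parseReasoningFromMessage(msg: ChatMessage): { reasoning: string; content: string } {
  // Explicit provider fields win over embedded tags
  if (msg.reasoning) return { reasoning: msg.reasoning, content: msg.content ?? "" };
  if (msg.reasoning_content) return { reasoning: msg.reasoning_content, content: msg.content ?? "" };
  // Gemini-style <thought> tags embedded in the content itself
  const text = msg.content ?? "";
  const m = text.match(/<thought>([\s\S]*?)<\/thought>/);
  if (m) {
    return { reasoning: m[1].trim(), content: text.replace(m[0], "").trim() };
  }
  return { reasoning: "", content: text };
}
```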
## Notes
User reported that o3 model with responses API doesn't show reasoning tokens or thinking events.

@@ -36,5 +48,11 @@ Fixed by:
5. Parsing both reasoning_text (o1/o3) and summary_text (gpt-5) formats
6. Displaying reasoning tokens in console and TUI renderers with ⚡ symbol
7. Properly handling reasoning_effort for Chat Completions API
8. Set correct effort levels: "minimal" for Responses API, "low" for Chat Completions API
9. Set summary to "always" for Responses API
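The effort/summary choices in fixes 8-9 can be sketched as a small helper. Note this is a sketch of what the task file describes: whether `"always"` is an accepted summary value is taken from this file, not verified against OpenAI's current API.

```typescript
// Sketch of per-API reasoning parameters, per fixes 8-9 above.
function reasoningParams(api: "responses" | "chat"): Record<string, any> {
  if (api === "responses") {
    // Responses API: lowest effort, always request a reasoning summary
    return { reasoning: { effort: "minimal", summary: "always" } };
  }
  // Chat Completions takes only an effort level and returns no reasoning text
  return { reasoning_effort: "low" };
}
```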
**Important findings**:
- Chat Completions API by design only returns reasoning token *counts* but not the actual thinking/reasoning content for o1 models. This is expected behavior - only the Responses API exposes thinking events.
- GPT-5 models currently return empty summary arrays even with `summary: "detailed"` - the model indicates it "can't share step-by-step reasoning". This appears to be a model limitation/behavior rather than a code issue.
- The reasoning tokens ARE being used and counted correctly when the model chooses to use them.
- With effort="minimal" and summary="detailed", gpt-5 sometimes chooses not to use reasoning at all for simple questions.
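Since Chat Completions exposes only reasoning token *counts*, reporting them reduces to reading the usage payload. A minimal sketch, assuming the `completion_tokens_details.reasoning_tokens` field shape:

```typescript
// Sketch: read the reasoning token count from a Chat Completions usage
// payload, the only reasoning signal that API exposes per the finding above.
interface Usage {
  completion_tokens: number;
  completion_tokens_details?: { reasoning_tokens?: number };
}

function reasoningTokenCount(usage: Usage): number {
  // Older models / providers may omit the details object entirely
  return usage.completion_tokens_details?.reasoning_tokens ?? 0;
}
```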