# Fix Missing Thinking Tokens for GPT-5 and Anthropic Models

**Status:** Done

**Agent PID:** 72653
## Original Todo

agent: we do not get thinking tokens for gpt-5. possibly also not for anthropic models?
## Description

The agent doesn't extract or report reasoning/thinking tokens from OpenAI's reasoning models (gpt-5, o1, o3) when using the Chat Completions API. While the codebase has full thinking token support for the Responses API, the Chat Completions API implementation is missing the extraction of `reasoning_tokens` from the `usage.completion_tokens_details` object. As a result, users don't see how many tokens were spent on reasoning, which can run into the thousands for these models. A sketch of the missing extraction follows the note below.

*Read [analysis.md](./analysis.md) in full for detailed codebase research and context*
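For context, here is a minimal sketch of the kind of extraction that was missing, assuming the official `openai` Node SDK. The repo's actual implementation (packages/agent/src/agent.ts:210-220) may differ; the model and prompt are illustrative.

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Reasoning models accept reasoning_effort on the Chat Completions API;
// the Notes below record "low" as the effort level chosen for this path.
const completion = await client.chat.completions.create({
  model: "gpt-5",
  reasoning_effort: "low",
  messages: [{ role: "user", content: "What is 17 * 23?" }],
});

// reasoning_tokens lives under usage.completion_tokens_details and is the
// piece the Chat Completions path previously dropped; default to 0 when
// the provider omits it.
const usage = completion.usage;
const reasoningTokens =
  usage?.completion_tokens_details?.reasoning_tokens ?? 0;

console.log({
  inputTokens: usage?.prompt_tokens ?? 0,
  outputTokens: usage?.completion_tokens ?? 0,
  reasoningTokens,
});
```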
## Implementation Plan

- [x] Extend AgentEvent token_usage type to include reasoningTokens field (packages/agent/src/agent.ts:16-23)
- [x] Update Chat Completions API token extraction to include reasoning tokens from usage.completion_tokens_details (packages/agent/src/agent.ts:210-220)
- [x] Update console renderer to display reasoning tokens in usage metrics (packages/agent/src/renderers/console-renderer.ts:117-121)
- [x] Update TUI renderer to display reasoning tokens in usage metrics (packages/agent/src/renderers/tui-renderer.ts:219-227)
- [x] Update JSON renderer to include reasoning tokens in output (packages/agent/src/renderers/json-renderer.ts:20)
- [x] User test: Run agent with a reasoning model (e.g. o3 or gpt-5) and verify the reasoning token count appears in the metrics display
- [x] Debug: Fix missing reasoningTokens field in JSON output even when the value is 0
- [x] Debug: Investigate why the o3 model doesn't report reasoning tokens in the Responses API
- [x] Fix: Parse reasoning summaries from gpt-5 models (summary_text vs reasoning_text)
- [x] Fix: Only send the reasoning parameter for models that support it (o3, gpt-5, etc.)
- [x] Fix: Better detection of reasoning support - preflight test instead of hardcoded model names (see the first sketch after this list)
- [x] Fix: Add reasoning support detection for the Chat Completions API
- [x] Fix: Add the correct summary parameter value and increase max_output_tokens for the preflight check
- [x] Investigate: Chat Completions API has reasoning tokens but no thinking events
- [x] Debug: Add logging to understand the gpt-5 response structure in the Responses API
- [x] Fix: Change reasoning summary from "auto" to "always" to ensure reasoning text is always returned
- [x] Fix: Set correct effort levels - "minimal" for the Responses API, "low" for the Chat Completions API
- [x] Add a note to the README about the Chat Completions API not returning thinking content
- [x] Add a Gemini API example to the README
- [x] Verify Gemini thinking token support and update the README accordingly
- [x] Add a special case for Gemini to include extra_body with thinking_config
- [x] Add a special case for the Groq Responses API (doesn't support reasoning.summary)
- [x] Refactor: Create a centralized provider-specific request adjustment function (see the second sketch after this list)
- [x] Refactor: Extract message content parsing into a parseReasoningFromMessage() function
- [x] Test: Verify Groq reasoning extraction works with the refactored code
- [x] Test: Verify Gemini thinking extraction works with the refactored code
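The preflight idea from the checklist, as a hedged sketch: probe the Responses API once with the reasoning parameter and cache the answer per agent instance. The function name and cache shape are illustrative, not the repo's actual helpers; the probe uses `summary: "auto"` for simplicity, though the Notes record the repo settling on `"always"`.

```ts
import OpenAI from "openai";

const client = new OpenAI();

// Illustrative cache; the repo caches the result per agent instance.
const reasoningSupport = new Map<string, boolean>();

// Probe once: send a tiny request with the reasoning parameter and treat
// a rejection as "not supported". max_output_tokens is kept small but
// above the API minimum (the checklist notes it had to be increased for
// the preflight to pass).
async function supportsReasoning(model: string): Promise<boolean> {
  const cached = reasoningSupport.get(model);
  if (cached !== undefined) return cached;
  let supported: boolean;
  try {
    await client.responses.create({
      model,
      input: "ping",
      reasoning: { effort: "minimal", summary: "auto" },
      max_output_tokens: 64,
    });
    supported = true;
  } catch {
    // Providers and models without reasoning support reject the parameter.
    supported = false;
  }
  reasoningSupport.set(model, supported);
  return supported;
}
```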
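And a sketch of the centralized provider-specific adjustment, under loud assumptions: `adjustRequestForProvider` is a hypothetical name, the request shapes are simplified, and the Gemini `extra_body` field names are assumed from Gemini's OpenAI-compatibility docs rather than taken from this repo.

```ts
// Hypothetical name and shapes; the repo's centralized function may differ.
type ReasoningConfig = { effort?: string; summary?: string };

type ResponsesRequest = {
  model: string;
  input: string;
  reasoning?: ReasoningConfig;
  extra_body?: Record<string, unknown>;
};

function adjustRequestForProvider(
  provider: "openai" | "groq" | "gemini",
  request: ResponsesRequest,
): ResponsesRequest {
  switch (provider) {
    case "groq": {
      // Groq's Responses endpoint rejects reasoning.summary; strip it.
      if (!request.reasoning) return request;
      const { summary: _omit, ...rest } = request.reasoning;
      return { ...request, reasoning: rest };
    }
    case "gemini":
      // Gemini's OpenAI-compatible endpoint takes thinking settings via
      // extra_body; field names assumed from the compatibility docs.
      return {
        ...request,
        extra_body: {
          google: { thinking_config: { include_thoughts: true } },
        },
      };
    default:
      return request;
  }
}
```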
## Notes

User reported that the o3 model with the Responses API doesn't show reasoning tokens or thinking events.

Fixed by:

1. Adding a reasoningTokens field to the AgentEvent type
2. Extracting reasoning tokens from both the Chat Completions and Responses APIs
3. Smart preflight detection of reasoning support for both APIs (cached per agent instance)
4. Only sending the reasoning parameter for supported models
5. Parsing both reasoning_text (o1/o3) and summary_text (gpt-5) formats (see the sketch below)
6. Displaying reasoning tokens in the console and TUI renderers with a ⚡ symbol
7. Properly handling reasoning_effort for the Chat Completions API
8. Setting correct effort levels: "minimal" for the Responses API, "low" for the Chat Completions API
9. Setting summary to "always" for the Responses API
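A sketch of the normalization in point 5, with the output item shape assumed from the two formats named above; the real parseReasoningFromMessage() lives in the agent package and may differ.

```ts
// Assumed shape of a Responses API reasoning output item, per the two
// formats this task dealt with; not the SDK's exact types.
interface ReasoningPart {
  type: "reasoning_text" | "summary_text";
  text: string;
}

interface ReasoningItem {
  type: "reasoning";
  content?: ReasoningPart[]; // o1/o3: reasoning_text parts
  summary?: ReasoningPart[]; // gpt-5: summary_text parts
}

// Normalize both formats into a single thinking string.
function parseReasoningFromMessage(item: ReasoningItem): string {
  const parts = [...(item.content ?? []), ...(item.summary ?? [])];
  return parts
    .filter((p) => p.type === "reasoning_text" || p.type === "summary_text")
    .map((p) => p.text)
    .join("\n");
}
```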
**Important findings**:

- The Chat Completions API by design returns only reasoning token *counts*, not the actual thinking/reasoning content, for o1 models. This is expected behavior: only the Responses API exposes thinking events.
- GPT-5 models currently return empty summary arrays even with `summary: "detailed"`; the model states it "can't share step-by-step reasoning". This appears to be a model limitation/behavior rather than a code issue.
- The reasoning tokens ARE being used and counted correctly when the model chooses to use them.
- With effort="minimal" and summary="detailed", gpt-5 sometimes chooses not to use reasoning at all for simple questions.