feat(ai): add strictResponsesPairing for Azure OpenAI Responses API

Split OpenAICompat into OpenAICompletionsCompat and OpenAIResponsesCompat for type-safe API-specific compat settings. Added strictResponsesPairing option to suppress orphaned reasoning/tool calls on incomplete turns, fixing 400 errors on Azure's Responses API which requires strict pairing. Closes #768
2026-04-21 07:02:04 +00:00 · 2026-01-18 20:15:26 +01:00 · 2026-01-18 20:15:26 +01:00 · d43930c818
commit d43930c818
parent def9e4e9a9
17 changed files with 112 additions and 23 deletions
--- a/packages/ai/README.md
+++ b/packages/ai/README.md
@ -703,16 +703,20 @@ const response = await stream(ollamaModel, context, {

 ### OpenAI Compatibility Settings

-The `openai-completions` API is implemented by many providers with minor differences. By default, the library auto-detects compatibility settings based on `baseUrl` for known providers (Cerebras, xAI, Mistral, Chutes, etc.). For custom proxies or unknown endpoints, you can override these settings via the `compat` field:
+The `openai-completions` API is implemented by many providers with minor differences. By default, the library auto-detects compatibility settings based on `baseUrl` for known providers (Cerebras, xAI, Mistral, Chutes, etc.). For custom proxies or unknown endpoints, you can override these settings via the `compat` field. For `openai-responses` models, the compat field only supports Responses-specific flags.

 ```typescript
-interface OpenAICompat {
+interface OpenAICompletionsCompat {
  supportsStore?: boolean;           // Whether provider supports the `store` field (default: true)
  supportsDeveloperRole?: boolean;   // Whether provider supports `developer` role vs `system` (default: true)
  supportsReasoningEffort?: boolean; // Whether provider supports `reasoning_effort` (default: true)
  maxTokensField?: 'max_completion_tokens' | 'max_tokens';  // Which field name to use (default: max_completion_tokens)
  thinkingFormat?: 'openai' | 'zai'; // Format for reasoning param: 'openai' uses reasoning_effort, 'zai' uses thinking: { type: "enabled" } (default: openai)
 }
+
+interface OpenAIResponsesCompat {
+  strictResponsesPairing?: boolean; // Enforce strict reasoning/message pairing for OpenAI Responses history replay on providers like Azure (default: false)
+}
 ```

 If `compat` is not set, the library falls back to URL-based detection. If `compat` is partially set, unspecified fields use the detected defaults. This is useful for: