feat(ai): add strictResponsesPairing for Azure OpenAI Responses API

Split OpenAICompat into OpenAICompletionsCompat and OpenAIResponsesCompat
for type-safe API-specific compat settings. Added strictResponsesPairing
option to suppress orphaned reasoning/tool calls on incomplete turns,
fixing 400 errors on Azure's Responses API, which requires strict pairing.

Closes #768
Mario Zechner 2026-01-18 20:15:26 +01:00
parent def9e4e9a9
commit d43930c818
17 changed files with 112 additions and 23 deletions


@@ -2,6 +2,14 @@
 ## [Unreleased]
+### Added
+- Added `OpenAIResponsesCompat` interface with `strictResponsesPairing` option for Azure OpenAI Responses API, which requires strict reasoning/message pairing in history replay ([#768](https://github.com/badlogic/pi-mono/pull/768) by [@nicobako](https://github.com/nicobako))
+### Changed
+- Split `OpenAICompat` into `OpenAICompletionsCompat` and `OpenAIResponsesCompat` for type-safe API-specific compat settings
 ## [0.49.0] - 2026-01-17
 ### Changed


@@ -703,16 +703,20 @@ const response = await stream(ollamaModel, context, {
 ### OpenAI Compatibility Settings
-The `openai-completions` API is implemented by many providers with minor differences. By default, the library auto-detects compatibility settings based on `baseUrl` for known providers (Cerebras, xAI, Mistral, Chutes, etc.). For custom proxies or unknown endpoints, you can override these settings via the `compat` field:
+The `openai-completions` API is implemented by many providers with minor differences. By default, the library auto-detects compatibility settings based on `baseUrl` for known providers (Cerebras, xAI, Mistral, Chutes, etc.). For custom proxies or unknown endpoints, you can override these settings via the `compat` field. For `openai-responses` models, the compat field only supports Responses-specific flags.
 ```typescript
-interface OpenAICompat {
+interface OpenAICompletionsCompat {
 supportsStore?: boolean; // Whether provider supports the `store` field (default: true)
 supportsDeveloperRole?: boolean; // Whether provider supports `developer` role vs `system` (default: true)
 supportsReasoningEffort?: boolean; // Whether provider supports `reasoning_effort` (default: true)
 maxTokensField?: 'max_completion_tokens' | 'max_tokens'; // Which field name to use (default: max_completion_tokens)
 thinkingFormat?: 'openai' | 'zai'; // Format for reasoning param: 'openai' uses reasoning_effort, 'zai' uses thinking: { type: "enabled" } (default: openai)
 }
+interface OpenAIResponsesCompat {
+strictResponsesPairing?: boolean; // Enforce strict reasoning/message pairing for OpenAI Responses history replay on providers like Azure (default: false)
+}
 ```
 If `compat` is not set, the library falls back to URL-based detection. If `compat` is partially set, unspecified fields use the detected defaults. This is useful for:
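For illustration, a minimal sketch of setting these overrides. The two interfaces are copied from this diff; the override values themselves (a legacy proxy, an Azure deployment) are hypothetical examples, not real endpoints:

```typescript
// Compat interfaces as introduced by this commit.
interface OpenAICompletionsCompat {
	supportsStore?: boolean;
	supportsDeveloperRole?: boolean;
	supportsReasoningEffort?: boolean;
	maxTokensField?: "max_completion_tokens" | "max_tokens";
	thinkingFormat?: "openai" | "zai";
}

interface OpenAIResponsesCompat {
	strictResponsesPairing?: boolean;
}

// Hypothetical: a completions proxy that only accepts the legacy max_tokens field.
// Unspecified fields fall back to URL-based detection.
const completionsCompat: OpenAICompletionsCompat = {
	maxTokensField: "max_tokens",
};

// Hypothetical: an Azure Responses deployment that enforces reasoning/message pairing.
const responsesCompat: OpenAIResponsesCompat = {
	strictResponsesPairing: true,
};
```

Note the split: a `Model<"openai-completions">` only accepts the first shape, a `Model<"openai-responses">` only the second.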


@@ -15,7 +15,7 @@ import type {
 Context,
 Message,
 Model,
-OpenAICompat,
+OpenAICompletionsCompat,
 StopReason,
 StreamFunction,
 StreamOptions,
@@ -452,7 +452,7 @@ function maybeAddOpenRouterAnthropicCacheControl(
 function convertMessages(
 model: Model<"openai-completions">,
 context: Context,
-compat: Required<OpenAICompat>,
+compat: Required<OpenAICompletionsCompat>,
 ): ChatCompletionMessageParam[] {
 const params: ChatCompletionMessageParam[] = [];
@@ -681,9 +681,9 @@ function mapStopReason(reason: ChatCompletionChunk.Choice["finish_reason"]): Sto
 /**
  * Detect compatibility settings from provider and baseUrl for known providers.
  * Provider takes precedence over URL-based detection since it's explicitly configured.
- * Returns a fully resolved OpenAICompat object with all fields set.
+ * Returns a fully resolved OpenAICompletionsCompat object with all fields set.
  */
-function detectCompat(model: Model<"openai-completions">): Required<OpenAICompat> {
+function detectCompat(model: Model<"openai-completions">): Required<OpenAICompletionsCompat> {
 const provider = model.provider;
 const baseUrl = model.baseUrl;
@@ -725,7 +725,7 @@ function detectCompat(model: Model<"openai-completions">): Required<OpenAICompat
 * Get resolved compatibility settings for a model.
 * Uses explicit model.compat if provided, otherwise auto-detects from provider/URL.
 */
-function getCompat(model: Model<"openai-completions">): Required<OpenAICompat> {
+function getCompat(model: Model<"openai-completions">): Required<OpenAICompletionsCompat> {
 const detected = detectCompat(model);
 if (!model.compat) return detected;
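The "partially set `compat` falls back to detected defaults" behavior documented earlier reduces to a spread merge. A minimal sketch, assuming a trimmed two-field interface; the helper name `resolveCompat` is illustrative, mirroring what `getCompat` in this hunk does after the early return:

```typescript
// Trimmed compat interface for the sketch (the real one has more fields).
interface OpenAICompletionsCompat {
	supportsStore?: boolean;
	maxTokensField?: "max_completion_tokens" | "max_tokens";
}

// Explicit fields win; fields omitted from `explicit` keep the detected defaults.
function resolveCompat(
	detected: Required<OpenAICompletionsCompat>,
	explicit?: OpenAICompletionsCompat,
): Required<OpenAICompletionsCompat> {
	if (!explicit) return detected;
	return { ...detected, ...explicit };
}
```

Note this relies on partial overrides omitting keys entirely; a key explicitly set to `undefined` would clobber the detected default.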


@@ -461,10 +461,22 @@ function convertMessages(model: Model<"openai-responses">, context: Context): Re
 }
 } else if (msg.role === "assistant") {
 const output: ResponseInput = [];
+const strictResponsesPairing = model.compat?.strictResponsesPairing ?? false;
+let isIncomplete = false;
+let shouldReplayReasoning = msg.stopReason !== "error";
+let allowToolCalls = msg.stopReason !== "error";
+if (strictResponsesPairing) {
+isIncomplete = msg.stopReason === "error" || msg.stopReason === "aborted";
+const hasPairedContent = msg.content.some(
+(b) => b.type === "toolCall" || (b.type === "text" && (b as TextContent).text.trim().length > 0),
+);
+shouldReplayReasoning = !isIncomplete && hasPairedContent;
+allowToolCalls = !isIncomplete;
+}
 for (const block of msg.content) {
 // Do not submit thinking blocks if the completion had an error (i.e. abort)
-if (block.type === "thinking" && msg.stopReason !== "error") {
+if (block.type === "thinking" && shouldReplayReasoning) {
 if (block.thinkingSignature) {
 const reasoningItem = JSON.parse(block.thinkingSignature);
 output.push(reasoningItem);
@@ -475,6 +487,11 @@ function convertMessages(model: Model<"openai-responses">, context: Context): Re
 let msgId = textBlock.textSignature;
 if (!msgId) {
 msgId = `msg_${msgIndex}`;
+}
+// For incomplete turns, never replay the original message id (if any).
+// Generate a stable synthetic id so strict pairing providers do not expect a paired reasoning item.
+if (strictResponsesPairing && isIncomplete) {
+msgId = `msg_${msgIndex}_${shortHash(textBlock.text)}`;
 } else if (msgId.length > 64) {
 msgId = `msg_${shortHash(msgId)}`;
 }
@@ -486,7 +503,7 @@ function convertMessages(model: Model<"openai-responses">, context: Context): Re
 id: msgId,
 } satisfies ResponseOutputMessage);
 // Do not submit toolcall blocks if the completion had an error (i.e. abort)
-} else if (block.type === "toolCall" && msg.stopReason !== "error") {
+} else if (block.type === "toolCall" && allowToolCalls) {
 const toolCall = block as ToolCall;
 output.push({
 type: "function_call",
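The replay-flag logic added above can be isolated into a small pure helper so its decision table is easy to check. This sketch restates the diff's logic; the `StopReason` union is narrowed to the values that matter here, and the `Block` shape is a simplified stand-in for the library's content blocks:

```typescript
type StopReason = "stop" | "toolUse" | "aborted" | "error";

interface Block {
	type: "text" | "toolCall" | "thinking";
	text?: string;
}

function replayFlags(
	stopReason: StopReason,
	content: Block[],
	strictResponsesPairing: boolean,
): { shouldReplayReasoning: boolean; allowToolCalls: boolean } {
	// Default behavior: only drop reasoning/tool calls on hard errors.
	let shouldReplayReasoning = stopReason !== "error";
	let allowToolCalls = stopReason !== "error";
	if (strictResponsesPairing) {
		// Strict providers (e.g. Azure) reject reasoning items without a paired
		// message/function_call, so aborted turns count as incomplete too.
		const isIncomplete = stopReason === "error" || stopReason === "aborted";
		const hasPairedContent = content.some(
			(b) => b.type === "toolCall" || (b.type === "text" && (b.text ?? "").trim().length > 0),
		);
		shouldReplayReasoning = !isIncomplete && hasPairedContent;
		allowToolCalls = !isIncomplete;
	}
	return { shouldReplayReasoning, allowToolCalls };
}
```

The interesting divergence: without strict pairing, an aborted turn still replays its reasoning; with it, orphaned reasoning is suppressed even on a clean stop if no non-empty text or tool call would accompany it.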


@@ -204,10 +204,10 @@ export type AssistantMessageEvent =
 | { type: "error"; reason: Extract<StopReason, "aborted" | "error">; error: AssistantMessage };
 /**
- * Compatibility settings for openai-completions API.
+ * Compatibility settings for OpenAI-compatible completions APIs.
  * Use this to override URL-based auto-detection for custom providers.
  */
-export interface OpenAICompat {
+export interface OpenAICompletionsCompat {
 /** Whether the provider supports the `store` field. Default: auto-detected from URL. */
 supportsStore?: boolean;
 /** Whether the provider supports the `developer` role (vs `system`). Default: auto-detected from URL. */
@@ -230,6 +230,12 @@ export interface OpenAICompat {
 thinkingFormat?: "openai" | "zai";
 }
+/** Compatibility settings for OpenAI Responses APIs. */
+export interface OpenAIResponsesCompat {
+/** Whether OpenAI Responses history replay requires strict reasoning/message pairing (for providers like Azure). */
+strictResponsesPairing?: boolean;
+}
 // Model interface for the unified model system
 export interface Model<TApi extends Api> {
 id: string;
@@ -248,6 +254,10 @@ export interface Model<TApi extends Api> {
 contextWindow: number;
 maxTokens: number;
 headers?: Record<string, string>;
-/** Compatibility overrides for openai-completions API. If not set, auto-detected from baseUrl. */
-compat?: TApi extends "openai-completions" ? OpenAICompat : never;
+/** Compatibility overrides for OpenAI-compatible APIs. If not set, auto-detected from baseUrl. */
+compat?: TApi extends "openai-completions"
+? OpenAICompletionsCompat
+: TApi extends "openai-responses"
+? OpenAIResponsesCompat
+: never;
 }


@@ -110,8 +110,10 @@ describe("AI Providers Abort Tests", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider Abort", () => {
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini")!;
+void _compat;
 const llm: Model<"openai-completions"> = {
-...getModel("openai", "gpt-4o-mini")!,
+...baseModel,
 api: "openai-completions",
 };


@@ -466,7 +466,12 @@ describe("Cross-Provider Handoff Tests", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider Handoff", () => {
-const model: Model<"openai-completions"> = { ...getModel("openai", "gpt-4o-mini"), api: "openai-completions" };
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini");
+void _compat;
+const model: Model<"openai-completions"> = {
+...baseModel,
+api: "openai-completions",
+};
 it("should handle contexts from all providers", async () => {
 console.log("\nTesting OpenAI Completions with pre-built contexts:\n");


@@ -356,7 +356,12 @@ describe("Image Limits E2E Tests", () => {
 // Limits: 500 images, ~20MB per image (documented)
 // -------------------------------------------------------------------------
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI (gpt-4o-mini)", () => {
-const model: Model<"openai-completions"> = { ...getModel("openai", "gpt-4o-mini"), api: "openai-completions" };
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini");
+void _compat;
+const model: Model<"openai-completions"> = {
+...baseModel,
+api: "openai-completions",
+};
 it("should accept a small number of images (5)", async () => {
 const result = await testImageCount(model, 5, smallImage);


@@ -215,7 +215,12 @@ describe("Tool Results with Images", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider (gpt-4o-mini)", () => {
-const llm: Model<"openai-completions"> = { ...getModel("openai", "gpt-4o-mini"), api: "openai-completions" };
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini");
+void _compat;
+const llm: Model<"openai-completions"> = {
+...baseModel,
+api: "openai-completions",
+};
 it("should handle tool result with only image", { retry: 3, timeout: 30000 }, async () => {
 await handleToolWithImageResult(llm);


@@ -411,7 +411,12 @@ describe("Generate E2E Tests", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider (gpt-4o-mini)", () => {
-const llm: Model<"openai-completions"> = { ...getModel("openai", "gpt-4o-mini"), api: "openai-completions" };
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini");
+void _compat;
+const llm: Model<"openai-completions"> = {
+...baseModel,
+api: "openai-completions",
+};
 it("should complete basic text generation", { retry: 3 }, async () => {
 await basicTextGeneration(llm);


@@ -86,8 +86,10 @@ describe("Token Statistics on Abort", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider", () => {
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini")!;
+void _compat;
 const llm: Model<"openai-completions"> = {
-...getModel("openai", "gpt-4o-mini")!,
+...baseModel,
 api: "openai-completions",
 };


@@ -105,8 +105,10 @@ describe("Tool Call Without Result Tests", () => {
 });
 describe.skipIf(!process.env.OPENAI_API_KEY)("OpenAI Completions Provider", () => {
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini")!;
+void _compat;
 const model: Model<"openai-completions"> = {
-...getModel("openai", "gpt-4o-mini")!,
+...baseModel,
 api: "openai-completions",
 };


@@ -155,8 +155,10 @@ describe("totalTokens field", () => {
 "gpt-4o-mini - should return totalTokens equal to sum of components",
 { retry: 3, timeout: 60000 },
 async () => {
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-4o-mini")!;
+void _compat;
 const llm: Model<"openai-completions"> = {
-...getModel("openai", "gpt-4o-mini")!,
+...baseModel,
 api: "openai-completions",
 };


@@ -51,8 +51,10 @@ describe.skipIf(!process.env.OPENAI_API_KEY)("xhigh reasoning", () => {
 });
 it("should error with openai-completions when using xhigh", async () => {
+const { compat: _compat, ...baseModel } = getModel("openai", "gpt-5-mini");
+void _compat;
 const model: Model<"openai-completions"> = {
-...getModel("openai", "gpt-5-mini"),
+...baseModel,
 api: "openai-completions",
 };
 const s = stream(model, makeContext(), { reasoningEffort: "xhigh" });


@@ -2,6 +2,10 @@
 ## [Unreleased]
+### Added
+- Added `strictResponsesPairing` compat option for custom OpenAI Responses models on Azure ([#768](https://github.com/badlogic/pi-mono/pull/768) by [@nicobako](https://github.com/nicobako))
 ### Changed
 - Share URLs now use hash fragments (`#`) instead of query strings (`?`) to prevent session IDs from being sent to buildwithpi.ai ([#828](https://github.com/badlogic/pi-mono/issues/828))


@@ -735,6 +735,8 @@ To fully replace a built-in provider with custom models, include the `models` ar
 **OpenAI compatibility (`compat` field):**
+**OpenAI Completions (`openai-completions`):**
 | Field | Description |
 |-------|-------------|
 | `supportsStore` | Whether provider supports `store` field |
@@ -743,6 +745,14 @@ To fully replace a built-in provider with custom models, include the `models` ar
 | `supportsUsageInStreaming` | Whether provider supports `stream_options: { include_usage: true }`. Default: `true` |
 | `maxTokensField` | Use `max_completion_tokens` or `max_tokens` |
+**OpenAI Responses (`openai-responses`):**
+| Field | Description |
+|-------|-------------|
+| `strictResponsesPairing` | Enforce strict reasoning/message pairing when replaying OpenAI Responses history on providers like Azure (default: `false`) |
+If you see 400 errors like "item of type 'reasoning' was provided without its required following item" or "message/function_call was provided without its required reasoning item", set `compat.strictResponsesPairing: true` on the affected model in `models.json`.
 **Live reload:** The file reloads each time you open `/model`. Edit during session; no restart needed.
 **Model selection priority:**
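To make the docs change above concrete, a hypothetical `models.json` fragment with the flag enabled. Only the `compat.strictResponsesPairing` field comes from this diff; the provider name, URL, model id, and surrounding structure are illustrative and must be adapted to the actual `models.json` schema:

```json
{
	"azure-openai": {
		"baseUrl": "https://example.openai.azure.com/openai/v1",
		"models": [
			{
				"id": "gpt-5-mini",
				"api": "openai-responses",
				"compat": { "strictResponsesPairing": true }
			}
		]
	}
}
```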


@@ -20,13 +20,19 @@ import type { AuthStorage } from "./auth-storage.js";
 const Ajv = (AjvModule as any).default || AjvModule;
 // Schema for OpenAI compatibility settings
-const OpenAICompatSchema = Type.Object({
+const OpenAICompletionsCompatSchema = Type.Object({
 supportsStore: Type.Optional(Type.Boolean()),
 supportsDeveloperRole: Type.Optional(Type.Boolean()),
 supportsReasoningEffort: Type.Optional(Type.Boolean()),
 maxTokensField: Type.Optional(Type.Union([Type.Literal("max_completion_tokens"), Type.Literal("max_tokens")])),
 });
+const OpenAIResponsesCompatSchema = Type.Object({
+strictResponsesPairing: Type.Optional(Type.Boolean()),
+});
+const OpenAICompatSchema = Type.Union([OpenAICompletionsCompatSchema, OpenAIResponsesCompatSchema]);
 // Schema for custom model definition
 const ModelDefinitionSchema = Type.Object({
 id: Type.String({ minLength: 1 }),