Add image support in tool results across all providers

Tool results now use content blocks and can include both text and images.
All providers (Anthropic, Google, OpenAI Completions, OpenAI Responses)
correctly pass images from tool results to LLMs.

- Update ToolResultMessage type to use content blocks
- Add placeholder text for image-only tool results in Google/Anthropic
- OpenAI providers send tool result + follow-up user message with images
- Fix Anthropic JSON parsing for empty tool arguments
- Add comprehensive tests for image-only and text+image tool results
- Update README with tool result content blocks API
Mario Zechner 2025-11-12 10:45:56 +01:00
parent 9dac37d836
commit 84dcab219b
37 changed files with 720 additions and 544 deletions
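
The content-block shape described in the commit message can be sketched roughly as follows. The field names (`type`, `text`, `data`, `mimeType`) mirror the blocks that appear in this diff, but the actual definitions in `@mariozechner/pi-ai` may differ. The `ToolResultContent` alias, the `withPlaceholder` helper, and the placeholder string are hypothetical and only illustrate the "placeholder text for image-only tool results" idea.

```typescript
// Field names follow the blocks used in this commit's diff; the real
// type definitions in the library are not shown here and may differ.
interface TextContent {
  type: "text";
  text: string;
}

interface ImageContent {
  type: "image";
  data: string; // base64-encoded image bytes
  mimeType: string; // e.g. "image/png"
}

// Hypothetical alias, introduced for this example only.
type ToolResultContent = (TextContent | ImageContent)[];

// Hypothetical helper: if a result has no non-empty text block,
// prepend a placeholder so the result is never image-only.
function withPlaceholder(
  content: ToolResultContent,
  placeholder = "(see attached image)",
): ToolResultContent {
  const hasText = content.some((block) => block.type === "text" && block.text.length > 0);
  return hasText ? content : [{ type: "text", text: placeholder }, ...content];
}

const imageOnly: ToolResultContent = [{ type: "image", data: "aGVsbG8=", mimeType: "image/png" }];
const normalized = withPlaceholder(imageOnly);
```

Results that already contain text pass through unchanged, so the helper is safe to apply to every tool result.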

@ -1,66 +0,0 @@
{
"name": "example",
"version": "1.0.0",
"description": "A JSON file formatted with tabs",
"main": "index.js",
"type": "module",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1",
"start": "node index.js",
"dev": "nodemon index.js",
"build": "tsc",
"lint": "eslint .",
"format": "prettier --write .",
"clean": "rm -rf dist node_modules"
},
"keywords": [
"example",
"json",
"tabs",
"nodejs",
"typescript",
"api"
],
"author": "Assistant",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/example/example-repo.git"
},
"bugs": {
"url": "https://github.com/example/example-repo/issues"
},
"homepage": "https://github.com/example/example-repo#readme",
"engines": {
"node": ">=18.0.0",
"npm": ">=9.0.0"
},
"dependencies": {
"express": "^4.18.0",
"dotenv": "^16.0.3",
"axios": "^1.6.0",
"lodash": "^4.17.21",
"mongoose": "^8.0.0",
"redis": "^4.6.0",
"jsonwebtoken": "^9.0.2",
"bcrypt": "^5.1.1",
"winston": "^3.11.0"
},
"devDependencies": {
"@types/node": "^20.10.0",
"@types/express": "^4.17.21",
"@types/bcrypt": "^5.0.2",
"@types/jsonwebtoken": "^9.0.5",
"typescript": "^5.3.3",
"nodemon": "^3.0.2",
"eslint": "^8.55.0",
"prettier": "^3.1.1",
"vitest": "^1.0.4",
"supertest": "^6.3.3"
},
"config": {
"port": 3000,
"env": "development"
},
"private": false
}

@ -1,79 +0,0 @@
The Amazing Adventures of Fox and Dog
======================================
Long ago, in a mystical forest clearing, there lived an incredibly fast brown fox.
This legendary fox was renowned throughout the entire woodland for its incredible speed and agility.
Each dawn, the fox would sprint through the ancient trees, soaring over logs and babbling brooks.
The woodland creatures gazed in wonder as it flashed past them like a streak of copper lightning.
At the clearing's edge, there also lived a very lazy dog.
This happy dog much preferred napping in the warm sunshine to any kind of adventure.
One fateful morning, the fox challenged the dog to an epic race across the meadow.
The dog yawned deeply and declined, saying "Why rush around when you can rest peacefully?"
The fox laughed and zipped away, exploring distant hills and valleys.
The dog simply rolled over and continued its peaceful slumber.
As the sun set, the fox returned, exhausted from its day of running.
The dog opened one eye and wagged its tail contentedly.
"I've seen the whole world today!" exclaimed the tired fox proudly.
"And I've enjoyed every moment right here," replied the lazy dog.
Sometimes speed and adventure bring joy to life's journey.
Other times, stillness and contentment are the greatest treasures.
Both the quick fox and the lazy dog lived happily in their own ways.
And so their friendship grew stronger with each passing season.
The fox would return from adventures with tales of distant lands.
The dog would listen contentedly, never needing to leave home.
They learned that happiness comes in many different forms.
The forest creatures admired their unlikely bond.
Some days the fox would rest beside the dog in the sunshine.
Other days the dog would take a short stroll with the fox.
They discovered balance between motion and stillness.
The wise old owl observed them from his towering oak.
He noted that both had found their true nature.
Winter came and blanketed the forest in sparkling snow.
The fox's copper fur stood out against the white landscape.
The dog found a cozy spot by the warmest rock.
They shared stories as snowflakes drifted down around them.
Spring arrived with flowers blooming across the meadow.
The fox chased butterflies through fields of wildflowers.
The dog rolled in patches of soft clover and sweet grass.
Summer brought long days of golden light and warmth.
The fox discovered hidden streams in the deep forest.
The dog found the perfect shady spot beneath an elm tree.
Autumn painted the woods in brilliant reds and golds.
The fox leaped through piles of crunchy fallen leaves.
The dog watched the changing colors from his favorite perch.
Years passed and both grew wiser in their own ways.
The fox learned when to rest and the dog learned when to play.
Young animals would visit to hear their wisdom.
"Be true to yourself," the fox would always say.
"Find joy in your own path," the dog would add.
Their story spread throughout the woodland realm.
It became a tale told to every new generation.
Parents would share it with their curious young ones.
Teachers would use it in lessons about acceptance.
Travelers would stop to see the famous pair.
Artists painted pictures of the fox and dog together.
Poets wrote verses about their enduring friendship.
Musicians composed songs celebrating their harmony.
The clearing became a place of peace and understanding.
All creatures were welcome to rest there.
The fox still runs when the spirit moves him.
The dog still naps when the mood strikes him.
Neither judges the other for their choices.
Both have found contentment in being themselves.
The moon rises over the peaceful forest each night.
Stars twinkle above the quiet clearing.
The fox and dog sleep side by side.
Dreams of adventure and rest mingle together.
Morning will bring new possibilities for both.
But tonight, all is calm and perfect.
This is how true friendship looks.
The End.

@ -1,263 +0,0 @@
{
"project": {
"id": "proj_9876543210",
"name": "Advanced E-Commerce Platform",
"description": "A comprehensive multi-vendor marketplace with real-time analytics",
"status": "active",
"created": "2024-01-15T08:30:00Z",
"updated": "2024-03-20T14:45:00Z",
"version": "2.4.1"
},
"team": {
"members": [
{
"id": "usr_001",
"name": "Sarah Chen",
"role": "Lead Developer",
"email": "sarah.chen@example.com",
"skills": ["TypeScript", "React", "Node.js", "PostgreSQL"],
"joined": "2023-06-01",
"active": true
},
{
"id": "usr_002",
"name": "Marcus Johnson",
"role": "Backend Engineer",
"email": "marcus.j@example.com",
"skills": ["Python", "Django", "Redis", "Docker"],
"joined": "2023-07-15",
"active": true
},
{
"id": "usr_003",
"name": "Elena Rodriguez",
"role": "UX Designer",
"email": "elena.r@example.com",
"skills": ["Figma", "UI/UX", "Prototyping", "User Research"],
"joined": "2023-08-20",
"active": true
},
{
"id": "usr_004",
"name": "Ahmed Hassan",
"role": "DevOps Engineer",
"email": "ahmed.h@example.com",
"skills": ["Kubernetes", "AWS", "Terraform", "CI/CD"],
"joined": "2023-09-10",
"active": true
}
],
"departments": ["Engineering", "Design", "Operations", "Marketing"]
},
"features": {
"authentication": {
"enabled": true,
"providers": ["email", "google", "github", "facebook"],
"mfa": true,
"sessionTimeout": 3600,
"passwordPolicy": {
"minLength": 12,
"requireUppercase": true,
"requireNumbers": true,
"requireSpecialChars": true
}
},
"payments": {
"enabled": true,
"gateways": ["stripe", "paypal", "square"],
"currencies": ["USD", "EUR", "GBP", "JPY", "CAD", "AUD"],
"refunds": true,
"subscriptions": true
},
"analytics": {
"enabled": true,
"realtime": true,
"metrics": ["pageViews", "conversions", "revenue", "userActivity"],
"reporting": {
"daily": true,
"weekly": true,
"monthly": true,
"custom": true
}
}
},
"infrastructure": {
"cloud": {
"provider": "AWS",
"region": "us-east-1",
"zones": ["us-east-1a", "us-east-1b", "us-east-1c"],
"services": {
"compute": ["EC2", "Lambda", "ECS"],
"storage": ["S3", "EBS", "EFS"],
"database": ["RDS", "DynamoDB", "ElastiCache"],
"networking": ["VPC", "CloudFront", "Route53"]
}
},
"monitoring": {
"tools": ["Prometheus", "Grafana", "DataDog", "Sentry"],
"alerts": {
"email": true,
"slack": true,
"pagerduty": true
}
}
},
"api": {
"version": "v2",
"baseUrl": "https://api.example.com",
"endpoints": [
{
"path": "/users",
"methods": ["GET", "POST", "PUT", "DELETE"],
"auth": true,
"rateLimit": 1000
},
{
"path": "/products",
"methods": ["GET", "POST", "PUT", "DELETE"],
"auth": true,
"rateLimit": 5000
},
{
"path": "/orders",
"methods": ["GET", "POST", "PUT"],
"auth": true,
"rateLimit": 2000
},
{
"path": "/analytics",
"methods": ["GET"],
"auth": true,
"rateLimit": 500
}
],
"documentation": "https://docs.example.com/api"
},
"database": {
"primary": {
"type": "PostgreSQL",
"version": "15.2",
"host": "db-primary.example.com",
"port": 5432,
"replicas": 3,
"backup": {
"enabled": true,
"frequency": "hourly",
"retention": 30
}
},
"cache": {
"type": "Redis",
"version": "7.0",
"host": "cache.example.com",
"port": 6379,
"ttl": 3600
}
},
"security": {
"ssl": {
"enabled": true,
"provider": "LetsEncrypt",
"autoRenew": true
},
"firewall": {
"enabled": true,
"rules": [
{
"name": "allow-https",
"port": 443,
"protocol": "TCP",
"source": "0.0.0.0/0"
},
{
"name": "allow-http",
"port": 80,
"protocol": "TCP",
"source": "0.0.0.0/0"
},
{
"name": "allow-ssh",
"port": 22,
"protocol": "TCP",
"source": "10.0.0.0/8"
}
]
},
"scanning": {
"vulnerabilities": true,
"dependencies": true,
"secrets": true
}
},
"testing": {
"unit": {
"framework": "Vitest",
"coverage": 87.5,
"threshold": 80
},
"integration": {
"framework": "Playwright",
"browsers": ["chromium", "firefox", "webkit"],
"coverage": 72.3
},
"e2e": {
"framework": "Cypress",
"coverage": 65.8
}
},
"deployment": {
"strategy": "blue-green",
"automation": true,
"environments": [
{
"name": "development",
"url": "https://dev.example.com",
"branch": "develop",
"autoDeployOn": ["push"]
},
{
"name": "staging",
"url": "https://staging.example.com",
"branch": "staging",
"autoDeployOn": ["pull_request"]
},
{
"name": "production",
"url": "https://example.com",
"branch": "main",
"autoDeployOn": ["tag"]
}
]
},
"logs": {
"level": "info",
"format": "json",
"retention": 90,
"aggregation": {
"enabled": true,
"service": "CloudWatch",
"queries": [
"error count by hour",
"request latency p95",
"unique users per day"
]
}
},
"compliance": {
"gdpr": true,
"ccpa": true,
"hipaa": false,
"soc2": true,
"dataRetention": {
"user": 2555,
"logs": 90,
"backups": 30
}
},
"metadata": {
"tags": ["production", "ecommerce", "marketplace", "saas"],
"owner": "engineering-team",
"costCenter": "CC-2024-001",
"criticality": "high"
}
}

@ -95,7 +95,7 @@ export const bashTool: AgentTool<typeof bashSchema> = {
if (output) output += "\n\n";
reject(new Error(`${output}Command exited with code ${code}`));
} else {
resolve({ output: output || "(no output)", details: undefined });
resolve({ content: [{ type: "text", text: output || "(no output)" }], details: undefined });
}
});

@ -37,7 +37,7 @@ export const editTool: AgentTool<typeof editSchema> = {
) => {
const absolutePath = resolvePath(expandPath(path));
return new Promise<{ output: string; details: undefined }>((resolve, reject) => {
return new Promise<{ content: Array<{ type: "text"; text: string }>; details: undefined }>((resolve, reject) => {
// Check if already aborted
if (signal?.aborted) {
reject(new Error("Operation aborted"));
@ -131,7 +131,12 @@ export const editTool: AgentTool<typeof editSchema> = {
}
resolve({
output: `Successfully replaced text in ${path}. Changed ${oldText.length} characters to ${newText.length} characters.`,
content: [
{
type: "text",
text: `Successfully replaced text in ${path}. Changed ${oldText.length} characters to ${newText.length} characters.`,
},
],
details: undefined,
});
} catch (error: any) {

@ -1,9 +1,9 @@
import * as os from "node:os";
import type { AgentTool } from "@mariozechner/pi-ai";
import type { AgentTool, ImageContent, TextContent } from "@mariozechner/pi-ai";
import { Type } from "@sinclair/typebox";
import { constants } from "fs";
import { access, readFile } from "fs/promises";
import { resolve as resolvePath } from "path";
import { extname, resolve as resolvePath } from "path";
/**
* Expand ~ to home directory
@ -18,6 +18,27 @@ function expandPath(filePath: string): string {
return filePath;
}
/**
* Map of file extensions to MIME types for common image formats
*/
const IMAGE_MIME_TYPES: Record<string, string> = {
".jpg": "image/jpeg",
".jpeg": "image/jpeg",
".png": "image/png",
".gif": "image/gif",
".webp": "image/webp",
".bmp": "image/bmp",
".svg": "image/svg+xml",
};
/**
* Check if a file is an image based on its extension
*/
function isImageFile(filePath: string): string | null {
const ext = extname(filePath).toLowerCase();
return IMAGE_MIME_TYPES[ext] || null;
}
const readSchema = Type.Object({
path: Type.String({ description: "Path to the file to read (relative or absolute)" }),
});
@ -25,12 +46,14 @@ const readSchema = Type.Object({
export const readTool: AgentTool<typeof readSchema> = {
name: "read",
label: "read",
description: "Read the contents of a file. Returns the full file content as text.",
description:
"Read the contents of a file. Supports text files and images (jpg, png, gif, webp, bmp, svg). Images are sent as attachments to the model.",
parameters: readSchema,
execute: async (_toolCallId: string, { path }: { path: string }, signal?: AbortSignal) => {
const absolutePath = resolvePath(expandPath(path));
const mimeType = isImageFile(absolutePath);
return new Promise<{ output: string; details: undefined }>((resolve, reject) => {
return new Promise<{ content: (TextContent | ImageContent)[]; details: undefined }>((resolve, reject) => {
// Check if already aborted
if (signal?.aborted) {
reject(new Error("Operation aborted"));
@ -68,8 +91,23 @@ export const readTool: AgentTool<typeof readSchema> = {
return;
}
// Read the file
const content = await readFile(absolutePath, "utf-8");
// Read the file based on type
let content: (TextContent | ImageContent)[];
if (mimeType) {
// Read as image (binary)
const buffer = await readFile(absolutePath);
const base64 = buffer.toString("base64");
content = [
{ type: "text", text: `Read image file: ${path}` },
{ type: "image", data: base64, mimeType },
];
} else {
// Read as text
const textContent = await readFile(absolutePath, "utf-8");
content = [{ type: "text", text: textContent }];
}
// Check if aborted after reading
if (aborted) {
@ -81,7 +119,7 @@ export const readTool: AgentTool<typeof readSchema> = {
signal.removeEventListener("abort", onAbort);
}
resolve({ output: content, details: undefined });
resolve({ content, details: undefined });
} catch (error: any) {
// Clean up abort handler
if (signal) {
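
The extension lookup added to the read tool above can be tried in isolation. This is a minimal standalone sketch: the table is copied from the diff, while the name `imageMimeType` is hypothetical (chosen because the function returns a MIME type rather than a boolean). Detection relies on the file extension alone, so a mislabeled file is not recognized by its contents.

```typescript
import { extname } from "node:path";

// Same extension table as in the diff above.
const IMAGE_MIME_TYPES: Record<string, string> = {
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".png": "image/png",
  ".gif": "image/gif",
  ".webp": "image/webp",
  ".bmp": "image/bmp",
  ".svg": "image/svg+xml",
};

// Hypothetical name; returns the MIME type for known image extensions,
// or null for everything else (including files without an extension).
function imageMimeType(filePath: string): string | null {
  const ext = extname(filePath).toLowerCase();
  return IMAGE_MIME_TYPES[ext] ?? null;
}

const upperCase = imageMimeType("photo.JPG"); // extension match is case-insensitive
const plainText = imageMimeType("notes.txt"); // not in the table
const noExtension = imageMimeType("Makefile"); // extname returns ""
```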

@ -32,7 +32,7 @@ export const writeTool: AgentTool<typeof writeSchema> = {
const absolutePath = resolvePath(expandPath(path));
const dir = dirname(absolutePath);
return new Promise<{ output: string; details: undefined }>((resolve, reject) => {
return new Promise<{ content: Array<{ type: "text"; text: string }>; details: undefined }>((resolve, reject) => {
// Check if already aborted
if (signal?.aborted) {
reject(new Error("Operation aborted"));
@ -75,7 +75,10 @@ export const writeTool: AgentTool<typeof writeSchema> = {
signal.removeEventListener("abort", onAbort);
}
resolve({ output: `Successfully wrote ${content.length} bytes to ${path}`, details: undefined });
resolve({
content: [{ type: "text", text: `Successfully wrote ${content.length} bytes to ${path}` }],
details: undefined,
});
} catch (error: any) {
// Clean up abort handler
if (signal) {
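
The bash, edit, and write tools above all wrap a single string in a one-element `content` array. A small helper could express that pattern once. The sketch below is not part of this commit: `textResult` and `TextToolResult` are hypothetical names, and the block shape is assumed from the diff.

```typescript
// Block shape assumed from the diff; not the library's actual type.
interface TextContent {
  type: "text";
  text: string;
}

// Hypothetical result type for tools that only ever return text.
interface TextToolResult {
  content: TextContent[];
  details: undefined;
}

// Hypothetical helper: wrap a plain string in a single text block.
function textResult(text: string): TextToolResult {
  return { content: [{ type: "text", text }], details: undefined };
}

const byteCount = 5;
const targetPath = "notes.txt";
const writeResult = textResult(`Successfully wrote ${byteCount} bytes to ${targetPath}`);
```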

@ -60,7 +60,10 @@ export class ToolExecutionComponent extends Container {
private contentText: Text;
private toolName: string;
private args: any;
private result?: { output: string; isError: boolean };
private result?: {
content: Array<{ type: string; text?: string; data?: string; mimeType?: string }>;
isError: boolean;
};
constructor(toolName: string, args: any) {
super();
@ -78,7 +81,10 @@ export class ToolExecutionComponent extends Container {
this.updateDisplay();
}
updateResult(result: { output: string; isError: boolean }): void {
updateResult(result: {
content: Array<{ type: string; text?: string; data?: string; mimeType?: string }>;
isError: boolean;
}): void {
this.result = result;
this.updateDisplay();
}
@ -94,6 +100,24 @@ export class ToolExecutionComponent extends Container {
this.contentText.setText(this.formatToolExecution());
}
private getTextOutput(): string {
if (!this.result) return "";
// Extract text from content blocks
const textBlocks = this.result.content?.filter((c: any) => c.type === "text") || [];
const imageBlocks = this.result.content?.filter((c: any) => c.type === "image") || [];
let output = textBlocks.map((c: any) => c.text).join("\n");
// Add indicator for images
if (imageBlocks.length > 0) {
const imageIndicators = imageBlocks.map((img: any) => `[Image: ${img.mimeType}]`).join("\n");
output = output ? `${output}\n${imageIndicators}` : imageIndicators;
}
return output;
}
private formatToolExecution(): string {
let text = "";
@ -104,7 +128,7 @@ export class ToolExecutionComponent extends Container {
if (this.result) {
// Show output without code fences - more minimal
const output = this.result.output.trim();
const output = this.getTextOutput().trim();
if (output) {
const lines = output.split("\n");
const maxLines = 5;
@ -122,7 +146,8 @@ export class ToolExecutionComponent extends Container {
text = chalk.bold("read") + " " + (path ? chalk.cyan(path) : chalk.dim("..."));
if (this.result) {
const lines = this.result.output.split("\n");
const output = this.getTextOutput();
const lines = output.split("\n");
const maxLines = 10;
const displayLines = lines.slice(0, maxLines);
const remaining = lines.length - maxLines;
@ -168,8 +193,9 @@ export class ToolExecutionComponent extends Container {
const content = JSON.stringify(this.args, null, 2);
text += "\n\n" + content;
if (this.result?.output) {
text += "\n" + this.result.output;
const output = this.getTextOutput();
if (output) {
text += "\n" + output;
}
}
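
The text extraction in `getTextOutput` above relies on `any` casts. A typed variant might look like the following sketch. It is an illustration rather than the committed code: `ContentBlock`, `TextBlock`, `ImageBlock`, and `summarizeContent` are hypothetical names, and the block shape is assumed from the diff.

```typescript
// Block shapes assumed from the diff; names are hypothetical.
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; data: string; mimeType: string };
type ContentBlock = TextBlock | ImageBlock;

// Join all text blocks, then append one "[Image: <mime>]" line per image,
// skipping whichever part is empty.
function summarizeContent(content: ContentBlock[]): string {
  const text = content
    .filter((block): block is TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("\n");
  const images = content
    .filter((block): block is ImageBlock => block.type === "image")
    .map((block) => `[Image: ${block.mimeType}]`)
    .join("\n");
  return [text, images].filter((part) => part.length > 0).join("\n");
}

const summary = summarizeContent([
  { type: "text", text: "Read image file: logo.png" },
  { type: "image", data: "aGVsbG8=", mimeType: "image/png" },
]);
```

The type-guard callbacks let the compiler narrow each filtered array, so `text` and `mimeType` are accessed without casts.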

@ -244,7 +244,7 @@ export class TuiRenderer {
assistantMsg.stopReason === "aborted" ? "Operation aborted" : assistantMsg.errorMessage || "Error";
for (const [toolCallId, component] of this.pendingTools.entries()) {
component.updateResult({
output: errorMessage,
content: [{ type: "text", text: errorMessage }],
isError: true,
});
}
@ -273,8 +273,12 @@ export class TuiRenderer {
const component = this.pendingTools.get(event.toolCallId);
if (component) {
// Update the component with the result
const content =
typeof event.result === "string"
? [{ type: "text" as const, text: event.result }]
: event.result.content;
component.updateResult({
output: typeof event.result === "string" ? event.result : event.result.output,
content,
isError: event.isError,
});
this.pendingTools.delete(event.toolCallId);
@ -358,7 +362,7 @@ export class TuiRenderer {
? "Operation aborted"
: assistantMsg.errorMessage || "Error";
component.updateResult({
output: errorMessage,
content: [{ type: "text", text: errorMessage }],
isError: true,
});
} else {
@ -373,7 +377,7 @@ export class TuiRenderer {
const component = this.pendingTools.get(toolResultMsg.toolCallId);
if (component) {
component.updateResult({
output: toolResultMsg.output,
content: toolResultMsg.content,
isError: toolResultMsg.isError,
});
// Remove from pending map since it's complete

@ -1,28 +0,0 @@
{
"name": "test-file",
"version": "1.0.0",
"description": "A test JSON file with tab indentation",
"author": "coding-agent",
"data": {
"items": [
{
"id": 1,
"name": "First item",
"active": true
},
{
"id": 2,
"name": "Second item",
"active": false
}
],
"metadata": {
"created": "2024-11-11",
"tags": [
"test",
"example",
"json"
]
}
}
}

@ -7,6 +7,16 @@ import { editTool } from "../src/tools/edit.js";
import { readTool } from "../src/tools/read.js";
import { writeTool } from "../src/tools/write.js";
// Helper to extract text from content blocks
function getTextOutput(result: any): string {
return (
result.content
?.filter((c: any) => c.type === "text")
.map((c: any) => c.text)
.join("\n") || ""
);
}
describe("Coding Agent Tools", () => {
let testDir: string;
@ -29,7 +39,7 @@ describe("Coding Agent Tools", () => {
const result = await readTool.execute("test-call-1", { path: testFile });
expect(result.output).toBe(content);
expect(getTextOutput(result)).toBe(content);
expect(result.details).toBeUndefined();
});
@ -38,8 +48,8 @@ describe("Coding Agent Tools", () => {
const result = await readTool.execute("test-call-2", { path: testFile });
expect(result.output).toContain("Error");
expect(result.output).toContain("File not found");
expect(getTextOutput(result)).toContain("Error");
expect(getTextOutput(result)).toContain("File not found");
});
});
@ -50,8 +60,8 @@ describe("Coding Agent Tools", () => {
const result = await writeTool.execute("test-call-3", { path: testFile, content });
expect(result.output).toContain("Successfully wrote");
expect(result.output).toContain(testFile);
expect(getTextOutput(result)).toContain("Successfully wrote");
expect(getTextOutput(result)).toContain(testFile);
expect(result.details).toBeUndefined();
});
@ -61,7 +71,7 @@ describe("Coding Agent Tools", () => {
const result = await writeTool.execute("test-call-4", { path: testFile, content });
expect(result.output).toContain("Successfully wrote");
expect(getTextOutput(result)).toContain("Successfully wrote");
});
});
@ -77,7 +87,7 @@ describe("Coding Agent Tools", () => {
newText: "testing",
});
expect(result.output).toContain("Successfully replaced");
expect(getTextOutput(result)).toContain("Successfully replaced");
expect(result.details).toBeUndefined();
});
@ -92,7 +102,7 @@ describe("Coding Agent Tools", () => {
newText: "testing",
});
expect(result.output).toContain("Could not find the exact text");
expect(getTextOutput(result)).toContain("Could not find the exact text");
});
it("should fail if text appears multiple times", async () => {
@ -106,7 +116,7 @@ describe("Coding Agent Tools", () => {
newText: "bar",
});
expect(result.output).toContain("Found 3 occurrences");
expect(getTextOutput(result)).toContain("Found 3 occurrences");
});
});
@ -114,20 +124,20 @@ describe("Coding Agent Tools", () => {
it("should execute simple commands", async () => {
const result = await bashTool.execute("test-call-8", { command: "echo 'test output'" });
expect(result.output).toContain("test output");
expect(getTextOutput(result)).toContain("test output");
expect(result.details).toBeUndefined();
});
it("should handle command errors", async () => {
const result = await bashTool.execute("test-call-9", { command: "exit 1" });
expect(result.output).toContain("Command failed");
expect(getTextOutput(result)).toContain("Command failed");
});
it("should respect timeout", async () => {
const result = await bashTool.execute("test-call-10", { command: "sleep 35" });
expect(result.output).toContain("Command failed");
expect(getTextOutput(result)).toContain("Command failed");
}, 35000);
});
});