mirror of
https://github.com/getcompanion-ai/co-mono.git
synced 2026-04-15 13:03:42 +00:00
docs(mom): multi-platform redesign with channel isolation
- Event-based architecture: agent forwards AgentSessionEvent to adapters - Multi-adapter support via single config.json - Access control: admins, DM permissions per adapter - Channel namespacing by adapter name - Channel isolation via bubblewrap in Linux/Docker environments - Simplified directory structure and interfaces
This commit is contained in:
parent
c97702cf91
commit
d07d784ed3
1 changed files with 970 additions and 0 deletions
970
packages/mom/docs/new.md
Normal file
970
packages/mom/docs/new.md
Normal file
|
|
@ -0,0 +1,970 @@
|
|||
# Mom Redesign: Multi-Platform Chat Support
|
||||
|
||||
## Goals
|
||||
|
||||
1. Support multiple chat platforms (Slack, Discord, WhatsApp, Telegram, etc.)
|
||||
2. Unified storage layer for all platforms
|
||||
3. Platform-agnostic agent that doesn't care where messages come from
|
||||
4. Adapters that are independently testable
|
||||
5. Agent that is independently testable
|
||||
|
||||
## Current Architecture Problems
|
||||
|
||||
The current architecture tightly couples Slack-specific code throughout:
|
||||
|
||||
```
|
||||
main.ts → SlackBot → handler.handleEvent() → agent.run(SlackContext)
|
||||
↓
|
||||
SlackContext.respond()
|
||||
SlackContext.replaceMessage()
|
||||
SlackContext.respondInThread()
|
||||
etc.
|
||||
```
|
||||
|
||||
Problems:
|
||||
- `SlackContext` interface leaks Slack concepts (threads, typing indicators)
|
||||
- Agent code references Slack-specific formatting (mrkdwn, `<@user>` mentions)
|
||||
- Storage uses Slack timestamps (`ts`) as message IDs
|
||||
- Message logging assumes Slack's event structure
|
||||
- The PR's Discord implementation duplicated most of this logic in a separate package
|
||||
|
||||
## Proposed Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ CLI / Entry Point │
|
||||
│ mom ./data │
|
||||
│ (reads config.json, starts all configured adapters) │
|
||||
└───────────────────────────────────┬─────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Platform Adapter │
|
||||
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
||||
│ │ SlackAdapter │ │DiscordAdapter│ │ CLIAdapter │ (for testing) │
|
||||
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
|
||||
│ │ │ │ │
|
||||
│ └────────────────┬┴─────────────────┘ │
|
||||
│ │ │
|
||||
│ ▼ │
|
||||
│ ┌───────────────────────┐ │
|
||||
│ │ PlatformAdapter │ (common interface) │
|
||||
│ │ - onMessage() │ │
|
||||
│ │ - onStop() │ │
|
||||
│ │ - sendMessage() │ │
|
||||
│ │ - updateMessage() │ │
|
||||
│ │ - deleteMessage() │ │
|
||||
│ │ - uploadFile() │ │
|
||||
│ │ - getChannelInfo() │ │
|
||||
│ │ - getUserInfo() │ │
|
||||
│ └───────────┬───────────┘ │
|
||||
└──────────────────────────┼──────────────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ MomAgent │
|
||||
│ - Platform agnostic │
|
||||
│ - Receives messages via handleMessage(message, context, onEvent) │
|
||||
│ - Forwards AgentSessionEvent to adapter via callback │
|
||||
│ - Provides: abort(), isRunning() │
|
||||
└───────────────────────────────────┬─────────────────────────────────────┘
|
||||
│
|
||||
▼
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ ChannelStore │
|
||||
│ - Unified storage schema for all platforms │
|
||||
│ - log.jsonl: channel history (messages only) │
|
||||
│ - context.jsonl: LLM context (messages + tool results) │
|
||||
│ - attachments/: downloaded files │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
## Key Interfaces
|
||||
|
||||
### 1. ChannelMessage (Unified Message Format)
|
||||
|
||||
```typescript
|
||||
interface ChannelMessage {
|
||||
/** Unique ID within the channel (platform-specific format preserved) */
|
||||
id: string;
|
||||
|
||||
/** Channel/conversation ID */
|
||||
channelId: string;
|
||||
|
||||
/** Timestamp (ISO 8601) */
|
||||
timestamp: string;
|
||||
|
||||
/** Sender info */
|
||||
sender: {
|
||||
id: string;
|
||||
username: string;
|
||||
displayName?: string;
|
||||
isBot: boolean;
|
||||
};
|
||||
|
||||
/** Message content (as received from platform) */
|
||||
text: string;
|
||||
|
||||
/** Optional: original platform-specific text (for debugging) */
|
||||
rawText?: string;
|
||||
|
||||
/** Attachments */
|
||||
attachments: ChannelAttachment[];
|
||||
|
||||
/** Is this a direct mention/trigger of the bot? */
|
||||
isMention: boolean;
|
||||
|
||||
/** Optional: reply-to message ID (for threaded conversations) */
|
||||
replyTo?: string;
|
||||
|
||||
/** Platform-specific metadata (for platform-specific features) */
|
||||
metadata?: Record<string, unknown>;
|
||||
}
|
||||
|
||||
interface ChannelAttachment {
|
||||
/** Original filename */
|
||||
filename: string;
|
||||
|
||||
/** Local path (relative to channel dir) */
|
||||
localPath: string;
|
||||
|
||||
/** MIME type if known */
|
||||
mimeType?: string;
|
||||
|
||||
/** File size in bytes */
|
||||
size?: number;
|
||||
}
|
||||
```
|
||||
|
||||
### 2. PlatformAdapter
|
||||
|
||||
Adapters handle platform connection and UI. They receive events from MomAgent and render however they want.
|
||||
|
||||
```typescript
|
||||
interface PlatformAdapter {
|
||||
/** Adapter name (used in channel paths, e.g., "slack-acme") */
|
||||
name: string;
|
||||
|
||||
/** Start the adapter (connect to platform) */
|
||||
start(): Promise<void>;
|
||||
|
||||
/** Stop the adapter */
|
||||
stop(): Promise<void>;
|
||||
|
||||
/** Get all known channels */
|
||||
getChannels(): ChannelInfo[];
|
||||
|
||||
/** Get all known users */
|
||||
getUsers(): UserInfo[];
|
||||
}
|
||||
|
||||
interface ChannelInfo {
|
||||
id: string;
|
||||
name: string;
|
||||
type: 'channel' | 'dm' | 'group';
|
||||
}
|
||||
|
||||
interface UserInfo {
|
||||
id: string;
|
||||
username: string;
|
||||
displayName?: string;
|
||||
}
|
||||
```
|
||||
|
||||
### 3. MomAgent
|
||||
|
||||
MomAgent wraps `AgentSession` from coding-agent. Agent is platform-agnostic; it just forwards events to the adapter.
|
||||
|
||||
```typescript
|
||||
import { type AgentSessionEvent } from "@mariozechner/pi-coding-agent";
|
||||
|
||||
interface MomAgent {
|
||||
/**
|
||||
* Handle an incoming message.
|
||||
* Adapter receives events via callback and renders however it wants.
|
||||
*/
|
||||
handleMessage(
|
||||
message: ChannelMessage,
|
||||
context: ChannelContext,
|
||||
onEvent: (event: AgentSessionEvent) => Promise<void>
|
||||
): Promise<{ stopReason: string; errorMessage?: string }>;
|
||||
|
||||
/** Abort the current run for a channel */
|
||||
abort(channelId: string): void;
|
||||
|
||||
/** Check if a channel is currently running */
|
||||
isRunning(channelId: string): boolean;
|
||||
}
|
||||
|
||||
interface ChannelContext {
|
||||
/** Adapter name (for channel path: channels/<adapter>/<channelId>/) */
|
||||
adapter: string;
|
||||
users: UserInfo[];
|
||||
channels: ChannelInfo[];
|
||||
}
|
||||
```
|
||||
|
||||
## Event Handling
|
||||
|
||||
Adapter receives `AgentSessionEvent` and renders however it wants:
|
||||
|
||||
```typescript
|
||||
// Slack adapter example
|
||||
async function handleEvent(event: AgentSessionEvent, ctx: SlackContext) {
|
||||
switch (event.type) {
|
||||
case 'tool_execution_start': {
|
||||
const label = (event.args as any).label || event.toolName;
|
||||
await ctx.updateMain(`_→ ${label}_`);
|
||||
break;
|
||||
}
|
||||
|
||||
case 'tool_execution_end': {
|
||||
// Format tool result for thread
|
||||
const result = extractText(event.result);
|
||||
const formatted = `**${event.toolName}** (${event.durationMs}ms)\n\`\`\`\n${result}\n\`\`\``;
|
||||
await ctx.appendThread(this.toSlackFormat(formatted));
|
||||
break;
|
||||
}
|
||||
|
||||
case 'message_end': {
|
||||
if (event.message.role === 'assistant') {
|
||||
const text = extractAssistantText(event.message);
|
||||
await ctx.replaceMain(this.toSlackFormat(text));
|
||||
await ctx.appendThread(this.toSlackFormat(text));
|
||||
|
||||
// Usage from AssistantMessage
|
||||
if (event.message.usage) {
|
||||
await ctx.appendThread(formatUsage(event.message.usage));
|
||||
}
|
||||
}
|
||||
break;
|
||||
}
|
||||
|
||||
case 'auto_compaction_start':
|
||||
await ctx.updateMain('_Compacting context..._');
|
||||
break;
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Each adapter decides:
|
||||
- Message formatting (markdown → mrkdwn, embeds, etc.)
|
||||
- Message splitting for platform limits
|
||||
- What goes in main message vs thread
|
||||
- How to show tool results, usage, errors
|
||||
|
||||
## Storage Format
|
||||
|
||||
### log.jsonl (Channel History)
|
||||
|
||||
Messages stored as received from platform:
|
||||
|
||||
```jsonl
|
||||
{"id":"1734567890.123456","ts":"2024-12-20T10:00:00.000Z","sender":{"id":"U123","username":"mario","displayName":"Mario Z","isBot":false},"text":"<@U789> what's the weather?","attachments":[],"isMention":true}
|
||||
{"id":"1734567890.234567","ts":"2024-12-20T10:00:05.000Z","sender":{"id":"bot","username":"mom","isBot":true},"text":"The weather is sunny!","attachments":[]}
|
||||
```
|
||||
|
||||
### context.jsonl (LLM Context)
|
||||
|
||||
Same format as current (coding-agent compatible):
|
||||
|
||||
```jsonl
|
||||
{"type":"session","id":"uuid","timestamp":"...","provider":"anthropic","modelId":"claude-sonnet-4-5"}
|
||||
{"type":"message","timestamp":"...","message":{"role":"user","content":"[mario]: what's the weather?"}}
|
||||
{"type":"message","timestamp":"...","message":{"role":"assistant","content":[{"type":"text","text":"The weather is sunny!"}]}}
|
||||
```
|
||||
|
||||
## Directory Structure
|
||||
|
||||
```
|
||||
data/
|
||||
├── config.json # Host only - tokens, adapters, access control
|
||||
└── workspace/ # Mounted as /workspace in Docker
|
||||
├── MEMORY.md
|
||||
├── skills/
|
||||
├── tools/
|
||||
├── events/
|
||||
└── channels/
|
||||
├── slack-acme/
|
||||
│ └── C0A34FL8PMH/
|
||||
│ ├── MEMORY.md
|
||||
│ ├── log.jsonl
|
||||
│ ├── context.jsonl
|
||||
│ ├── attachments/
|
||||
│ ├── skills/
|
||||
│ └── scratch/
|
||||
└── discord-mybot/
|
||||
└── 1234567890123456789/
|
||||
└── ...
|
||||
```
|
||||
|
||||
**config.json** (not mounted, stays on host):
|
||||
|
||||
```json
|
||||
{
|
||||
"adapters": {
|
||||
"slack-acme": {
|
||||
"type": "slack",
|
||||
"botToken": "xoxb-...",
|
||||
"appToken": "xapp-...",
|
||||
"admins": ["U123", "U456"],
|
||||
"dm": "everyone"
|
||||
},
|
||||
"discord-mybot": {
|
||||
"type": "discord",
|
||||
"botToken": "...",
|
||||
"admins": ["123456789"],
|
||||
"dm": "none"
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
**Access control:**
|
||||
- `admins`: User IDs with admin privileges. Can always DM.
|
||||
- `dm`: Who else can DM. `"everyone"`, `"none"`, or `["U789", "U012"]`
|
||||
|
||||
**Channels** are namespaced by adapter name: `channels/<adapter>/<channelId>/`
|
||||
|
||||
**Events** use qualified channelId: `{"channelId": "slack-acme/C123", ...}`
|
||||
|
||||
**Security note:** Mom has bash access to all channel logs in the workspace. If mom is in a private channel, anyone who can talk to mom could potentially access that channel's history. For true isolation, run separate mom instances with separate data directories.
|
||||
|
||||
### Channel Isolation via Bubblewrap (Linux/Docker)
|
||||
|
||||
In Linux-based execution environments (Docker), we can use [bubblewrap](https://github.com/containers/bubblewrap) to enforce per-user channel access at the OS level.
|
||||
|
||||
**How it works:**
|
||||
1. Adapter knows which channels the requesting user has access to
|
||||
2. Before executing bash, wrap command with bwrap
|
||||
3. Mount entire filesystem, then overlay denied channels with empty tmpfs
|
||||
4. Sandboxed process can't see files in denied channels
|
||||
|
||||
```typescript
|
||||
function wrapWithBwrap(command: string, deniedChannels: string[]): string {
|
||||
const args = [
|
||||
'--bind / /', // Mount everything
|
||||
...deniedChannels.map(ch =>
|
||||
`--tmpfs /workspace/channels/${ch}` // Hide denied channels
|
||||
),
|
||||
'--dev /dev',
|
||||
'--proc /proc',
|
||||
'--die-with-parent',
|
||||
];
|
||||
return `bwrap ${args.join(' ')} -- ${command}`;
|
||||
}
|
||||
|
||||
// Usage
|
||||
const userChannels = adapter.getUserChannels(userId); // ["public", "team-a"]
|
||||
const allChannels = await fs.readdir('/workspace/channels/');
|
||||
const denied = allChannels.filter(ch => !userChannels.includes(ch));
|
||||
|
||||
const sandboxedCmd = wrapWithBwrap('cat /workspace/channels/private/log.jsonl', denied);
|
||||
// Results in: "No such file or directory" - private channel hidden
|
||||
```
|
||||
|
||||
**Requirements:**
|
||||
- Docker container needs `--cap-add=SYS_ADMIN` for bwrap to create namespaces
|
||||
- Install in Dockerfile: `apk add bubblewrap`
|
||||
|
||||
**Limitations:**
|
||||
- Linux only (not macOS host mode)
|
||||
- Requires SYS_ADMIN capability in Docker
|
||||
- Per-execution overhead (though minimal)
|
||||
|
||||
## System Prompt Changes
|
||||
|
||||
The system prompt is platform-agnostic. Agent outputs standard markdown, adapter converts.
|
||||
|
||||
```typescript
|
||||
function buildSystemPrompt(
|
||||
workspacePath: string,
|
||||
channelId: string,
|
||||
memory: string,
|
||||
sandbox: SandboxConfig,
|
||||
context: ChannelContext,
|
||||
skills: Skill[]
|
||||
): string {
|
||||
return `You are mom, a chat bot assistant. Be concise. No emojis.
|
||||
|
||||
## Text Formatting
|
||||
Use standard markdown: **bold**, *italic*, \`code\`, \`\`\`block\`\`\`, [text](url)
|
||||
For mentions, use @username format.
|
||||
|
||||
## Users
|
||||
${context.users.map(u => `@${u.username}\t${u.displayName || ''}`).join('\n')}
|
||||
|
||||
## Channels
|
||||
${context.channels.map(c => `#${c.name}`).join('\n')}
|
||||
|
||||
... rest of prompt ...
|
||||
`;
|
||||
}
|
||||
```
|
||||
|
||||
The adapter converts markdown to platform format internally:
|
||||
|
||||
```typescript
|
||||
// Inside SlackAdapter
|
||||
private formatForSlack(markdown: string): string {
|
||||
let text = markdown;
|
||||
|
||||
// Bold: **text** → *text*
|
||||
text = text.replace(/\*\*(.+?)\*\*/g, '*$1*');
|
||||
|
||||
// Links: [text](url) → <url|text>
|
||||
text = text.replace(/\[(.+?)\]\((.+?)\)/g, '<$2|$1>');
|
||||
|
||||
// Mentions: @username → <@U123>
|
||||
text = text.replace(/@(\w+)/g, (match, username) => {
|
||||
const user = this.users.find(u => u.username === username);
|
||||
return user ? `<@${user.id}>` : match;
|
||||
});
|
||||
|
||||
return text;
|
||||
}
|
||||
```
|
||||
```
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
### 1. Agent Tests (with temp Docker container)
|
||||
|
||||
```typescript
|
||||
// test/agent.test.ts
|
||||
import { MomAgent } from '../src/agent.js';
|
||||
import { createTestContainer, destroyTestContainer } from './docker-utils.js';
|
||||
|
||||
describe('MomAgent', () => {
|
||||
let containerName: string;
|
||||
|
||||
beforeAll(async () => {
|
||||
containerName = await createTestContainer();
|
||||
});
|
||||
|
||||
afterAll(async () => {
|
||||
await destroyTestContainer(containerName);
|
||||
});
|
||||
|
||||
it('responds to user message', async () => {
|
||||
const agent = new MomAgent({
|
||||
workDir: tmpDir,
|
||||
sandbox: { type: 'docker', container: containerName }
|
||||
});
|
||||
|
||||
const events: AgentSessionEvent[] = [];
|
||||
|
||||
await agent.handleMessage(
|
||||
{
|
||||
id: '1',
|
||||
channelId: 'test-channel',
|
||||
timestamp: new Date().toISOString(),
|
||||
sender: { id: 'u1', username: 'testuser', isBot: false },
|
||||
text: 'hello',
|
||||
attachments: [],
|
||||
isMention: true,
|
||||
},
|
||||
{ adapter: 'test', users: [], channels: [] },
|
||||
async (event) => { events.push(event); }
|
||||
);
|
||||
|
||||
const messageEnds = events.filter(e => e.type === 'message_end');
|
||||
expect(messageEnds.length).toBeGreaterThan(0);
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
### 2. Adapter Tests (no agent)
|
||||
|
||||
```typescript
|
||||
// test/adapters/slack.test.ts
|
||||
describe('SlackAdapter', () => {
|
||||
it('converts Slack event to ChannelMessage', () => {
|
||||
const slackEvent = {
|
||||
type: 'message',
|
||||
text: 'Hello <@U123>',
|
||||
user: 'U456',
|
||||
channel: 'C789',
|
||||
ts: '1234567890.123456',
|
||||
};
|
||||
|
||||
const message = SlackAdapter.parseEvent(slackEvent, userCache);
|
||||
|
||||
expect(message.text).toBe('Hello @someuser');
|
||||
expect(message.channelId).toBe('C789');
|
||||
expect(message.sender.id).toBe('U456');
|
||||
});
|
||||
|
||||
it('converts markdown to Slack format', () => {
|
||||
const slack = SlackAdapter.toSlackFormat('**bold** and [link](http://example.com)');
|
||||
expect(slack).toBe('*bold* and <http://example.com|link>');
|
||||
});
|
||||
|
||||
it('handles message_end event', async () => {
|
||||
const mockClient = new MockSlackClient();
|
||||
const adapter = new SlackAdapter({ client: mockClient });
|
||||
|
||||
await adapter.handleEvent({
|
||||
type: 'message_end',
|
||||
message: { role: 'assistant', content: [{ type: 'text', text: '**Hello**' }] }
|
||||
}, channelContext);
|
||||
|
||||
// Verify Slack formatting applied
|
||||
expect(mockClient.postMessage).toHaveBeenCalledWith('C123', '*Hello*');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
### 3. Integration Tests
|
||||
|
||||
```typescript
|
||||
// test/integration.test.ts
|
||||
describe('Mom Integration', () => {
|
||||
let containerName: string;
|
||||
|
||||
beforeAll(async () => {
|
||||
containerName = await createTestContainer();
|
||||
});
|
||||
|
||||
afterAll(async () => {
|
||||
await destroyTestContainer(containerName);
|
||||
});
|
||||
|
||||
it('end-to-end with CLI adapter', async () => {
|
||||
const agent = new MomAgent({
|
||||
workDir: tmpDir,
|
||||
sandbox: { type: 'docker', container: containerName }
|
||||
});
|
||||
const adapter = new CLIAdapter({ agent, input: mockStdin, output: mockStdout });
|
||||
|
||||
await adapter.start();
|
||||
mockStdin.emit('data', 'Hello mom\n');
|
||||
|
||||
await waitFor(() => mockStdout.data.length > 0);
|
||||
expect(mockStdout.data).toContain('Hello');
|
||||
});
|
||||
});
|
||||
```
|
||||
|
||||
## Migration Path
|
||||
|
||||
1. **Phase 1: Refactor storage** (non-breaking)
|
||||
- Unify log.jsonl schema (ChannelMessage format)
|
||||
- Add migration for existing Slack-format logs
|
||||
|
||||
2. **Phase 2: Extract adapter interface** (non-breaking)
|
||||
- Create SlackAdapter wrapping current SlackBot
|
||||
- Agent emits events, adapter handles UI
|
||||
|
||||
3. **Phase 3: Decouple agent** (non-breaking)
|
||||
- Remove Slack-specific code from agent.ts
|
||||
- Agent becomes fully platform-agnostic
|
||||
|
||||
4. **Phase 4: Add Discord** (new feature)
|
||||
- Implement DiscordAdapter
|
||||
- Share all storage and agent code
|
||||
|
||||
## Decisions
|
||||
|
||||
1. **Channel ID collision**: Prefix with adapter name (`channels/slack-acme/C123/`).
|
||||
|
||||
2. **Threads**: Adapter decides. Slack uses threads, Discord can use threads or embeds.
|
||||
|
||||
3. **Mentions**: Store as-is from platform. Agent outputs `@username`, adapter converts.
|
||||
|
||||
4. **Rate limiting**: Each adapter handles its own.
|
||||
|
||||
5. **Config**: Single `config.json` with all adapter configs and tokens.
|
||||
|
||||
## File Structure
|
||||
|
||||
```
|
||||
packages/mom/src/
|
||||
├── main.ts # CLI entry point
|
||||
├── agent.ts # MomAgent
|
||||
├── store.ts # ChannelStore
|
||||
├── context.ts # Session management
|
||||
├── sandbox.ts # Sandbox execution
|
||||
├── events.ts # Scheduled events
|
||||
├── log.ts # Console logging
|
||||
│
|
||||
├── adapters/
|
||||
│ ├── types.ts # PlatformAdapter, ChannelMessage interfaces
|
||||
│ ├── slack.ts # SlackAdapter
|
||||
│ ├── discord.ts # DiscordAdapter
|
||||
│ └── cli.ts # CLIAdapter (for testing)
|
||||
│
|
||||
└── tools/
|
||||
├── index.ts
|
||||
├── bash.ts
|
||||
├── read.ts
|
||||
├── write.ts
|
||||
├── edit.ts
|
||||
└── attach.ts
|
||||
```
|
||||
|
||||
## Custom Tools (Host-Side Execution)
|
||||
|
||||
Mom runs bash commands inside a sandbox (Docker container), but sometimes you need tools that run on the host machine (e.g., accessing host APIs, credentials, or services that can't run in the container).
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────────────────────┐
|
||||
│ Host Machine │
|
||||
│ ┌───────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Mom Process (Node.js) │ │
|
||||
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────────┐│ │
|
||||
│ │ │ CustomTool │ │ CustomTool │ │ invoke_tool (AgentTool) ││ │
|
||||
│ │ │ gmail │ │ calendar │ │ - receives tool name + args ││ │
|
||||
│ │ │ (loaded via │ │ (loaded via │ │ - dispatches to custom tool ││ │
|
||||
│ │ │ jiti) │ │ jiti) │ │ - returns result to agent ││ │
|
||||
│ │ └─────────────┘ └─────────────┘ └─────────────────────────────┘│ │
|
||||
│ │ ▲ │ │ │
|
||||
│ │ │ execute() │ invoke_tool() │ │
|
||||
│ │ │ ▼ │ │
|
||||
│ │ ┌───────────────────────────────────────────────────────────────┐│ │
|
||||
│ │ │ MomAgent ││ │
|
||||
│ │ │ - System prompt describes all custom tools ││ │
|
||||
│ │ │ - Has invoke_tool as one of its tools ││ │
|
||||
│ │ │ - Mom calls invoke_tool("gmail", {action: "search", ...}) ││ │
|
||||
│ │ └───────────────────────────────────────────────────────────────┘│ │
|
||||
│ └───────────────────────────────────────────────────────────────────┘ │
|
||||
│ │ │
|
||||
│ │ bash tool (Docker exec) │
|
||||
│ ▼ │
|
||||
│ ┌───────────────────────────────────────────────────────────────────┐ │
|
||||
│ │ Docker Container (Sandbox) │ │
|
||||
│ │ - Mom's bash commands run here │ │
|
||||
│ │ - Isolated from host (except mounted workspace) │ │
|
||||
│ └───────────────────────────────────────────────────────────────────┘ │
|
||||
└─────────────────────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
### Custom Tool Interface
|
||||
|
||||
```typescript
|
||||
// data/tools/gmail/index.ts
|
||||
import type { MomCustomTool, ToolAPI } from "@mariozechner/pi-mom";
|
||||
import { Type } from "@sinclair/typebox";
|
||||
import { StringEnum } from "@mariozechner/pi-ai";
|
||||
|
||||
const tool: MomCustomTool = {
|
||||
name: "gmail",
|
||||
description: "Search, read, and send emails via Gmail",
|
||||
parameters: Type.Object({
|
||||
action: StringEnum(["search", "read", "send"]),
|
||||
query: Type.Optional(Type.String({ description: "Search query" })),
|
||||
messageId: Type.Optional(Type.String({ description: "Message ID to read" })),
|
||||
to: Type.Optional(Type.String({ description: "Recipient email" })),
|
||||
subject: Type.Optional(Type.String({ description: "Email subject" })),
|
||||
body: Type.Optional(Type.String({ description: "Email body" })),
|
||||
}),
|
||||
|
||||
async execute(toolCallId, params, signal) {
|
||||
switch (params.action) {
|
||||
case "search":
|
||||
const results = await searchEmails(params.query);
|
||||
return {
|
||||
content: [{ type: "text", text: formatSearchResults(results) }],
|
||||
details: { count: results.length },
|
||||
};
|
||||
case "read":
|
||||
const email = await readEmail(params.messageId);
|
||||
return {
|
||||
content: [{ type: "text", text: email.body }],
|
||||
details: { from: email.from, subject: email.subject },
|
||||
};
|
||||
case "send":
|
||||
await sendEmail(params.to, params.subject, params.body);
|
||||
return {
|
||||
content: [{ type: "text", text: `Email sent to ${params.to}` }],
|
||||
details: { sent: true },
|
||||
};
|
||||
}
|
||||
},
|
||||
};
|
||||
|
||||
export default tool;
|
||||
```
|
||||
|
||||
### MomCustomTool Type
|
||||
|
||||
```typescript
|
||||
import type { TSchema, Static } from "@sinclair/typebox";
|
||||
|
||||
export interface MomToolResult<TDetails = any> {
|
||||
content: Array<{ type: "text"; text: string } | { type: "image"; data: string; mimeType: string }>;
|
||||
details?: TDetails;
|
||||
}
|
||||
|
||||
export interface MomCustomTool<TParams extends TSchema = TSchema, TDetails = any> {
|
||||
/** Tool name (must be unique) */
|
||||
name: string;
|
||||
|
||||
/** Human-readable description for system prompt */
|
||||
description: string;
|
||||
|
||||
/** TypeBox schema for parameters */
|
||||
parameters: TParams;
|
||||
|
||||
/** Execute the tool */
|
||||
execute: (
|
||||
toolCallId: string,
|
||||
params: Static<TParams>,
|
||||
signal?: AbortSignal,
|
||||
) => Promise<MomToolResult<TDetails>>;
|
||||
|
||||
/** Optional: called when mom starts (for initialization) */
|
||||
onStart?: () => Promise<void>;
|
||||
|
||||
/** Optional: called when mom stops (for cleanup) */
|
||||
onStop?: () => Promise<void>;
|
||||
}
|
||||
|
||||
/** Factory function for tools that need async initialization */
|
||||
export type MomCustomToolFactory = (api: ToolAPI) => MomCustomTool | Promise<MomCustomTool>;
|
||||
|
||||
export interface ToolAPI {
|
||||
/** Path to mom's data directory */
|
||||
dataDir: string;
|
||||
|
||||
/** Execute a command on the host (not in sandbox) */
|
||||
exec: (command: string, args: string[], options?: ExecOptions) => Promise<ExecResult>;
|
||||
|
||||
/** Read a file from the data directory */
|
||||
readFile: (path: string) => Promise<string>;
|
||||
|
||||
/** Write a file to the data directory */
|
||||
writeFile: (path: string, content: string) => Promise<void>;
|
||||
}
|
||||
```
|
||||
|
||||
### Tool Discovery and Loading
|
||||
|
||||
Tools are discovered from:
|
||||
1. `data/tools/**/index.ts` (workspace-local, recursive)
|
||||
2. `~/.pi/mom/tools/**/index.ts` (global, recursive)
|
||||
|
||||
```typescript
|
||||
// loader.ts
|
||||
import { createJiti } from "jiti";
|
||||
|
||||
interface LoadedTool {
|
||||
path: string;
|
||||
tool: MomCustomTool;
|
||||
}
|
||||
|
||||
async function loadCustomTools(dataDir: string): Promise<LoadedTool[]> {
|
||||
const tools: LoadedTool[] = [];
|
||||
const jiti = createJiti(import.meta.url, { alias: getAliases() });
|
||||
|
||||
// Discover tool directories
|
||||
const toolDirs = [
|
||||
path.join(dataDir, "tools"),
|
||||
path.join(os.homedir(), ".pi", "mom", "tools"),
|
||||
];
|
||||
|
||||
for (const dir of toolDirs) {
|
||||
if (!fs.existsSync(dir)) continue;
|
||||
|
||||
for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
|
||||
if (!entry.isDirectory()) continue;
|
||||
|
||||
const indexPath = path.join(dir, entry.name, "index.ts");
|
||||
if (!fs.existsSync(indexPath)) continue;
|
||||
|
||||
try {
|
||||
const module = await jiti.import(indexPath, { default: true });
|
||||
const toolOrFactory = module as MomCustomTool | MomCustomToolFactory;
|
||||
|
||||
const tool = typeof toolOrFactory === "function"
|
||||
? await toolOrFactory(createToolAPI(dataDir))
|
||||
: toolOrFactory;
|
||||
|
||||
tools.push({ path: indexPath, tool });
|
||||
} catch (err) {
|
||||
console.error(`Failed to load tool from ${indexPath}:`, err);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return tools;
|
||||
}
|
||||
```
|
||||
|
||||
### The invoke_tool Agent Tool
|
||||
|
||||
Mom has a single `invoke_tool` tool that dispatches to custom tools:
|
||||
|
||||
```typescript
|
||||
import { Type } from "@sinclair/typebox";
|
||||
|
||||
function createInvokeToolTool(loadedTools: LoadedTool[]): AgentTool {
|
||||
const toolMap = new Map(loadedTools.map(t => [t.tool.name, t.tool]));
|
||||
|
||||
return {
|
||||
name: "invoke_tool",
|
||||
label: "Invoke Tool",
|
||||
description: "Invoke a custom tool running on the host machine",
|
||||
parameters: Type.Object({
|
||||
tool: Type.String({ description: "Name of the tool to invoke" }),
|
||||
args: Type.Any({ description: "Arguments to pass to the tool (tool-specific)" }),
|
||||
}),
|
||||
|
||||
async execute(toolCallId, params, signal) {
|
||||
const tool = toolMap.get(params.tool);
|
||||
if (!tool) {
|
||||
return {
|
||||
content: [{ type: "text", text: `Unknown tool: ${params.tool}` }],
|
||||
details: { error: true },
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
|
||||
try {
|
||||
// Validate args against tool's schema
|
||||
// (TypeBox validation here)
|
||||
|
||||
const result = await tool.execute(toolCallId, params.args, signal);
|
||||
return {
|
||||
content: result.content,
|
||||
details: { tool: params.tool, ...result.details },
|
||||
};
|
||||
} catch (err) {
|
||||
return {
|
||||
content: [{ type: "text", text: `Tool error: ${err.message}` }],
|
||||
details: { error: true, tool: params.tool },
|
||||
isError: true,
|
||||
};
|
||||
}
|
||||
},
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### System Prompt Integration
|
||||
|
||||
Custom tools are described in the system prompt so mom knows what's available:
|
||||
|
||||
```typescript
|
||||
function formatCustomToolsForPrompt(tools: LoadedTool[]): string {
|
||||
if (tools.length === 0) return "";
|
||||
|
||||
let section = `\n## Custom Tools (Host-Side)
|
||||
|
||||
These tools run on the host machine (not in your sandbox). Use the \`invoke_tool\` tool to call them.
|
||||
|
||||
`;
|
||||
|
||||
for (const { tool } of tools) {
|
||||
section += `### ${tool.name}
|
||||
${tool.description}
|
||||
|
||||
**Parameters:**
|
||||
\`\`\`json
|
||||
${JSON.stringify(schemaToSimpleJson(tool.parameters), null, 2)}
|
||||
\`\`\`
|
||||
|
||||
**Example:**
|
||||
\`\`\`
|
||||
invoke_tool(tool: "${tool.name}", args: { ... })
|
||||
\`\`\`
|
||||
|
||||
`;
|
||||
}
|
||||
|
||||
return section;
|
||||
}
|
||||
|
||||
// Convert TypeBox schema to simple JSON for display
|
||||
function schemaToSimpleJson(schema: TSchema): object {
|
||||
// Simplified schema representation for the LLM
|
||||
// ...
|
||||
}
|
||||
```
|
||||
|
||||
### Example: Gmail Tool
|
||||
|
||||
```typescript
|
||||
// data/tools/gmail/index.ts
|
||||
import type { MomCustomTool, ToolAPI } from "@mariozechner/pi-mom";
|
||||
import { Type } from "@sinclair/typebox";
|
||||
import { StringEnum } from "@mariozechner/pi-ai";
|
||||
import Imap from "imap";
|
||||
import nodemailer from "nodemailer";
|
||||
|
||||
export default async function(api: ToolAPI): Promise<MomCustomTool> {
|
||||
// Load credentials from data directory
|
||||
const credsPath = path.join(api.dataDir, "tools", "gmail", "credentials.json");
|
||||
const creds = JSON.parse(await api.readFile(credsPath));
|
||||
|
||||
return {
|
||||
name: "gmail",
|
||||
description: "Search, read, and send emails via Gmail. Requires credentials.json in the tool directory.",
|
||||
parameters: Type.Object({
|
||||
action: StringEnum(["search", "read", "send", "list"]),
|
||||
// ... other params
|
||||
}),
|
||||
|
||||
async execute(toolCallId, params, signal) {
|
||||
// Implementation using imap/nodemailer
|
||||
},
|
||||
};
|
||||
}
|
||||
```
|
||||
|
||||
### Security Considerations
|
||||
|
||||
1. **Tools run on host**: Custom tools have full host access. Only install trusted tools.
|
||||
2. **Credential storage**: Tools should store credentials in the data directory, not in code.
|
||||
3. **Sandbox separation**: The sandbox (Docker) can't access host tools directly. Only mom's invoke_tool can call them.
|
||||
|
||||
### Loading
|
||||
|
||||
Tools are loaded via jiti. They can import any 3rd party dependencies (install in the tool directory). Imports of `@mariozechner/pi-ai` and `@mariozechner/pi-mom` are aliased to the running mom bundle.
|
||||
|
||||
**Live reload**: In dev mode, tools are watched and reloaded on change. No restart needed.
|
||||
|
||||
## Events System
|
||||
|
||||
Scheduled wake-ups via JSON files in `workspace/events/`.
|
||||
|
||||
### Format
|
||||
|
||||
```json
|
||||
{"type": "one-shot", "channelId": "slack-acme/C123ABC", "text": "Reminder", "at": "2025-12-15T09:00:00+01:00"}
|
||||
```
|
||||
|
||||
Channel ID is qualified with adapter name so the event watcher knows which adapter to use.
|
||||
|
||||
### Running
|
||||
|
||||
```bash
|
||||
mom ./data
|
||||
```
|
||||
|
||||
Reads `config.json`, starts all adapters defined there.
|
||||
|
||||
The shared workspace allows:
|
||||
- Shared MEMORY.md (global knowledge)
|
||||
- Shared skills
|
||||
- Events can target any platform
|
||||
- Per-channel data is still isolated by channel ID
|
||||
|
||||
## Summary
|
||||
|
||||
The key insight is **separation of concerns**:
|
||||
|
||||
1. **Storage**: Unified schema, messages stored as-is from platform
|
||||
2. **Agent**: Doesn't know about Slack/Discord, just processes messages and emits events
|
||||
3. **Adapters**: Handle platform-specific connection, formatting, and message splitting
|
||||
4. **Progress Rendering**: Each adapter decides how to display tool progress and results
|
||||
|
||||
This allows:
|
||||
- Testing agent without any platform
|
||||
- Testing adapters without agent
|
||||
- Adding new platforms by implementing `PlatformAdapter`
|
||||
- Sharing all storage, context management, and agent logic
|
||||
- Rich UI on platforms that support it (embeds, buttons)
|
||||
- Graceful degradation on simpler platforms (plain text)
|
||||
Loading…
Add table
Add a link
Reference in a new issue