# Moltbot Architecture Documentation

## Table of Contents

1. [Overview](#overview)
2. [High-Level Architecture](#high-level-architecture)
3. [Core Components](#core-components)
4. [Data Flow](#data-flow)
5. [Message Routing](#message-routing)
6. [Agent Execution Model](#agent-execution-model)
7. [Plugin Architecture](#plugin-architecture)
8. [Security Model](#security-model)
9. [Storage & Persistence](#storage--persistence)
10. [Network Architecture](#network-architecture)
11. [Scalability Considerations](#scalability-considerations)
12. [Reliability & Fault Tolerance](#reliability--fault-tolerance)
13. [Future Architecture Considerations](#future-architecture-considerations)

---

## Overview

Moltbot is designed as a **local-first, extensible AI assistant platform** with a microkernel-inspired architecture. The system consists of a central Gateway (control plane) and modular components (channels, tools, plugins) that communicate through well-defined interfaces.

### Design Principles

1. **Local-First**: Everything runs on the user's devices; no central server dependency
2. **Privacy-Focused**: User data never leaves the user's control
3. **Extensible**: Plugin system for adding channels, tools, and functionality
4. **Multi-Platform**: Works on macOS, iOS, Android, Linux, Windows (WSL2)
5. **Resilient**: Graceful degradation; continues working when components fail
6. **Developer-Friendly**: Clear APIs, TypeScript types, comprehensive testing

---

## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│                         User Interfaces                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌─────────────────┐  │
│  │  macOS   │  │   iOS    │  │ Android  │  │  CLI / Web UI   │  │
│  │   App    │  │   App    │  │   App    │  │                 │  │
│  └──────────┘  └──────────┘  └──────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ↓ WebSocket / HTTP
┌─────────────────────────────────────────────────────────────────┐
│                         Gateway Server                          │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                   WebSocket Control Plane                   │ │
│ │  • Session Management                                       │ │
│ │  • Message Routing                                          │ │
│ │  • Model Catalog                                            │ │
│ │  • Discovery Service (mDNS)                                 │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                │                                │
│       ┌─────────────┬──────────┴───┬─────────────────┐          │
│       │             │              │                 │          │
│       ▼             ▼              ▼                 ▼          │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌─────────────────┐ │
│  │   Chat   │  │  Model   │  │  Browser  │  │ Cron Scheduler  │ │
│  │ Registry │  │ Catalog  │  │   Tool    │  │                 │ │
│  └──────────┘  └──────────┘  └───────────┘  └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ↓
┌─────────────────────────────────────────────────────────────────┐
│                       Agent Runtime Layer                       │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                     Pi Agent (RPC Mode)                     │ │
│ │  • Tool Streaming                                           │ │
│ │  • Block Streaming                                          │ │
│ │  • Model Failover                                           │ │
│ │  • Auth Profile Rotation                                    │ │
│ │  • Session Context Management                               │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                │                                │
│       ┌─────────────┬──────────┴───┬─────────────────┐          │
│       │             │              │                 │          │
│       ▼             ▼              ▼                 ▼          │
│  ┌──────────┐  ┌──────────┐  ┌───────────┐  ┌─────────────────┐ │
│  │   Tool   │  │ Session  │  │  Memory   │  │     Sandbox     │ │
│  │ Registry │  │ Manager  │  │   Store   │  │   (Optional)    │ │
│  └──────────┘  └──────────┘  └───────────┘  └─────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ↓
┌─────────────────────────────────────────────────────────────────┐
│                 Channel Layer (Multi-Protocol)                  │
│  ┌──────────┬──────────┬──────────┬──────────┬──────────────┐   │
│  │ WhatsApp │ Telegram │  Signal  │ Discord  │    Slack     │   │
│  └──────────┴──────────┴──────────┴──────────┴──────────────┘   │
│  ┌──────────┬──────────┬──────────┬──────────┬──────────────┐   │
│  │ iMessage │  Teams   │  Matrix  │   Zalo   │ BlueBubbles  │   │
│  └──────────┴──────────┴──────────┴──────────┴──────────────┘   │
│  ┌──────────┬──────────┬──────────┬──────────┬──────────────┐   │
│  │  Twitch  │Mattermost│   LINE   │  Nostr   │  Voice Call  │   │
│  └──────────┴──────────┴──────────┴──────────┴──────────────┘   │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ↓
┌─────────────────────────────────────────────────────────────────┐
│                        External Services                        │
│  ┌──────────┬──────────┬──────────┬──────────┬──────────────┐   │
│  │ Anthropic│  OpenAI  │   AWS    │  Ollama  │    Google    │   │
│  │  Claude  │   GPT    │ Bedrock  │ (Local)  │    Gemini    │   │
│  └──────────┴──────────┴──────────┴──────────┴──────────────┘   │
└─────────────────────────────────────────────────────────────────┘
```

---

## Core Components

### 1. Gateway Server

**Responsibility**: Central control plane for all communication and orchestration.

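
To make the control-plane role a little more concrete, here is a minimal sketch of a registry that maps chat IDs to handlers and routes inbound messages to them. It is an illustration only: `MiniGateway`, `ChatHandler`, and the two-field `InboundMessage` shape are hypothetical simplifications, not the gateway's actual types or implementation.

```typescript
// Hypothetical minimal types; the real gateway types are richer.
interface InboundMessage {
  chatId: string;
  text: string;
}

type ChatHandler = (message: InboundMessage) => Promise<string>;

class MiniGateway {
  // Active chats, keyed by chat ID
  private chats = new Map<string, ChatHandler>();

  registerChat(chatId: string, handler: ChatHandler): void {
    this.chats.set(chatId, handler);
  }

  unregisterChat(chatId: string): void {
    this.chats.delete(chatId);
  }

  // Dispatch to the registered chat, or fail loudly when none exists.
  async routeMessage(message: InboundMessage): Promise<string> {
    const handler = this.chats.get(message.chatId);
    if (!handler) {
      throw new Error(`No chat registered for ${message.chatId}`);
    }
    return handler(message);
  }
}

// Usage: register an echo chat and route one message to it.
const gateway = new MiniGateway();
gateway.registerChat("main", async (m) => `echo: ${m.text}`);
gateway
  .routeMessage({ chatId: "main", text: "ping" })
  .then((reply) => console.log(reply)); // logs "echo: ping"
```

Keeping the registry and the routing decision in one place is what lets channels and agents stay unaware of each other; everything else in this sketch is an assumption made for brevity.
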
**Key Modules**:

- `server.ts` - Main WebSocket server
- `chat-registry.ts` - Active chat session registry
- `model-catalog.ts` - Available AI models catalog
- `browser-tool.ts` - Shared browser automation tool
- `discovery.ts` - mDNS service discovery

**APIs**:

```typescript
interface GatewayServer {
  start(): Promise<void>;
  stop(): Promise<void>;
  registerChat(chat: ChatSession): void;
  unregisterChat(chatId: string): void;
  routeMessage(message: InboundMessage): Promise<void>;
  getModelCatalog(): ModelCatalog;
  getBrowserTool(): BrowserTool;
}
```

**Communication**:

- WebSocket (ws://) for real-time bidirectional communication
- HTTP REST API for management operations
- mDNS for local network discovery

### 2. Agent Runtime

**Responsibility**: Execute AI models and manage agent sessions.

**Key Modules**:

- `pi-embedded-runner.ts` - Pi Agent executor
- `session-manager.ts` - Session lifecycle management
- `tool-registry.ts` - Available tools catalog
- `model-auth.ts` - Model authentication and failover

**Execution Flow**:

```typescript
interface AgentRuntime {
  createSession(config: SessionConfig): AgentSession;
  executeMessage(session: AgentSession, message: string): AsyncGenerator<ToolStreamEvent>;
  executeTools(session: AgentSession, tools: ToolCall[]): Promise<ToolResult[]>;
  streamResponse(session: AgentSession): AsyncGenerator<string>;
}
```

**Session Management**:

- **Main Session**: Primary agent for user interactions
- **Group Sessions**: Isolated agents for group chats
- **Sub-Sessions**: Spawned agents for specific tasks
- **Queue Modes**: Serial (one at a time) vs Concurrent (parallel)

### 3. Channel Layer

**Responsibility**: Handle messaging protocol specifics and normalize messages.

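
As a rough illustration of what "normalize" means here, the sketch below maps two protocol-specific payloads onto one common shape. The raw payload types (`RawChatA`, `RawChatB`) and their field names are hypothetical and do not correspond to any real channel SDK, and `SimpleMessage` is a deliberately reduced stand-in for the normalized message format, not the actual type.

```typescript
// Simplified common message shape (assumption: a reduced stand-in for the real format).
interface SimpleMessage {
  id: string;
  channel: string;
  from: string;
  text: string;
  timestamp: number; // milliseconds since the Unix epoch
}

// Hypothetical raw payloads; real channel SDKs will differ.
interface RawChatA {
  messageId: number;
  sender: { username: string };
  body: string;
  sentAtSeconds: number;
}

interface RawChatB {
  ts: string; // seconds since the epoch, encoded as a string
  user: string;
  text: string;
}

function normalizeChatA(raw: RawChatA): SimpleMessage {
  return {
    id: String(raw.messageId),
    channel: "chat-a",
    from: raw.sender.username,
    text: raw.body,
    timestamp: raw.sentAtSeconds * 1000,
  };
}

function normalizeChatB(raw: RawChatB): SimpleMessage {
  return {
    id: raw.ts,
    channel: "chat-b",
    from: raw.user,
    text: raw.text,
    timestamp: Math.round(parseFloat(raw.ts) * 1000),
  };
}

// Both payloads end up in the same shape, so downstream routing can ignore the protocol.
const fromA = normalizeChatA({
  messageId: 42,
  sender: { username: "alice" },
  body: "hi",
  sentAtSeconds: 1700000000,
});
const fromB = normalizeChatB({ ts: "1700000000.5", user: "bob", text: "hello" });
console.log(fromA, fromB);
```

Each channel owns its own normalizer, so adding a protocol means adding one function rather than touching the router.
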
**Channel Interface**:

```typescript
interface Channel {
  name: string;
  type: ChannelType;

  // Lifecycle
  init(config: ChannelConfig): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;

  // Messaging
  send(message: OutboundMessage): Promise<void>;
  receive(handler: MessageHandler): void;

  // Status
  getStatus(): ChannelStatus;
  getConnections(): Connection[];
}
```

**Message Normalization**:

```typescript
interface NormalizedMessage {
  id: string;
  channel: string;
  from: string;
  to: string;
  text: string;
  media?: MediaAttachment[];
  replyTo?: string;
  timestamp: number;
  metadata: Record<string, unknown>;
}
```

**Channel Types**:

- **Built-in**: WhatsApp, Telegram, Signal, Discord, Slack, iMessage
- **Extensions**: Teams, Matrix, Zalo, BlueBubbles, Twitch, etc.

### 4. Tool System

**Responsibility**: Provide capabilities to agents beyond text generation.

**Tool Interface**:

```typescript
interface Tool {
  name: string;
  description: string;
  schema: JSONSchema;
  execute(params: unknown, context: ToolContext): Promise<ToolResult>;
}
```

**Built-in Tools**:

- **Browser Tool**: Web automation, screenshots, form filling
- **Canvas Tool**: Visual workspace with A2UI
- **Image Tool**: Image analysis and generation
- **Message Tool**: Send messages across channels
- **Session Tool**: Spawn sub-agents
- **Bash Tool**: Execute shell commands
- **Web Tool**: Fetch URLs, web search, readability
- **Memory Tool**: Store and retrieve context
- **Cron Tool**: Schedule tasks

**Tool Execution Pipeline**:

```
User Message → Agent → Tool Call → Tool Execute → Tool Result → Agent → Response
```

### 5. Plugin System

**Responsibility**: Enable extensibility through modular plugins.

**Plugin Types**:

1. **Channel Plugins**: Add new messaging integrations
2. **Tool Plugins**: Add custom agent tools
3. **Auth Plugins**: Add OAuth handlers
4. **Memory Plugins**: Add storage backends
5. **Hook Plugins**: Add global event handlers

**Plugin Lifecycle**:

```typescript
interface Plugin {
  manifest: PluginManifest;

  // Lifecycle
  init(services: PluginServices): Promise<void>;
  start(): Promise<void>;
  stop(): Promise<void>;

  // Hooks
  hooks?: Partial<PluginHooks>;
}
```

**Plugin Discovery**:

- Auto-detect from `extensions/` directory
- Auto-detect from `node_modules/@moltbot/` packages
- Validate manifest and config schema
- Load and initialize plugins

---

## Data Flow

### Inbound Message Flow

```
External Channel (WhatsApp, Telegram, etc.)
    ↓
Channel Handler
    ↓
Message Normalization
    ↓
Allowlist Check
    ↓
Gateway Router
    ↓
Session Selection
    ↓
Agent Execution
    ↓
Tool Streaming
    ↓
Response Generation
    ↓
Channel Formatting
    ↓
Outbound Message
    ↓
External Channel
```

### Detailed Flow Steps

1. **Channel Receives Message**
   - Raw message from external protocol
   - Protocol-specific handling (e.g., Baileys for WhatsApp)

2. **Message Normalization**
   - Convert to `NormalizedMessage` format
   - Extract text, media, reply context
   - Add metadata

3. **Allowlist Check**
   - Verify sender is allowed
   - Check pairing status for DMs
   - Validate group membership

4. **Gateway Routing**
   - Determine target session
   - Check activation mode (mention, reply, always-on)
   - Queue or execute immediately

5. **Agent Execution**
   - Load session context
   - Execute model with tools
   - Stream tool calls and responses

6. **Tool Execution**
   - Execute tools as needed
   - Stream tool results back to agent
   - Handle tool errors gracefully

7. **Response Generation**
   - Agent generates response
   - Apply thinking mode (low/high/always)
   - Format for target channel

8. **Channel Formatting**
   - Convert to channel-specific format
   - Chunk long messages if needed
   - Add reactions/typing indicators

9. **Outbound Message**
   - Send via channel handler
   - Track delivery status
   - Handle errors and retries

---

## Message Routing

### Routing Rules

```typescript
interface RoutingConfig {
  // Main session routing
  mainSession: {
    activationMode: 'always' | 'mention' | 'reply' | 'dm-only';
    replyBack: boolean;
    allowedChannels: string[];
  };

  // Group session routing
  groupSessions: {
    enabled: boolean;
    isolateByGroup: boolean;
    activationMode: 'mention' | 'reply';
  };

  // Cross-channel routing
  crossChannel: {
    enabled: boolean;
    routes: ChannelRoute[];
  };
}
```

### Activation Modes

1. **Always-On**: Every message triggers the agent
2. **Mention**: Only messages that mention the bot trigger the agent
3. **Reply**: Only replies to the bot's messages trigger the agent
4. **DM-Only**: Only direct messages trigger the agent

### Session Isolation

- **Main Session**: Primary agent for user
- **Group Session**: Separate agent per group chat
- **Workspace Session**: Separate agent per workspace

---

## Agent Execution Model

### Session Context

```typescript
interface SessionContext {
  sessionId: string;
  chatId: string;
  channel: string;

  // Context window
  messages: Message[];
  maxTokens: number;

  // Configuration
  model: string;
  temperature: number;
  thinkingMode: 'low' | 'high' | 'always';

  // Tools
  availableTools: Tool[];
  toolResults: ToolResult[];

  // Memory
  memory: MemoryStore;
  skills: Skill[];
}
```

### Execution Pipeline

```
Message → Load Context → Execute Model → Stream Tools → Generate Response → Save Context
```

### Tool Streaming

```typescript
interface ToolStreamEvent {
  type: 'tool_call' | 'tool_result' | 'text_chunk' | 'thinking';
  data: unknown;
}

async function* streamAgentExecution(
  session: SessionContext,
  message: string
): AsyncGenerator<ToolStreamEvent> {
  // Stream tool calls and results in real-time
  yield* piAgent.execute(session, message);
}
```

### Model Failover

```typescript
interface ModelFailoverConfig {
  primary: ModelConfig;
  fallbacks: ModelConfig[];
  retryAttempts: number;
  retryDelay: number;
}

async function executeWithFailover(
  config: ModelFailoverConfig,
  request: ModelRequest
): Promise<ModelResponse> {
  // Try primary model
  try {
    return await execute(config.primary, request);
  } catch (error) {
    // Try fallback models in order
    for (const fallback of config.fallbacks) {
      try {
        return await execute(fallback, request);
      } catch (fallbackError) {
        // This fallback failed too; move on to the next one
        continue;
      }
    }
    // Every fallback failed: rethrow the primary model's error
    throw error;
  }
}
```

---

## Plugin Architecture

### Plugin System Design

```
┌──────────────────────────────────────┐
│           Plugin Discovery           │
│  • extensions/                       │
│  • node_modules/@moltbot/            │
└──────────────────────────────────────┘
                   │
                   ↓
┌──────────────────────────────────────┐
│            Plugin Loader             │
│  • Validate manifest                 │
│  • Load config schema                │
│  • Initialize plugin                 │
└──────────────────────────────────────┘
                   │
                   ↓
┌──────────────────────────────────────┐
│            Plugin Runtime            │
│  • Register hooks                    │
│  • Inject services                   │
│  • Execute plugin logic              │
└──────────────────────────────────────┘
```

### Plugin Manifest

```json
{
  "name": "my-plugin",
  "version": "1.0.0",
  "type": "channel",
  "exports": {
    ".": "./dist/index.js",
    "./config-schema": "./dist/config-schema.js"
  },
  "hooks": ["channel", "message:received", "message:sent"],
  "permissions": ["network", "fs", "env"],
  "dependencies": {
    "some-lib": "^1.0.0"
  }
}
```

### Plugin Services

```typescript
interface PluginServices {
  gateway: GatewayClient;
  config: ConfigProvider;
  logger: Logger;
  http: HttpClient;
  storage: StorageProvider;
}
```

### Plugin Hooks

```typescript
interface PluginHooks {
  'plugin:init': (plugin: Plugin) => Promise<void>;
  'plugin:start': (plugin: Plugin) => Promise<void>;
  'plugin:stop': (plugin: Plugin) => Promise<void>;
  'message:received': (message: NormalizedMessage) => Promise<void>;
  'message:sent': (message: NormalizedMessage) => Promise<void>;
  'agent:before': (session: SessionContext) => Promise<void>;
  'agent:after': (session: SessionContext, response: string) => Promise<void>;
  'tool:before': (tool: Tool, params: unknown) => Promise<void>;
  'tool:after': (tool: Tool, result: ToolResult) => Promise<void>;
}
```

---

## Security Model

### Authentication

1. **Pairing System**
   - Code-based pairing for DMs
   - One-time pairing codes
   - Allowlist management

2. **Channel Authentication**
   - OAuth tokens (Discord, Slack, etc.)
   - Bot tokens (Telegram)
   - Session persistence (WhatsApp, Signal)

3. **Model Authentication**
   - API keys (Anthropic, OpenAI)
   - OAuth (Google, AWS)
   - Profile rotation on failure

### Authorization

1. **Allowlists**
   - Per-channel allowlists
   - Phone numbers for WhatsApp
   - User IDs for Telegram
   - Handle-based for Discord

2. **Group Policies**
   - Mention-only mode
   - Reply-only mode
   - Admin-only commands

3. **Tool Permissions**
   - Per-tool enable/disable
   - Execution timeouts
   - Resource limits

### Privacy

1. **Local-First**
   - All data stored locally
   - No external server dependency
   - User controls all data

2. **Session Isolation**
   - Separate sessions per chat
   - No cross-session data leaks
   - Clear session boundaries

3. **Media Handling**
   - Temporary storage with lifecycle
   - Automatic cleanup
   - Optional media redaction

---

## Storage & Persistence

### File System Layout

```
~/.clawdbot/
├── config/
│   ├── .moltbot.yaml         # Main config
│   ├── models.json           # Model profiles
│   └── skills/               # Skill configs
├── credentials/
│   ├── anthropic.json        # OAuth tokens
│   ├── openai.json
│   └── channels/             # Channel credentials
├── sessions/
│   ├── main/                 # Main session data
│   ├── groups/               # Group sessions
│   └── agents/               # Agent-specific data
├── media/
│   ├── audio/                # Audio files
│   ├── images/               # Image files
│   └── videos/               # Video files
├── memory/
│   ├── vectors/              # Vector embeddings
│   └── cache/                # Cached data
└── logs/
    ├── gateway.log           # Gateway logs
    ├── agent.log             # Agent logs
    └── channels/             # Per-channel logs
```

### Session Storage

```typescript
interface SessionStore {
  // Session CRUD
  create(session: SessionContext): Promise<void>;
  read(sessionId: string): Promise<SessionContext | null>;
  update(session: SessionContext): Promise<void>;
  delete(sessionId: string): Promise<void>;

  // Message history
  addMessage(sessionId: string, message: Message): Promise<void>;
  getMessages(sessionId: string, limit: number): Promise<Message[]>;

  // Context management
  getContext(sessionId: string): Promise<SessionContext | null>;
  saveContext(sessionId: string, context: SessionContext): Promise<void>;
}
```

### Memory Store

```typescript
interface MemoryStore {
  // Vector storage
  addEmbedding(key: string, embedding: number[]): Promise<void>;
  searchEmbeddings(query: number[], k: number): Promise<unknown[]>;

  // Key-value storage
  set(key: string, value: unknown): Promise<void>;
  get(key: string): Promise<unknown>;
  delete(key: string): Promise<void>;

  // Context retrieval
  recall(query: string): Promise<unknown[]>;
}
```

---

## Network Architecture

### Local Mode

```
┌───────────────────────┐
│      User Device      │
│  ┌─────────────────┐  │
│  │ Gateway Server  │  │
│  │   (localhost)   │  │
│  └─────────────────┘  │
│           ↓           │
│  ┌─────────────────┐  │
│  │  Agent Runtime  │  │
│  └─────────────────┘  │
│           ↓           │
│  ┌─────────────────┐  │
│  │    Channels     │  │
│  └─────────────────┘  │
└───────────────────────┘
            ↓
    External Services
   (Anthropic, OpenAI)
```

### Remote Gateway Mode

```
┌───────────────────────┐        ┌───────────────────────┐
│     Client Device     │        │     Remote Server     │
│  ┌─────────────────┐  │   WS   │  ┌─────────────────┐  │
│  │ Gateway Client  │──┼────────┼──│ Gateway Server  │  │
│  │  (Control UI)   │  │        │  │   (Headless)    │  │
│  └─────────────────┘  │        │  └─────────────────┘  │
└───────────────────────┘        │           ↓           │
                                 │  ┌─────────────────┐  │
                                 │  │  Agent Runtime  │  │
                                 │  └─────────────────┘  │
                                 │           ↓           │
                                 │  ┌─────────────────┐  │
                                 │  │    Channels     │  │
                                 │  └─────────────────┘  │
                                 └───────────────────────┘
                                             ↓
                                     External Services
```

### Service Discovery

```typescript
interface DiscoveryService {
  // mDNS advertisement
  advertise(service: ServiceInfo): Promise<void>;
  stopAdvertise(): Promise<void>;

  // Discovery
  discover(serviceType: string): Promise<ServiceInfo[]>;
  watch(serviceType: string, handler: ServiceHandler): void;
}
```

---

## Scalability Considerations

### Vertical Scaling

- Single-user design (no multi-tenancy)
- Resource limits per tool
- Session cleanup policies
- Media file size limits

### Horizontal Scaling

- Not applicable (personal assistant)
- Remote Gateway for headless servers
- Multiple client devices connecting to a single gateway

### Performance Optimizations

- Message chunking for large responses
- Lazy loading of plugins
- Caching of model responses
- Connection pooling for channels
- Parallel tool execution (where safe)

---

## Reliability & Fault Tolerance

### Error Handling

- Graceful degradation when components fail
- Model failover on API errors
- Channel reconnection on disconnect
- Tool timeout handling

### Monitoring

- Gateway status endpoint
- Channel connection status
- Agent execution metrics
- Tool execution tracing

### Logging

- Structured logging with tslog
- Per-component log levels
- Configurable log output (file, console)
- Log rotation and retention

---

## Future Architecture Considerations

### Planned Enhancements

- Distributed tracing
- Performance profiling
- Advanced memory systems (RAG, embeddings)
- Multi-agent coordination
- Enhanced sandbox isolation

### Extensibility Points

- Custom model providers
- Custom channel protocols
- Custom tool implementations
- Custom memory backends
- Custom authentication providers

---

**End of Architecture Documentation**