openclaw/docs/concepts/architecture.md
Davendra Patel 2e3e12f38b docs: add pre-rendered diagram PNGs and update AGENTS.md with architecture overview
Add 32 rendered PNG diagram images alongside existing Mermaid source
blocks (wrapped in collapsible details) across documentation pages.
Update AGENTS.md with architecture overview section and single-test
command. Update README hero banner to use rendered diagram.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 16:02:09 +05:30

summary: WebSocket gateway architecture, components, and client flows
read_when: Working on gateway protocol, clients, or transports

Gateway architecture

Last updated: 2026-01-22

Overview

  • A single long-lived Gateway owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
  • Control-plane clients (macOS app, CLI, web UI, automations) connect to the Gateway over WebSocket on the configured bind host (default 127.0.0.1:18789).
  • Nodes (macOS/iOS/Android/headless) also connect over WebSocket, but declare role: node with explicit caps/commands.
  • One Gateway per host; it is the only place that opens a WhatsApp session.
  • A canvas host (default port 18793) serves agent-editable HTML and A2UI.

Components and flows

Architecture Overview

Diagram source (Mermaid)
graph TD
    subgraph Channels
        WA[WhatsApp/Baileys]
        TG[Telegram/grammY]
        SL[Slack]
        DC[Discord]
        SIG[Signal]
        IM[iMessage]
        WC[WebChat]
    end

    subgraph Clients
        CLI[CLI]
        MAC[macOS App]
        WEB[Web UI]
        AUTO[Automations]
    end

    subgraph Nodes
        MACOS_N[macOS Node]
        IOS_N[iOS Node]
        AND_N[Android Node]
        HEAD_N[Headless Node]
    end

    GW[Gateway\nWebSocket Server\n127.0.0.1:18789]

    WA & TG & SL & DC & SIG & IM & WC --> GW
    CLI & MAC & WEB & AUTO -->|WS: role=operator| GW
    MACOS_N & IOS_N & AND_N & HEAD_N -->|WS: role=node| GW

    GW --> AGENT[Agent Runtime\npi-agent-core]
    AGENT --> TOOLS[Tools\nbrowser, exec, canvas,\nmessage, cron, nodes]

    CANVAS[Canvas Host\n:18793]
    GW -.-> CANVAS

Gateway (daemon)

  • Maintains provider connections.
  • Exposes a typed WS API (requests, responses, server-push events).
  • Validates inbound frames against JSON Schema.
  • Emits events like agent, chat, presence, health, heartbeat, cron.

Clients (mac app / CLI / web admin)

  • One WS connection per client.
  • Send requests (health, status, send, agent, system-presence).
  • Subscribe to events (tick, agent, presence, shutdown).

Nodes (macOS / iOS / Android / headless)

  • Connect to the same WS server with role: node.
  • Provide a device identity in connect; pairing is device-based (role node) and approval lives in the device pairing store.
  • Expose commands like canvas.*, camera.*, screen.record, location.get.

Protocol details:

WebChat

  • Static UI that uses the Gateway WS API for chat history and sends.
  • In remote setups, connects through the same SSH/Tailscale tunnel as other clients.

Connection lifecycle (single client)

Client                    Gateway
  |                          |
  |---- req:connect -------->|
  |<------ res (ok) ---------|   (or res error + close)
  |   (payload=hello-ok carries snapshot: presence + health)
  |                          |
  |<------ event:presence ---|
  |<------ event:tick -------|
  |                          |
  |------- req:agent ------->|
  |<------ res:agent --------|   (ack: {runId,status:"accepted"})
  |<------ event:agent ------|   (streaming)
  |<------ res:agent --------|   (final: {runId,status,summary})
  |                          |
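
The agent exchange above returns two res frames for one request: an ack, then a final result. The following is a minimal client-side sketch of tracking that, assuming both responses carry the id of the original request. AgentRunTracker, RunState, and the field types are hypothetical names for illustration, not part of the Gateway codebase.

```typescript
// Response frame shape taken from the wire protocol summary; field types are assumptions.
type ResFrame = { type: "res"; id: string; ok: boolean; payload?: unknown; error?: unknown };

// Payload fields shown in the lifecycle diagram (ack and final).
type AgentPayload = { runId: string; status: string; summary?: string };

type RunState =
  | { phase: "pending" }
  | { phase: "accepted"; runId: string }
  | { phase: "done"; runId: string; status: string; summary?: string }
  | { phase: "failed"; error: unknown };

// Hypothetical helper: maps request ids to the state of their agent run.
class AgentRunTracker {
  private runs = new Map<string, RunState>();

  // Call right after sending req:agent with this request id.
  begin(requestId: string): void {
    this.runs.set(requestId, { phase: "pending" });
  }

  // Feed every res frame here; returns the new state, or undefined for unknown ids.
  onResponse(frame: ResFrame): RunState | undefined {
    if (!this.runs.has(frame.id)) return undefined;
    let next: RunState;
    if (!frame.ok) {
      next = { phase: "failed", error: frame.error };
    } else {
      const p = frame.payload as AgentPayload;
      next =
        p.status === "accepted"
          ? { phase: "accepted", runId: p.runId } // first res: ack
          : { phase: "done", runId: p.runId, status: p.status, summary: p.summary }; // second res: final
    }
    this.runs.set(frame.id, next);
    return next;
  }
}
```

Streaming event:agent frames would be handled separately between the two responses; this sketch only covers the request/response pairing.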

Wire protocol (summary)

  • Transport: WebSocket, text frames with JSON payloads.
  • First frame must be connect.
  • After handshake:
    • Requests: {type:"req", id, method, params} → {type:"res", id, ok, payload|error}
    • Events: {type:"event", event, payload, seq?, stateVersion?}
  • If CLAWDBOT_GATEWAY_TOKEN (or --token) is set, connect.params.auth.token must match or the socket closes.
  • Idempotency keys are required for side-effecting methods (send, agent) to safely retry; the server keeps a short-lived dedupe cache.
  • Nodes must include role: "node" plus caps/commands/permissions in connect.
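
To illustrate how idempotency keys make retries safe, here is a minimal sketch of a short-lived dedupe cache. It is not the Gateway's implementation: DedupeCache, its TTL parameter, and the injectable clock are hypothetical, and the real cache's lifetime and eviction policy are not documented here.

```typescript
// Hypothetical sketch: remembers the result of a side-effecting call per idempotency key.
class DedupeCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // ttlMs and the clock are assumptions; the clock is injectable so tests can control time.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Runs `effect` at most once per key within the TTL; retries get the cached result.
  run(key: string, effect: () => T): T {
    const t = this.now();
    // Evict expired entries so the cache stays short-lived.
    this.entries.forEach((entry, k) => {
      if (entry.expiresAt <= t) this.entries.delete(k);
    });
    const hit = this.entries.get(key);
    if (hit) return hit.value;
    const value = effect();
    this.entries.set(key, { value, expiresAt: t + this.ttlMs });
    return value;
  }
}
```

A client that times out on send can resend the same request with the same key; within the TTL the message is delivered once and the retry receives the original result.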

Pairing + local trust

  • All WS clients (operators + nodes) include a device identity on connect.
  • New device IDs require pairing approval; the Gateway issues a device token for subsequent connects.
  • Local connects (loopback or the gateway host's own tailnet address) can be auto-approved to keep same-host UX smooth.
  • Non-local connects must sign the connect.challenge nonce and require explicit approval.
  • Gateway auth (gateway.auth.*) still applies to all connections, local or remote.
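
The pairing rules above can be read as a small decision function. The sketch below is one interpretation under stated assumptions: ConnectContext, PairingDecision, decidePairing, and the autoApproveLocal flag are hypothetical names, the ordering of the checks is a guess, and gateway auth (gateway.auth.*) is assumed to have been checked separately before this runs.

```typescript
// Hypothetical inputs describing one incoming connect; none of these names come from the codebase.
type ConnectContext = {
  deviceId: string;
  isLocal: boolean;             // loopback or the gateway host's own tailnet address
  hasValidDeviceToken: boolean; // device token issued by an earlier approved pairing
  signedChallenge: boolean;     // connect.challenge nonce signature was verified
};

type PairingDecision = "allow" | "auto-approve" | "pending-approval" | "reject";

// Assumes gateway auth has already passed; this only models device pairing.
function decidePairing(ctx: ConnectContext, autoApproveLocal: boolean): PairingDecision {
  // Non-local connects must sign the challenge nonce.
  if (!ctx.isLocal && !ctx.signedChallenge) return "reject";
  // Already-paired devices reconnect with their device token.
  if (ctx.hasValidDeviceToken) return "allow";
  // New local devices can be auto-approved to keep same-host UX smooth.
  if (ctx.isLocal && autoApproveLocal) return "auto-approve";
  // Everything else waits for explicit approval.
  return "pending-approval";
}
```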

Details: Gateway protocol, Pairing, Security.

Protocol typing and codegen

  • TypeBox schemas define the protocol.
  • JSON Schema is generated from those schemas.
  • Swift models are generated from the JSON Schema.

Remote access

  • Preferred: Tailscale or VPN.
  • Alternative: SSH tunnel
    ssh -N -L 18789:127.0.0.1:18789 user@host
    
  • The same handshake + auth token apply over the tunnel.
  • TLS + optional pinning can be enabled for WS in remote setups.

Operations snapshot

  • Start: moltbot gateway (foreground, logs to stdout).
  • Health: health over WS (also included in hello-ok).
  • Supervision: launchd/systemd for auto-restart.

Invariants

  • Exactly one Gateway controls a single Baileys session per host.
  • Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
  • Events are not replayed; clients must refresh on gaps.
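
Two of these invariants lend themselves to short sketches. The code below is illustrative only: checkFirstFrame and GapDetector are hypothetical names, connect is assumed to be a req frame whose method is "connect", and seq is assumed to increase by exactly 1 per event on a connection (the document does not specify the numbering).

```typescript
type EventFrame = { type: "event"; event: string; payload?: unknown; seq?: number; stateVersion?: number };

// Server side: decide whether the first frame on a socket is acceptable.
function checkFirstFrame(text: string): { ok: true } | { ok: false; reason: string } {
  let frame: unknown;
  try {
    frame = JSON.parse(text);
  } catch {
    return { ok: false, reason: "non-JSON first frame" };
  }
  if (typeof frame !== "object" || frame === null) {
    return { ok: false, reason: "non-connect first frame" };
  }
  const f = frame as { type?: unknown; method?: unknown };
  if (f.type !== "req" || f.method !== "connect") {
    return { ok: false, reason: "non-connect first frame" };
  }
  return { ok: true };
}

// Client side: detect missed events so the client knows to refresh its state.
class GapDetector {
  private lastSeq: number | undefined = undefined;

  // Returns true when a gap is detected and the client should refresh.
  observe(frame: EventFrame): boolean {
    if (frame.seq === undefined) return false; // seq is optional; nothing to compare
    const prev = this.lastSeq;
    this.lastSeq = frame.seq;
    return prev !== undefined && frame.seq !== prev + 1;
  }
}
```

On a detected gap, a client would re-fetch current state (for example via health or status) rather than wait for a replay, since the Gateway does not resend missed events.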