openclaw/docs/concepts/architecture.md
Davendra Patel d9851627b2 docs: add 29 inline Mermaid diagrams across documentation
Add visual Mermaid diagrams to supplement existing text descriptions
throughout docs/. Diagrams cover architecture, message flows, agent
lifecycle, routing, queue modes, security layers, plugin discovery,
tool groups, session lifecycle, and onboarding flows. No existing
content removed or altered.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 16:02:09 +05:30

summary: WebSocket gateway architecture, components, and client flows
read_when: Working on gateway protocol, clients, or transports

Gateway architecture

Last updated: 2026-01-22

Overview

  • A single long-lived Gateway owns all messaging surfaces (WhatsApp via Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
  • Control-plane clients (macOS app, CLI, web UI, automations) connect to the Gateway over WebSocket on the configured bind host (default 127.0.0.1:18789).
  • Nodes (macOS/iOS/Android/headless) also connect over WebSocket, but declare role: node with explicit caps/commands.
  • One Gateway per host; it is the only place that opens a WhatsApp session.
  • A canvas host (default 18793) serves agent-editable HTML and A2UI.

Components and flows

graph TD
    subgraph Channels
        WA[WhatsApp/Baileys]
        TG[Telegram/grammY]
        SL[Slack]
        DC[Discord]
        SIG[Signal]
        IM[iMessage]
        WC[WebChat]
    end

    subgraph Clients
        CLI[CLI]
        MAC[macOS App]
        WEB[Web UI]
        AUTO[Automations]
    end

    subgraph Nodes
        MACOS_N[macOS Node]
        IOS_N[iOS Node]
        AND_N[Android Node]
        HEAD_N[Headless Node]
    end

    GW[Gateway\nWebSocket Server\n127.0.0.1:18789]

    WA & TG & SL & DC & SIG & IM & WC --> GW
    CLI & MAC & WEB & AUTO -->|WS: role=operator| GW
    MACOS_N & IOS_N & AND_N & HEAD_N -->|WS: role=node| GW

    GW --> AGENT[Agent Runtime\npi-agent-core]
    AGENT --> TOOLS[Tools\nbrowser, exec, canvas,\nmessage, cron, nodes]

    CANVAS[Canvas Host\n:18793]
    GW -.-> CANVAS

Gateway (daemon)

  • Maintains provider connections.
  • Exposes a typed WS API (requests, responses, server-push events).
  • Validates inbound frames against JSON Schema.
  • Emits events like agent, chat, presence, health, heartbeat, cron.
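
The validation step can be pictured with a small type guard. This is only a sketch: the Gateway is described as validating against JSON Schema, whereas the code below uses hand-written checks as a stand-in. The frame fields come from the wire protocol summary in this document; the helper name `parseRequestFrame` and the choice of a string `id` are assumptions made for illustration.

```typescript
// Request frame shape as listed in the wire protocol summary.
// Assumption: `id` is a string; the document does not state its type.
interface RequestFrame {
  type: "req";
  id: string;
  method: string;
  params?: unknown;
}

// Hypothetical helper: returns a typed frame, or null when the text is not
// JSON or does not match the request shape. Hand-written checks stand in for
// the JSON Schema validation the Gateway is described as using.
function parseRequestFrame(text: string): RequestFrame | null {
  let value: unknown;
  try {
    value = JSON.parse(text);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const frame = value as Record<string, unknown>;
  if (frame.type !== "req") return null;
  if (typeof frame.id !== "string" || frame.id.length === 0) return null;
  if (typeof frame.method !== "string" || frame.method.length === 0) return null;
  return { type: "req", id: frame.id, method: frame.method, params: frame.params };
}
```

A caller would reject or close on `null` and dispatch on `method` otherwise.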

Clients (mac app / CLI / web admin)

  • One WS connection per client.
  • Send requests (health, status, send, agent, system-presence).
  • Subscribe to events (tick, agent, presence, shutdown).

Nodes (macOS / iOS / Android / headless)

  • Connect to the same WS server with role: node.
  • Provide a device identity in connect; pairing is device-based (role node), and approval lives in the device pairing store.
  • Expose commands like canvas.*, camera.*, screen.record, location.get.
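
As a rough illustration of what a node sends, the sketch below builds a connect request. The document states which pieces a node must declare (role, device identity, caps, commands, permissions) but not how they are nested, so the field layout under `params`, the `permissions` value type, and the helper name `buildNodeConnect` are all assumptions.

```typescript
// Assumed layout: only the presence of these fields is stated in the
// document; their exact names and nesting under `params` are guesses.
interface NodeConnectParams {
  role: "node";
  device: { id: string };
  caps: string[];
  commands: string[];
  permissions: Record<string, boolean>;
}

interface NodeConnectFrame {
  type: "req";
  id: string;
  method: "connect";
  params: NodeConnectParams;
}

// Hypothetical helper that assembles the first frame a node would send.
function buildNodeConnect(
  requestId: string,
  deviceId: string,
  caps: string[],
  commands: string[],
  permissions: Record<string, boolean>,
): NodeConnectFrame {
  return {
    type: "req",
    id: requestId,
    method: "connect",
    params: { role: "node", device: { id: deviceId }, caps, commands, permissions },
  };
}

// Example: a node exposing two of the commands named above.
const exampleNodeConnect = buildNodeConnect(
  "req-1",
  "device-abc",
  ["screen", "location"],
  ["screen.record", "location.get"],
  { "screen.record": true },
);
```

Serializing `exampleNodeConnect` with `JSON.stringify` gives the text frame that would go over the WebSocket.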

Protocol details:

WebChat

  • Static UI that uses the Gateway WS API for chat history and sends.
  • In remote setups, connects through the same SSH/Tailscale tunnel as other clients.

Connection lifecycle (single client)

Client                    Gateway
  |                          |
  |---- req:connect -------->|
  |<------ res (ok) ---------|   (or res error + close)
  |   (payload=hello-ok carries snapshot: presence + health)
  |                          |
  |<------ event:presence ---|
  |<------ event:tick -------|
  |                          |
  |------- req:agent ------->|
  |<------ res:agent --------|   (ack: {runId,status:"accepted"})
  |<------ event:agent ------|   (streaming)
  |<------ res:agent --------|   (final: {runId,status,summary})
  |                          |
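
The agent request in the diagram above yields two responses (an ack, then a final result) with streamed events in between. A client can track that with a small state machine. This is a minimal sketch: the class name `AgentRunTracker`, the `data` field on events, and the rule for matching events to a run by `runId` are assumptions, not part of the documented protocol.

```typescript
type RunState = "pending" | "accepted" | "done";

// Payload fields taken from the diagram: {runId, status} then {runId, status, summary}.
interface AgentResponsePayload {
  runId: string;
  status: string;
  summary?: string;
}

// Hypothetical client-side tracker for one agent request.
class AgentRunTracker {
  state: RunState = "pending";
  runId: string | null = null;
  summary: string | null = null;
  streamed: unknown[] = [];

  // Handles both res:agent frames; returns the state after processing.
  onResponse(payload: AgentResponsePayload): RunState {
    if (this.state === "pending") {
      if (payload.status === "accepted") {
        this.runId = payload.runId;
        this.state = "accepted";
      }
    } else if (this.state === "accepted" && payload.runId === this.runId) {
      this.summary = payload.summary ?? null;
      this.state = "done";
    }
    return this.state;
  }

  // Collects streamed event:agent payloads that belong to this run.
  // Assumption: event payloads carry the same runId as the ack.
  onAgentEvent(payload: { runId: string; data?: unknown }): RunState {
    if (this.state === "accepted" && payload.runId === this.runId) {
      this.streamed.push(payload.data);
    }
    return this.state;
  }
}
```

Events for other runs, or events arriving before the ack, are ignored in this sketch; a real client might buffer them instead.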

Wire protocol (summary)

  • Transport: WebSocket, text frames with JSON payloads.
  • First frame must be connect.
  • After handshake:
    • Requests: {type:"req", id, method, params} → {type:"res", id, ok, payload|error}
    • Events: {type:"event", event, payload, seq?, stateVersion?}
  • If CLAWDBOT_GATEWAY_TOKEN (or --token) is set, connect.params.auth.token must match or the socket closes.
  • Idempotency keys are required for side-effecting methods (send, agent) to safely retry; the server keeps a short-lived dedupe cache.
  • Nodes must include role: "node" plus caps/commands/permissions in connect.
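
To make the idempotency rule concrete, here is one way a short-lived dedupe cache could behave. It is a sketch under assumptions: the class name `DedupeCache`, the TTL value, and the choice to replay the cached result on a retry are illustrative and not taken from the Gateway implementation.

```typescript
// Hypothetical dedupe cache keyed by idempotency key.
// A retry with the same key inside the TTL window gets the stored result
// instead of repeating the side effect.
class DedupeCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // `now` is passed in (milliseconds) so the behavior is easy to test.
  run(key: string, now: number, work: () => T): T {
    this.evictExpired(now);
    const hit = this.entries.get(key);
    if (hit !== undefined) return hit.value; // duplicate: skip the side effect
    const value = work();
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }

  private evictExpired(now: number): void {
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= now) this.entries.delete(key);
    });
  }
}
```

In use, a handler for `send` would wrap the actual delivery in `cache.run(idempotencyKey, Date.now(), deliver)`, so a client that retries after a dropped connection does not send the message twice.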

Pairing + local trust

  • All WS clients (operators + nodes) include a device identity on connect.
  • New device IDs require pairing approval; the Gateway issues a device token for subsequent connects.
  • Local connects (loopback or the gateway host's own tailnet address) can be auto-approved to keep same-host UX smooth.
  • Non-local connects must sign the connect.challenge nonce and require explicit approval.
  • Gateway auth (gateway.auth.*) still applies to all connections, local or remote.
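
The rules above can be read as a decision function. The sketch below is one possible ordering of the checks and is not the Gateway's actual logic: the type names, the `autoApproveLocal` flag, and modeling the pairing store as a set of approved device IDs (rather than issued device tokens) are simplifying assumptions.

```typescript
interface ConnectAttempt {
  deviceId: string;
  isLocal: boolean;             // loopback or the gateway host's own tailnet address
  challengeSigned: boolean;     // connect.challenge nonce signature verified
  gatewayAuthOk: boolean;       // outcome of the gateway.auth.* check
}

type PairingDecision = "reject" | "accept" | "auto-approve" | "await-approval";

// Hypothetical decision function; the order of checks is an assumption.
function decidePairing(
  attempt: ConnectAttempt,
  approvedDevices: Set<string>,
  autoApproveLocal: boolean,
): PairingDecision {
  // Gateway auth applies to every connection, local or remote.
  if (!attempt.gatewayAuthOk) return "reject";
  // Non-local connects must have signed the challenge nonce.
  if (!attempt.isLocal && !attempt.challengeSigned) return "reject";
  // Known devices proceed without a new approval.
  if (approvedDevices.has(attempt.deviceId)) return "accept";
  // New local devices can be auto-approved; everything else waits.
  if (attempt.isLocal && autoApproveLocal) return "auto-approve";
  return "await-approval";
}
```

Keeping the gateway auth check first mirrors the last bullet: local trust relaxes pairing approval only, never authentication.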

Details: Gateway protocol, Pairing, Security.

Protocol typing and codegen

  • TypeBox schemas define the protocol.
  • JSON Schema is generated from those schemas.
  • Swift models are generated from the JSON Schema.

Remote access

  • Preferred: Tailscale or VPN.
  • Alternative: SSH tunnel
    ssh -N -L 18789:127.0.0.1:18789 user@host
    
  • The same handshake + auth token apply over the tunnel.
  • TLS + optional pinning can be enabled for WS in remote setups.

Operations snapshot

  • Start: moltbot gateway (foreground, logs to stdout).
  • Health: health over WS (also included in hello-ok).
  • Supervision: launchd/systemd for auto-restart.

Invariants

  • Exactly one Gateway controls a single Baileys session per host.
  • Handshake is mandatory; any non-JSON or non-connect first frame is a hard close.
  • Events are not replayed; clients must refresh on gaps.
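
Because events are not replayed, a client needs some way to notice a gap. One option is to watch the optional `seq` field on events. The sketch below assumes `seq` increases by exactly 1 per event on a connection, which the document does not state; the class name `EventGapDetector` is also hypothetical.

```typescript
// Hypothetical gap detector driven by the optional `seq` field on events.
// Assumption: seq increments by 1 for each event on a given connection.
class EventGapDetector {
  private lastSeq: number | null = null;

  // Returns true when the client should refresh its state
  // (for example by re-requesting health and presence).
  observe(seq: number | undefined): boolean {
    if (seq === undefined) return false; // events without seq cannot be checked
    const previous = this.lastSeq;
    this.lastSeq = seq;
    // The first event only sets the baseline; anything other than
    // previous + 1 afterwards (skip, repeat, reorder) counts as a gap.
    return previous !== null && seq !== previous + 1;
  }

  // Call after a reconnect, since a new connection starts a new sequence.
  reset(): void {
    this.lastSeq = null;
  }
}
```

On a `true` result the client would discard incremental state and rebuild it from a fresh snapshot, since the missed events cannot be recovered.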