# Clawdbot Browser Automation Architecture ## 1. Libraries Used * **Playwright (`playwright-core`)**: The primary engine for browser automation. Used for session management, page interactions (click, type, scroll), and capturing snapshots. * **Chromium**: The underlying browser instance managed by Playwright. * **Chrome DevTools Protocol (CDP)**: Used internally by Playwright and for specific low-level control where needed. * **Express**: Hosts the "Browser Bridge" server (`src/browser/bridge-server.ts`), exposing a REST API for the agent to control the browser instance. ## 2. Command Construction & Execution The AI controls the browser through a structured tool definition and dispatch pipeline: 1. **Tool Definition**: The `browser` tool is defined in `src/agents/tools/browser-tool.schema.ts` (using TypeBox). It exposes high-level actions like `navigate`, `act` (click/type/etc.), `snapshot`, and `screenshot`. 2. **Tool Invocation**: The AI calls the tool with specific parameters (e.g., `action="act"`, `request={ kind: "click", ref: "42" }`). 3. **Client Dispatch**: * The tool implementation (`src/agents/tools/browser-tool.ts`) handles the request. * It determines the target: **Host** (local) or **Node** (remote/sandbox). * For **local execution**, it calls client functions in `src/browser/client-actions-core.ts`. 4. **Bridge Request**: The client sends an HTTP POST request to the local Bridge Server (e.g., `POST /act`). 5. **Execution**: The Bridge Server receives the request and executes the corresponding Playwright command (e.g., `page.click()`) on the active browser page. ## 3. "Device Nodes" Architecture Clawdbot uses a distributed architecture to control browsers across different environments (local machine, Docker sandbox, or remote devices). * **Node Registry**: `src/gateway/node-registry.ts` tracks connected nodes and their capabilities (e.g., `caps: ["browser"]`). * **Proxying**: * When the AI targets a remote node (e.g., `target="node"`), the tool uses `callBrowserProxy`. * This sends a `node.invoke` event to the Gateway with the command `browser.proxy`. * The Gateway forwards this event to the target node via WebSocket. * **Node Execution**: * The target node (running `src/node-host/runner.ts`) receives the `node.invoke` event. * It handles the `browser.proxy` command by dispatching it to its *own* local browser control service. * This effectively allows any running Clawdbot instance to act as a remote browser driver for the main agent.