openclaw/docs/platforms/fly.md
Dan Guido aea8c678e3 docs(fly): add private/hardened deployment guide
- Add fly.private.toml template for deployments with no public IP
- Add "Private Deployment (Hardened)" section to Fly docs
- Document how to convert existing deployment to private-only
- Add security notes recommending env vars over config file for secrets

This addresses security concerns about Clawdbot gateways being
discoverable on internet scanners (Shodan, Censys). Private deployments
are accessible only via fly proxy, WireGuard, or SSH.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-26 16:50:02 +00:00

462 lines
11 KiB
Markdown

---
title: Fly.io
description: Deploy Clawdbot on Fly.io
---
# Fly.io Deployment
**Goal:** Clawdbot Gateway running on a [Fly.io](https://fly.io) machine with persistent storage, automatic HTTPS, and Discord/channel access.
## What you need
- [flyctl CLI](https://fly.io/docs/hands-on/install-flyctl/) installed
- Fly.io account (free tier works)
- Model auth: Anthropic API key (or other provider keys)
- Channel credentials: Discord bot token, Telegram token, etc.
## Beginner quick path
1. Clone repo → customize `fly.toml`
2. Create app + volume → set secrets
3. Deploy with `fly deploy`
4. SSH in to create config or use Control UI
## 1) Create the Fly app
```bash
# Clone the repo
git clone https://github.com/clawdbot/clawdbot.git
cd clawdbot
# Create a new Fly app (pick your own name)
fly apps create my-clawdbot
# Create a persistent volume (1GB is usually enough)
fly volumes create clawdbot_data --size 1 --region iad
```
**Tip:** Choose a region close to you. Common options: `lhr` (London), `iad` (Virginia), `sjc` (San Jose).
## 2) Configure fly.toml
Edit `fly.toml` to match your app name and requirements.
**Security note:** The default config exposes a public URL. For a hardened deployment with no public IP, see [Private Deployment](#private-deployment-hardened) or use `fly.private.toml`.
```toml
app = "my-clawdbot" # Your app name
primary_region = "iad"
[build]
dockerfile = "Dockerfile"
[env]
NODE_ENV = "production"
CLAWDBOT_PREFER_PNPM = "1"
CLAWDBOT_STATE_DIR = "/data"
NODE_OPTIONS = "--max-old-space-size=1536"
[processes]
app = "node dist/index.js gateway --allow-unconfigured --port 3000 --bind lan"
[http_service]
internal_port = 3000
force_https = true
auto_stop_machines = false
auto_start_machines = true
min_machines_running = 1
processes = ["app"]
[[vm]]
size = "shared-cpu-2x"
memory = "2048mb"
[mounts]
source = "clawdbot_data"
destination = "/data"
```
**Key settings:**
| Setting | Why |
|---------|-----|
| `--bind lan` | Binds to `0.0.0.0` so Fly's proxy can reach the gateway |
| `--allow-unconfigured` | Starts without a config file (you'll create one after) |
| `internal_port = 3000` | Must match `--port 3000` (or `CLAWDBOT_GATEWAY_PORT`) for Fly health checks |
| `memory = "2048mb"` | 512MB is too small; 2GB recommended |
| `CLAWDBOT_STATE_DIR = "/data"` | Persists state on the volume |
## 3) Set secrets
```bash
# Required: Gateway token (for non-loopback binding)
fly secrets set CLAWDBOT_GATEWAY_TOKEN=$(openssl rand -hex 32)
# Model provider API keys
fly secrets set ANTHROPIC_API_KEY=sk-ant-...
# Optional: Other providers
fly secrets set OPENAI_API_KEY=sk-...
fly secrets set GOOGLE_API_KEY=...
# Channel tokens
fly secrets set DISCORD_BOT_TOKEN=MTQ...
```
**Notes:**
- Non-loopback binds (`--bind lan`) require `CLAWDBOT_GATEWAY_TOKEN` for security.
- Treat these tokens like passwords.
- **Prefer env vars over config file** for all API keys and tokens. This keeps secrets out of `clawdbot.json` where they could be accidentally exposed or logged.
## 4) Deploy
```bash
fly deploy
```
First deploy builds the Docker image (~2-3 minutes). Subsequent deploys are faster.
After deployment, verify:
```bash
fly status
fly logs
```
You should see:
```
[gateway] listening on ws://0.0.0.0:3000 (PID xxx)
[discord] logged in to discord as xxx
```
## 5) Create config file
SSH into the machine to create a proper config:
```bash
fly ssh console
```
Create the config directory and file:
```bash
mkdir -p /data
cat > /data/clawdbot.json << 'EOF'
{
"agents": {
"defaults": {
"model": {
"primary": "anthropic/claude-opus-4-5",
"fallbacks": ["anthropic/claude-sonnet-4-5", "openai/gpt-4o"]
},
"maxConcurrent": 4
},
"list": [
{
"id": "main",
"default": true
}
]
},
"auth": {
"profiles": {
"anthropic:default": { "mode": "token", "provider": "anthropic" },
"openai:default": { "mode": "token", "provider": "openai" }
}
},
"bindings": [
{
"agentId": "main",
"match": { "channel": "discord" }
}
],
"channels": {
"discord": {
"enabled": true,
"groupPolicy": "allowlist",
"guilds": {
"YOUR_GUILD_ID": {
"channels": { "general": { "allow": true } },
"requireMention": false
}
}
}
},
"gateway": {
"mode": "local",
"bind": "auto"
},
"meta": {
"lastTouchedVersion": "2026.1.25"
}
}
EOF
```
**Note:** With `CLAWDBOT_STATE_DIR=/data`, the config path is `/data/clawdbot.json`.
**Note:** The Discord token can come from either:
- Environment variable: `DISCORD_BOT_TOKEN` (recommended for secrets)
- Config file: `channels.discord.token`
If using env var, no need to add token to config. The gateway reads `DISCORD_BOT_TOKEN` automatically.
Restart to apply:
```bash
exit
fly machine restart <machine-id>
```
## 6) Access the Gateway
### Control UI
Open in browser:
```bash
fly open
```
Or visit `https://my-clawdbot.fly.dev/`
Paste your gateway token (the one from `CLAWDBOT_GATEWAY_TOKEN`) to authenticate.
### Logs
```bash
fly logs # Live logs
fly logs --no-tail # Recent logs
```
### SSH Console
```bash
fly ssh console
```
## Troubleshooting
### "App is not listening on expected address"
The gateway is binding to `127.0.0.1` instead of `0.0.0.0`.
**Fix:** Add `--bind lan` to your process command in `fly.toml`.
### Health checks failing / connection refused
Fly can't reach the gateway on the configured port.
**Fix:** Ensure `internal_port` matches the gateway port (set `--port 3000` or `CLAWDBOT_GATEWAY_PORT=3000`).
### OOM / Memory Issues
Container keeps restarting or getting killed. Signs: `SIGABRT`, `v8::internal::Runtime_AllocateInYoungGeneration`, or silent restarts.
**Fix:** Increase memory in `fly.toml`:
```toml
[[vm]]
memory = "2048mb"
```
Or update an existing machine:
```bash
fly machine update <machine-id> --vm-memory 2048 -y
```
**Note:** 512MB is too small. 1GB may work but can OOM under load or with verbose logging. **2GB is recommended.**
### Gateway Lock Issues
Gateway refuses to start with "already running" errors.
This happens when the container restarts but the PID lock file persists on the volume.
**Fix:** Delete the lock file:
```bash
fly ssh console --command "rm -f /data/gateway.*.lock"
fly machine restart <machine-id>
```
The lock file is at `/data/gateway.*.lock` (not in a subdirectory).
### Config Not Being Read
If using `--allow-unconfigured`, the gateway creates a minimal config. Your custom config at `/data/clawdbot.json` should be read on restart.
Verify the config exists:
```bash
fly ssh console --command "cat /data/clawdbot.json"
```
### Writing Config via SSH
The `fly ssh console -C` command doesn't support shell redirection. To write a config file:
```bash
# Use echo + tee (pipe from local to remote)
echo '{"your":"config"}' | fly ssh console -C "tee /data/clawdbot.json"
# Or use sftp
fly sftp shell
> put /local/path/config.json /data/clawdbot.json
```
**Note:** `fly sftp` may fail if the file already exists. Delete first:
```bash
fly ssh console --command "rm /data/clawdbot.json"
```
### State Not Persisting
If you lose credentials or sessions after a restart, the state dir is writing to the container filesystem.
**Fix:** Ensure `CLAWDBOT_STATE_DIR=/data` is set in `fly.toml` and redeploy.
## Updates
```bash
# Pull latest changes
git pull
# Redeploy
fly deploy
# Check health
fly status
fly logs
```
### Updating Machine Command
If you need to change the startup command without a full redeploy:
```bash
# Get machine ID
fly machines list
# Update command
fly machine update <machine-id> --command "node dist/index.js gateway --port 3000 --bind lan" -y
# Or with memory increase
fly machine update <machine-id> --vm-memory 2048 --command "node dist/index.js gateway --port 3000 --bind lan" -y
```
**Note:** After `fly deploy`, the machine command may reset to what's in `fly.toml`. If you made manual changes, re-apply them after deploy.
## Private Deployment (Hardened)
By default, Fly allocates public IPs, making your gateway accessible at `https://your-app.fly.dev`. This is convenient but means your deployment is discoverable by internet scanners (Shodan, Censys, etc.).
For a hardened deployment with **no public exposure**, use the private template.
### When to use private deployment
- You only make **outbound** calls/messages (no inbound webhooks)
- You use **ngrok or Tailscale** tunnels for any webhook callbacks
- You access the gateway via **SSH, proxy, or WireGuard** instead of browser
- You want the deployment **hidden from internet scanners**
### Setup
Use `fly.private.toml` instead of the standard config:
```bash
# Deploy with private config
fly deploy -c fly.private.toml
```
Or convert an existing deployment:
```bash
# List current IPs
fly ips list -a my-clawdbot
# Release public IPs
fly ips release <public-ipv4> -a my-clawdbot
fly ips release <public-ipv6> -a my-clawdbot
# Allocate private-only IPv6
fly ips allocate-v6 --private -a my-clawdbot
```
After this, `fly ips list` should show only a `private` type IP:
```
VERSION IP TYPE REGION
v6 fdaa:x:x:x:x::x private global
```
### Accessing a private deployment
Since there's no public URL, use one of these methods:
**Option 1: Local proxy (simplest)**
```bash
# Forward local port 3000 to the app
fly proxy 3000:3000 -a my-clawdbot
# Then open http://localhost:3000 in browser
```
**Option 2: WireGuard VPN**
```bash
# Create WireGuard config (one-time)
fly wireguard create
# Import to WireGuard client, then access via internal IPv6
# Example: http://[fdaa:x:x:x:x::x]:3000
```
**Option 3: SSH only**
```bash
fly ssh console -a my-clawdbot
```
### Webhooks with private deployment
If you need webhook callbacks (Twilio, Telnyx, etc.) without public exposure:
1. **ngrok tunnel** - Run ngrok inside the container or as a sidecar
2. **Tailscale Funnel** - Expose specific paths via Tailscale
3. **Outbound-only** - Some providers (Twilio) work fine for outbound calls without webhooks
Example voice-call config with ngrok:
```json
{
"plugins": {
"entries": {
"voice-call": {
"enabled": true,
"config": {
"provider": "twilio",
"tunnel": { "provider": "ngrok" }
}
}
}
}
}
```
The ngrok tunnel runs inside the container and provides a public webhook URL without exposing the Fly app itself.
### Security benefits
| Aspect | Public | Private |
|--------|--------|---------|
| Internet scanners | Discoverable | Hidden |
| Direct attacks | Possible | Blocked |
| Control UI access | Browser | Proxy/VPN |
| Webhook delivery | Direct | Via tunnel |
## Notes
- Fly.io uses **x86 architecture** (not ARM)
- The Dockerfile is compatible with both architectures
- For WhatsApp/Telegram onboarding, use `fly ssh console`
- Persistent data lives on the volume at `/data`
- Signal requires Java + signal-cli; use a custom image and keep memory at 2GB+.
## Cost
With the recommended config (`shared-cpu-2x`, 2GB RAM):
- ~$10-15/month depending on usage
- Free tier includes some allowance
See [Fly.io pricing](https://fly.io/docs/about/pricing/) for details.