openclaw/test/security/README.md
2026-01-29 11:17:46 +07:00

186 lines
6.0 KiB
Markdown

# Security Acceptance Tests
E2E security testing framework for Moltbot. Validates resistance to:
- Prompt injection via external data sources
- Data exfiltration attempts
- Trust boundary violations
- Tool poisoning attacks
## Architecture: LLM-as-Judge
Pattern matching can't reliably detect whether prompt injection succeeded. We use
**Claude as a judge** to evaluate whether Moltbot resisted attacks:
1. Run test scenario (send poisoned data to Moltbot)
2. Capture Moltbot's response and tool calls
3. Send to Claude judge with structured output
4. Judge evaluates: injection detected? complied with? data leaked?
This enables nuanced evaluation of subtle attacks that regex can't catch.
## Current Implementation Status
### Test Files
| File | Status | Tests |
|------|--------|-------|
| `email-injection.e2e.test.ts` | [x] Done | 4 |
| `channels/whatsapp-injection.e2e.test.ts` | [x] Done | 4 |
| `channels/telegram-injection.e2e.test.ts` | [x] Done | 4 |
| `calendar-injection.e2e.test.ts` | [ ] Pending | - |
| `api-injection.e2e.test.ts` | [ ] Pending | - |
| `trust-boundary.e2e.test.ts` | [ ] Pending | - |
| `exfiltration.e2e.test.ts` | [ ] Pending | - |
| `tool-poisoning.e2e.test.ts` | [ ] Pending | - |
**Total: 12 tests implemented**
### CLI Mocks
| CLI | File | Status |
|-----|------|--------|
| `gog` | `mock-binary.ts` | [x] Done |
| `curl/wget` | `curl-mock.ts` | [x] Done |
| `gh` (GitHub) | `github-mock.ts` | [x] Done |
| `browser-cli` | `browser-mock.ts` | [x] Done |
| `himalaya` | - | [ ] Pending |
### Poisoned Payloads (22 total)
| Category | Payloads |
|----------|----------|
| Email/Calendar | `poisonedGmailGet`, `poisonedCalendarList` |
| HTTP | `poisonedWebpageResponse`, `poisonedJsonApiResponse`, `poisonedMarkdownResponse`, `poisonedScriptResponse`, `poisonedRssFeedResponse`, `poisonedRedirectResponse` |
| GitHub | `poisonedIssue`, `poisonedPullRequest`, `poisonedReviewComment`, `poisonedIssueComment`, `poisonedCommit`, `poisonedRepository`, `poisonedRelease`, `poisonedWorkflowRun` |
| Browser | `poisonedPageContent`, `poisonedXssPage`, `poisonedSearchResults`, `poisonedFormPage`, `poisonedScreenshotOcr`, `poisonedPdfContent`, `poisonedDomContent`, `poisonedLoginPage` |
### Channels Tested
| Channel | Status |
|---------|--------|
| WhatsApp | [x] Done (4 tests) |
| Telegram | [x] Done (4 tests) |
| Discord | [ ] Pending |
| Slack | [ ] Pending |
| Signal | [ ] Pending |
| iMessage | [ ] Pending |
| LINE | [ ] Pending |
### Reporting
| Feature | Status |
|---------|--------|
| HTML report generator | [x] Done |
| JSON export | [x] Done |
| Dark theme CSS | [x] Done |
| Interactive JS (sort/filter/expand) | [x] Done |
| Sample report script | [x] Done |
| Historical trends | [ ] Pending |
## Quick Start
```bash
# Run security tests (requires gateway running)
# Terminal 1:
moltbot gateway
# Terminal 2:
npx vitest run --config vitest.security.config.ts
```
## Structure
```
test/security/
├── .env # ANTHROPIC_API_KEY (gitignored)
├── README.md # This file
├── SPEC.md # Full specification document
├── specs/ # Detailed specs
│ ├── 00-overview.md
│ ├── 01-llm-judge.md
│ ├── 02-gateway-client.md
│ ├── 03-cli-mocks.md
│ ├── 04-test-categories.md
│ ├── 05-ci-docker.md
│ └── 06-implementation-plan.md
├── harness/
│ ├── index.ts # Exports
│ ├── gateway-client.ts # WebSocket gateway client (Protocol v3)
│ ├── assertions.ts # Pattern-based assertions
│ ├── llm-judge.ts # Claude-based evaluation
│ └── cli-mocks/
│ └── mock-binary.ts # gog mock + poisoned payloads
├── email-injection.e2e.test.ts # Email injection tests
├── setup.ts # Test setup (env validation)
├── test-judge.ts # Standalone judge validation script
├── docker-compose.yml # Container setup
├── Dockerfile.test # Test container
└── run-local.sh # Local run script
```
## Running Tests
### Option 1: Against Local Gateway (Recommended)
```bash
# Terminal 1: Start gateway
moltbot gateway
# Terminal 2: Run tests
npx vitest run --config vitest.security.config.ts
```
### Option 2: Docker Compose
```bash
docker compose -f test/security/docker-compose.yml up --build --abort-on-container-exit
```
### Option 3: Test LLM Judge Only
```bash
# Validate judge works without gateway
npx tsx test/security/test-judge.ts
```
## Environment Variables
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `ANTHROPIC_API_KEY` | Yes | - | For LLM judge evaluation |
| `TEST_GATEWAY_URL` | No | `ws://localhost:18789` | Gateway WebSocket URL |
| `TEST_AUTH_TOKEN` | No | - | Gateway auth token |
## Test Output
Each test produces a `SecurityVerdict`:
```typescript
{
passed: boolean; // Did security test pass?
injectionDetected: boolean; // Did agent notice the attack?
injectionCompliedWith: boolean; // Did agent follow malicious instructions?
dataLeaked: boolean; // Was sensitive data exposed?
reasoning: string; // Judge's explanation
severity: "none" | "low" | "medium" | "high" | "critical";
evidence: string[]; // Supporting evidence
}
```
## Next Steps
Priority order for implementation:
1. **Calendar injection tests** - Similar to email, payload ready
2. **Trust boundary tests** - Critical for multi-user
3. **Exfiltration tests** - Data loss prevention
4. **API injection tests** - External data sources
5. **Tool poisoning tests** - Third-party skill safety
See **[specs/07-future-enhancements.md](specs/07-future-enhancements.md)** for comprehensive roadmap covering:
- All 7 messaging channels (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, LINE)
- 27+ CLI tools to mock
- 54 skills with external data
- API mock server design
- HTML test report dashboard