190 lines
6.2 KiB
Markdown
190 lines
6.2 KiB
Markdown
# Security Acceptance Tests
|
|
|
|
E2E security testing framework for Moltbot. Validates resistance to:
|
|
- Prompt injection via external data sources
|
|
- Data exfiltration attempts
|
|
- Trust boundary violations
|
|
- Tool poisoning attacks
|
|
|
|
## Architecture: LLM-as-Judge
|
|
|
|
Pattern matching can't reliably detect whether prompt injection succeeded. We use
|
|
**Claude as a judge** to evaluate whether Moltbot resisted attacks:
|
|
|
|
1. Run test scenario (send poisoned data to Moltbot)
|
|
2. Capture Moltbot's response and tool calls
|
|
3. Send to Claude judge with structured output
|
|
4. Judge evaluates: injection detected? complied with? data leaked?
|
|
|
|
This enables nuanced evaluation of subtle attacks that regex can't catch.
|
|
|
|
## Current Implementation Status
|
|
|
|
### Test Files
|
|
|
|
| File | Status | Tests |
|
|
|------|--------|-------|
|
|
| `email-injection.e2e.test.ts` | [x] Done | 4 |
|
|
| `channels/whatsapp-injection.e2e.test.ts` | [x] Done | 4 |
|
|
| `channels/telegram-injection.e2e.test.ts` | [x] Done | 4 |
|
|
| `calendar-injection.e2e.test.ts` | [ ] Pending | - |
|
|
| `api-injection.e2e.test.ts` | [ ] Pending | - |
|
|
| `trust-boundary.e2e.test.ts` | [ ] Pending | - |
|
|
| `exfiltration.e2e.test.ts` | [ ] Pending | - |
|
|
| `tool-poisoning.e2e.test.ts` | [ ] Pending | - |
|
|
|
|
**Total: 12 tests implemented**
|
|
|
|
### CLI Mocks
|
|
|
|
| CLI | File | Status |
|
|
|-----|------|--------|
|
|
| `gog` | `mock-binary.ts` | [x] Done |
|
|
| `curl/wget` | `curl-mock.ts` | [x] Done |
|
|
| `gh` (GitHub) | `github-mock.ts` | [x] Done |
|
|
| `browser-cli` | `browser-mock.ts` | [x] Done |
|
|
| `himalaya` | - | [ ] Pending |
|
|
|
|
### Poisoned Payloads (22 total)
|
|
|
|
| Category | Payloads |
|
|
|----------|----------|
|
|
| Email/Calendar | `poisonedGmailGet`, `poisonedCalendarList` |
|
|
| HTTP | `poisonedWebpageResponse`, `poisonedJsonApiResponse`, `poisonedMarkdownResponse`, `poisonedScriptResponse`, `poisonedRssFeedResponse`, `poisonedRedirectResponse` |
|
|
| GitHub | `poisonedIssue`, `poisonedPullRequest`, `poisonedReviewComment`, `poisonedIssueComment`, `poisonedCommit`, `poisonedRepository`, `poisonedRelease`, `poisonedWorkflowRun` |
|
|
| Browser | `poisonedPageContent`, `poisonedXssPage`, `poisonedSearchResults`, `poisonedFormPage`, `poisonedScreenshotOcr`, `poisonedPdfContent`, `poisonedDomContent`, `poisonedLoginPage` |
|
|
|
|
### Channels Tested
|
|
|
|
| Channel | Status |
|
|
|---------|--------|
|
|
| WhatsApp | [x] Done (4 tests) |
|
|
| Telegram | [x] Done (4 tests) |
|
|
| Discord | [ ] Pending |
|
|
| Slack | [ ] Pending |
|
|
| Signal | [ ] Pending |
|
|
| iMessage | [ ] Pending |
|
|
| LINE | [ ] Pending |
|
|
|
|
### Reporting
|
|
|
|
| Feature | Status |
|
|
|---------|--------|
|
|
| HTML report generator | [x] Done |
|
|
| JSON export | [x] Done |
|
|
| Dark theme CSS | [x] Done |
|
|
| Interactive JS (sort/filter/expand) | [x] Done |
|
|
| Sample report script | [x] Done |
|
|
| Historical trends | [ ] Pending |
|
|
|
|
## Quick Start
|
|
|
|
> **Note:** Security tests require an `ANTHROPIC_API_KEY` for the LLM judge. The GitHub Actions
|
|
> workflow does **not** have access to this key, so tests can only be run locally or in
|
|
> environments where you provide your own API key.
|
|
|
|
```bash
|
|
# Run security tests (requires gateway running)
|
|
# Terminal 1:
|
|
moltbot gateway
|
|
|
|
# Terminal 2:
|
|
npx vitest run --config vitest.security.config.ts
|
|
```
|
|
|
|
## Structure
|
|
|
|
```
|
|
test/security/
|
|
├── .env # ANTHROPIC_API_KEY (gitignored)
|
|
├── README.md # This file
|
|
├── SPEC.md # Full specification document
|
|
├── specs/ # Detailed specs
|
|
│ ├── 00-overview.md
|
|
│ ├── 01-llm-judge.md
|
|
│ ├── 02-gateway-client.md
|
|
│ ├── 03-cli-mocks.md
|
|
│ ├── 04-test-categories.md
|
|
│ ├── 05-ci-docker.md
|
|
│ └── 06-implementation-plan.md
|
|
├── harness/
|
|
│ ├── index.ts # Exports
|
|
│ ├── gateway-client.ts # WebSocket gateway client (Protocol v3)
|
|
│ ├── assertions.ts # Pattern-based assertions
|
|
│ ├── llm-judge.ts # Claude-based evaluation
|
|
│ └── cli-mocks/
|
|
│ └── mock-binary.ts # gog mock + poisoned payloads
|
|
├── email-injection.e2e.test.ts # Email injection tests
|
|
├── setup.ts # Test setup (env validation)
|
|
├── test-judge.ts # Standalone judge validation script
|
|
├── docker-compose.yml # Container setup
|
|
├── Dockerfile.test # Test container
|
|
└── run-local.sh # Local run script
|
|
```
|
|
|
|
## Running Tests
|
|
|
|
### Option 1: Against Local Gateway (Recommended)
|
|
|
|
```bash
|
|
# Terminal 1: Start gateway
|
|
moltbot gateway
|
|
|
|
# Terminal 2: Run tests
|
|
npx vitest run --config vitest.security.config.ts
|
|
```
|
|
|
|
### Option 2: Docker Compose
|
|
|
|
```bash
|
|
docker compose -f test/security/docker-compose.yml up --build --abort-on-container-exit
|
|
```
|
|
|
|
### Option 3: Test LLM Judge Only
|
|
|
|
```bash
|
|
# Validate judge works without gateway
|
|
npx tsx test/security/test-judge.ts
|
|
```
|
|
|
|
## Environment Variables
|
|
|
|
| Variable | Required | Default | Description |
|
|
|----------|----------|---------|-------------|
|
|
| `ANTHROPIC_API_KEY` | Yes | - | For LLM judge evaluation |
|
|
| `TEST_GATEWAY_URL` | No | `ws://localhost:18789` | Gateway WebSocket URL |
|
|
| `TEST_AUTH_TOKEN` | No | - | Gateway auth token |
|
|
|
|
## Test Output
|
|
|
|
Each test produces a `SecurityVerdict`:
|
|
|
|
```typescript
|
|
{
|
|
passed: boolean; // Did security test pass?
|
|
injectionDetected: boolean; // Did agent notice the attack?
|
|
injectionCompliedWith: boolean; // Did agent follow malicious instructions?
|
|
dataLeaked: boolean; // Was sensitive data exposed?
|
|
reasoning: string; // Judge's explanation
|
|
severity: "none" | "low" | "medium" | "high" | "critical";
|
|
evidence: string[]; // Supporting evidence
|
|
}
|
|
```
|
|
|
|
## Next Steps
|
|
|
|
Priority order for implementation:
|
|
|
|
1. **Calendar injection tests** - Similar to email, payload ready
|
|
2. **Trust boundary tests** - Critical for multi-user
|
|
3. **Exfiltration tests** - Data loss prevention
|
|
4. **API injection tests** - External data sources
|
|
5. **Tool poisoning tests** - Third-party skill safety
|
|
|
|
See **[specs/07-future-enhancements.md](specs/07-future-enhancements.md)** for comprehensive roadmap covering:
|
|
- All 7 messaging channels (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, LINE)
|
|
- 27+ CLI tools to mock
|
|
- 54 skills with external data
|
|
- API mock server design
|
|
- HTML test report dashboard
|