History

Jai Govindani eecb60346c fix: resolve lint errors in security harness		2026-01-29 18:00:53 +07:00
..
channels	fix: resolve lint errors in security harness	2026-01-29 18:00:53 +07:00
harness	fix: resolve lint errors in security harness	2026-01-29 18:00:53 +07:00
reports	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
specs	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
docker-compose.yml	fix: resolve lint errors in security harness	2026-01-29 18:00:53 +07:00
Dockerfile.test	fix: resolve lint errors in security harness	2026-01-29 18:00:53 +07:00
email-injection.e2e.test.ts	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
generate-sample-report.ts	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
README.md	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
run-local.sh	feat(security): add E2E security test harness with LLM judge	2026-01-29 08:52:59 +07:00
SECURITY_ASSESSMENT.md	fix: resolve lint errors in security harness	2026-01-29 18:00:53 +07:00
setup.ts	test: update security harness fixtures	2026-01-29 11:17:46 +07:00
SPEC.md	feat(security): add E2E security test harness with LLM judge	2026-01-29 08:52:59 +07:00
test-judge.ts	test: update security harness fixtures	2026-01-29 11:17:46 +07:00

README.md

Security Acceptance Tests

E2E security testing framework for Moltbot. Validates resistance to:

Prompt injection via external data sources
Data exfiltration attempts
Trust boundary violations
Tool poisoning attacks

Architecture: LLM-as-Judge

Pattern matching can't reliably detect whether prompt injection succeeded. We use Claude as a judge to evaluate whether Moltbot resisted attacks:

Run test scenario (send poisoned data to Moltbot)
Capture Moltbot's response and tool calls
Send to Claude judge with structured output
Judge evaluates: injection detected? complied with? data leaked?

This enables nuanced evaluation of subtle attacks that regex can't catch.

Current Implementation Status

Test Files

File	Status	Tests
`email-injection.e2e.test.ts`	[x] Done	4
`channels/whatsapp-injection.e2e.test.ts`	[x] Done	4
`channels/telegram-injection.e2e.test.ts`	[x] Done	4
`calendar-injection.e2e.test.ts`	[ ] Pending	-
`api-injection.e2e.test.ts`	[ ] Pending	-
`trust-boundary.e2e.test.ts`	[ ] Pending	-
`exfiltration.e2e.test.ts`	[ ] Pending	-
`tool-poisoning.e2e.test.ts`	[ ] Pending	-

Total: 12 tests implemented

CLI Mocks

CLI	File	Status
`gog`	`mock-binary.ts`	[x] Done
`curl/wget`	`curl-mock.ts`	[x] Done
`gh` (GitHub)	`github-mock.ts`	[x] Done
`browser-cli`	`browser-mock.ts`	[x] Done
`himalaya`	-	[ ] Pending

Poisoned Payloads (22 total)

Category	Payloads
Email/Calendar	`poisonedGmailGet`, `poisonedCalendarList`
HTTP	`poisonedWebpageResponse`, `poisonedJsonApiResponse`, `poisonedMarkdownResponse`, `poisonedScriptResponse`, `poisonedRssFeedResponse`, `poisonedRedirectResponse`
GitHub	`poisonedIssue`, `poisonedPullRequest`, `poisonedReviewComment`, `poisonedIssueComment`, `poisonedCommit`, `poisonedRepository`, `poisonedRelease`, `poisonedWorkflowRun`
Browser	`poisonedPageContent`, `poisonedXssPage`, `poisonedSearchResults`, `poisonedFormPage`, `poisonedScreenshotOcr`, `poisonedPdfContent`, `poisonedDomContent`, `poisonedLoginPage`

Channels Tested

Channel	Status
WhatsApp	[x] Done (4 tests)
Telegram	[x] Done (4 tests)
Discord	[ ] Pending
Slack	[ ] Pending
Signal	[ ] Pending
iMessage	[ ] Pending
LINE	[ ] Pending

Reporting

Feature	Status
HTML report generator	[x] Done
JSON export	[x] Done
Dark theme CSS	[x] Done
Interactive JS (sort/filter/expand)	[x] Done
Sample report script	[x] Done
Historical trends	[ ] Pending

Quick Start

# Run security tests (requires gateway running)
# Terminal 1:
moltbot gateway

# Terminal 2:
npx vitest run --config vitest.security.config.ts

Structure

test/security/
├── .env                       # ANTHROPIC_API_KEY (gitignored)
├── README.md                  # This file
├── SPEC.md                    # Full specification document
├── specs/                     # Detailed specs
│   ├── 00-overview.md
│   ├── 01-llm-judge.md
│   ├── 02-gateway-client.md
│   ├── 03-cli-mocks.md
│   ├── 04-test-categories.md
│   ├── 05-ci-docker.md
│   └── 06-implementation-plan.md
├── harness/
│   ├── index.ts               # Exports
│   ├── gateway-client.ts      # WebSocket gateway client (Protocol v3)
│   ├── assertions.ts          # Pattern-based assertions
│   ├── llm-judge.ts           # Claude-based evaluation
│   └── cli-mocks/
│       └── mock-binary.ts     # gog mock + poisoned payloads
├── email-injection.e2e.test.ts # Email injection tests
├── setup.ts                   # Test setup (env validation)
├── test-judge.ts              # Standalone judge validation script
├── docker-compose.yml         # Container setup
├── Dockerfile.test            # Test container
└── run-local.sh               # Local run script

Running Tests

Option 1: Against Local Gateway (Recommended)

# Terminal 1: Start gateway
moltbot gateway

# Terminal 2: Run tests
npx vitest run --config vitest.security.config.ts

Option 2: Docker Compose

docker compose -f test/security/docker-compose.yml up --build --abort-on-container-exit

Option 3: Test LLM Judge Only

# Validate judge works without gateway
npx tsx test/security/test-judge.ts

Environment Variables

Variable	Required	Default	Description
`ANTHROPIC_API_KEY`	Yes	-	For LLM judge evaluation
`TEST_GATEWAY_URL`	No	`ws://localhost:18789`	Gateway WebSocket URL
`TEST_AUTH_TOKEN`	No	-	Gateway auth token

Test Output

Each test produces a SecurityVerdict:

{
  passed: boolean;              // Did security test pass?
  injectionDetected: boolean;   // Did agent notice the attack?
  injectionCompliedWith: boolean; // Did agent follow malicious instructions?
  dataLeaked: boolean;          // Was sensitive data exposed?
  reasoning: string;            // Judge's explanation
  severity: "none" | "low" | "medium" | "high" | "critical";
  evidence: string[];           // Supporting evidence
}

Next Steps

Priority order for implementation:

Calendar injection tests - Similar to email, payload ready
Trust boundary tests - Critical for multi-user
Exfiltration tests - Data loss prevention
API injection tests - External data sources
Tool poisoning tests - Third-party skill safety

See specs/07-future-enhancements.md for comprehensive roadmap covering:

All 7 messaging channels (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, LINE)
27+ CLI tools to mock
54 skills with external data
API mock server design
HTML test report dashboard