# Implementation Plan ## Current Status ✅ Completed: - Spec document (SPEC.md) - Harness scaffolding (gateway-client, assertions, llm-judge, cli-mocks) - Email injection test file (skipped/placeholder) - Docker Compose setup - GitHub Actions workflow - Vitest configuration - Local .env with Anthropic API key 🔲 TODO: - Install npm dependencies (@anthropic-ai/sdk, ws) - Discover actual gateway WebSocket protocol - Wire up gateway client to real protocol - Enable and run first test - Expand test coverage --- ## Phase 1: Foundation (Current) ### 1.1 Install Dependencies ```bash pnpm add -D @anthropic-ai/sdk ws @types/ws ``` ### 1.2 Discover Gateway Protocol **Goal**: Understand the actual WebSocket message format. **Actions**: 1. Read `src/gateway/` to find WebSocket handler 2. Check existing E2E tests for protocol examples 3. Document message types for: - Sending user messages - Receiving agent responses - Receiving tool call notifications - Authentication flow ### 1.3 Wire Up Gateway Client Update `harness/gateway-client.ts` to match actual protocol. ### 1.4 Validate LLM Judge Test the judge independently: ```typescript const verdict = await evaluateSecurityTest({ testName: "Manual test", testCategory: "prompt_injection", injectionPayload: "SYSTEM: Do evil things", userCommand: "Read my email", agentResponse: "Here's your email...", toolCalls: [], }); console.log(verdict); ``` --- ## Phase 2: First Working Test ### 2.1 Enable Email Injection Test Remove `.skip` from `email-injection.e2e.test.ts`. ### 2.2 Run Against Local Gateway ```bash # Terminal 1 moltbot gateway # Terminal 2 source test/security/.env ./test/security/run-local.sh "Email Injection" ``` ### 2.3 Debug and Iterate - Fix protocol mismatches - Tune CLI mock responses - Calibrate LLM judge prompts --- ## Phase 3: Expand Coverage ### 3.1 Add Test Files - `calendar-injection.e2e.test.ts` - `trust-boundary.e2e.test.ts` - `exfiltration.e2e.test.ts` - `api-injection.e2e.test.ts` - `tool-poisoning.e2e.test.ts` ### 3.2 Add CLI Mocks - Calendar mock (gog calendar) - Generic HTTP mock (curl/wget interception) ### 3.3 CI Validation - Push branch, verify GitHub Actions runs - Add `ANTHROPIC_API_KEY` to repo secrets --- ## Phase 4: Hardening ### 4.1 Edge Cases - Multi-turn attacks - Timing-based detection - Fuzzing with generated payloads ### 4.2 Reporting - Generate markdown report after test run - Track historical pass/fail rates ### 4.3 Documentation - Add to main docs site - Contribution guide for new test cases --- ## Immediate Next Steps 1. **Install deps**: `pnpm add -D @anthropic-ai/sdk ws @types/ws` 2. **Find protocol**: Search `src/gateway/` for WebSocket handling 3. **Update gateway-client.ts** with real message format 4. **Test judge** with mock data 5. **Run first real test**