Commit Graph

6 Commits

Author SHA1 Message Date
Jai Govindani
eecb60346c
fix: resolve lint errors in security harness 2026-01-29 18:00:53 +07:00
Jai Govindani
711ec63ca5
test: untrack local security env 2026-01-29 11:23:20 +07:00
Jai Govindani
10ecffac2b
test: update security harness fixtures 2026-01-29 11:17:46 +07:00
Jai Govindani
822504b56e
test: harden security cli mocks 2026-01-29 11:15:03 +07:00
Jai Govindani
7d2f4b3fbf
docs(security): add detailed specs for test harness implementation
Break down SPEC.md into actionable specification documents:
- 00-overview: Architecture and test flow
- 01-llm-judge: Claude evaluation interface and requirements
- 02-gateway-client: WebSocket protocol (needs discovery)
- 03-cli-mocks: PATH interception strategy and payloads
- 04-test-categories: All attack vectors with test cases
- 05-ci-docker: Container and CI configuration
- 06-implementation-plan: Phased rollout with next steps
2026-01-29 08:58:56 +07:00
Jai Govindani
c5ce8cacbf
feat(security): add E2E security test harness with LLM judge
Add comprehensive security acceptance testing framework that validates
Moltbot's resistance to prompt injection, data exfiltration, and trust
boundary violations.

Key components:
- LLM-as-judge pattern using Claude to evaluate attack resistance
- WebSocket gateway client for direct protocol testing
- CLI mocking utilities for injecting poisoned external data
- Docker Compose setup for containerized CI execution
- GitHub Actions workflow with daily scheduled runs

Test categories covered:
- Email/calendar prompt injection via external data
- Trust boundary violations and auth bypass attempts
- Data exfiltration prevention
- Tool output poisoning
2026-01-29 08:52:59 +07:00