CI & Docker Specification
Purpose
Run security tests in isolated containers for CI/CD and local development.
Docker Compose Setup
Services
1. Gateway (System Under Test)
gateway:
  build: ../..  # Main Dockerfile
  environment:
    CLAWDBOT_AUTH_TOKEN: ${TEST_AUTH_TOKEN}
    ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
    CLAWDBOT_CHANNELS_DISABLED: "true"  # No real channels
  ports:
    - "18789:18789"
  healthcheck:
    test: curl -f http://localhost:18789/health
2. Test Runner
test-runner:
  build:
    context: ../..  # Assumed: repo root, matching the gateway build
    dockerfile: test/security/Dockerfile.test
  environment:
    TEST_GATEWAY_URL: ws://gateway:18789
    TEST_AUTH_TOKEN: ${TEST_AUTH_TOKEN}
    ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
  depends_on:
    gateway:
      condition: service_healthy  # Wait for the gateway health check to pass
Network
- Isolated bridge network security-test
- Services communicate via container names
Volumes
- test-results volume for JSON output
GitHub Actions Workflow
Triggers
- Push to main
- PR to main
- Daily schedule (midnight UTC)
- Manual dispatch with test pattern input
Steps
- Checkout with submodules
- Set up Docker Buildx
- Run docker compose up --build --abort-on-container-exit
- Extract test results from container
- Upload results as artifact
- Generate summary in $GITHUB_STEP_SUMMARY
- Clean up containers
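The summary step could be a small script that turns the results JSON into Markdown. The following is a minimal sketch, not the workflow's actual implementation: the helper name `renderSummary` and the layout of the summary are assumptions, and the input type covers only the fields of the JSON Output Schema that the summary uses.

```typescript
// Only the fields of the results JSON that the summary needs.
interface SummaryInput {
  numTotalTests: number;
  numPassedTests: number;
  numFailedTests: number;
  testResults: { name: string; status: "passed" | "failed"; duration: number }[];
}

// Hypothetical helper: build the Markdown to append to $GITHUB_STEP_SUMMARY.
function renderSummary(results: SummaryInput): string {
  const lines: string[] = [
    "## Security test results",
    "",
    `Total: ${results.numTotalTests}, passed: ${results.numPassedTests}, failed: ${results.numFailedTests}`,
  ];
  const failed = results.testResults.filter((t) => t.status === "failed");
  if (failed.length > 0) {
    lines.push("", "| Failed test | Duration |", "|---|---|");
    for (const t of failed) {
      // Escape pipes so a test name cannot break the table row.
      lines.push(`| ${t.name.replace(/\|/g, "\\|")} | ${t.duration} |`);
    }
  }
  return lines.join("\n") + "\n";
}
```

In a workflow step, the returned string would be appended to the file whose path GitHub Actions provides in the GITHUB_STEP_SUMMARY environment variable.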
Required Secrets
- ANTHROPIC_API_KEY - For LLM judge
Failure Handling
- security-gate job blocks release on failure
- Results uploaded even on failure
- 30-minute timeout
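One way to express the gate decision is a function that maps the result counts to a process exit code. This is a sketch under stated assumptions: `gateExitCode` is a hypothetical name, and failing a run that executed zero tests is an added safeguard that this spec does not require.

```typescript
// Minimal subset of the results JSON needed for the gate decision.
interface ResultCounts {
  numTotalTests: number;
  numFailedTests: number;
}

// Hypothetical helper: exit code for the security-gate job (0 = release may proceed).
function gateExitCode(counts: ResultCounts | undefined): number {
  // Results missing entirely, for example because the run hit the timeout.
  if (counts === undefined) return 1;
  // Assumption (not in the spec): an empty run must not pass the gate silently.
  if (counts.numTotalTests === 0) return 1;
  return counts.numFailedTests > 0 ? 1 : 0;
}
```

Keeping the decision separate from the upload step fits the requirement that results are uploaded even when the gate fails.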
Local Development
run-local.sh Script
#!/usr/bin/env bash
set -euo pipefail

# Auto-detect: local gateway or Docker
if curl -sf http://localhost:18789/health > /dev/null; then
  # Gateway running locally - run tests directly
  pnpm vitest run --config vitest.security.config.ts
else
  # No local gateway - use Docker Compose
  docker compose -f test/security/docker-compose.yml up --build
fi
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| ANTHROPIC_API_KEY | Yes | - | For LLM judge |
| TEST_GATEWAY_URL | No | ws://localhost:18789 | Gateway WebSocket URL |
| TEST_AUTH_TOKEN | No | test-token | Gateway auth |
| TEST_PATTERN | No | - | Grep pattern for specific tests |
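The defaults in the table above could be applied in one place by the test setup. A minimal sketch follows: `loadConfig` and `SecurityTestConfig` are hypothetical names, and the environment is passed in as a plain object (for example `process.env`) so the function is easy to test.

```typescript
interface SecurityTestConfig {
  anthropicApiKey: string;
  gatewayUrl: string;
  authToken: string;
  testPattern: string | undefined;
}

type Env = Record<string, string | undefined>;

// Hypothetical helper: resolve settings using the defaults from the table.
function loadConfig(env: Env): SecurityTestConfig {
  const anthropicApiKey = env["ANTHROPIC_API_KEY"];
  if (!anthropicApiKey) {
    // The only required variable: the LLM judge cannot run without it.
    throw new Error("ANTHROPIC_API_KEY is required for the LLM judge");
  }
  return {
    anthropicApiKey,
    gatewayUrl: env["TEST_GATEWAY_URL"] || "ws://localhost:18789",
    authToken: env["TEST_AUTH_TOKEN"] || "test-token",
    // An empty string is treated the same as unset.
    testPattern: env["TEST_PATTERN"] || undefined,
  };
}
```

Failing fast on a missing key surfaces a misconfigured CI secret before any test starts, instead of as a judge error in the middle of the run.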
Vitest Configuration
File: vitest.security.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    pool: "forks",
    maxWorkers: 2, // Limit for API rate limits
    testTimeout: 120_000, // LLM calls are slow
    include: ["test/security/**/*.e2e.test.ts"],
    setupFiles: ["test/security/setup.ts"],
    bail: 1, // Stop on first failure
  },
});
Test Results
JSON Output Schema
{
  numTotalTests: number;
  numPassedTests: number;
  numFailedTests: number;
  testResults: [{
    name: string;
    status: "passed" | "failed";
    duration: number;
    failureMessages?: string[];
  }];
}
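Consumers of the results file may want to check the parsed JSON against this schema before trusting it. The sketch below shows one way to do that with hand-written type guards. The interface and function names are hypothetical, and the spec does not prescribe any particular validation approach.

```typescript
interface TestCaseResult {
  name: string;
  status: "passed" | "failed";
  duration: number;
  failureMessages?: string[];
}

interface SecurityTestResults {
  numTotalTests: number;
  numPassedTests: number;
  numFailedTests: number;
  testResults: TestCaseResult[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Checks a single entry of testResults.
function isTestCaseResult(v: unknown): v is TestCaseResult {
  if (!isRecord(v)) return false;
  const msgs = v["failureMessages"];
  return (
    typeof v["name"] === "string" &&
    (v["status"] === "passed" || v["status"] === "failed") &&
    typeof v["duration"] === "number" &&
    (msgs === undefined ||
      (Array.isArray(msgs) && msgs.every((m) => typeof m === "string")))
  );
}

// Checks the top-level document, including every test case.
function isSecurityTestResults(v: unknown): v is SecurityTestResults {
  if (!isRecord(v)) return false;
  const cases = v["testResults"];
  return (
    typeof v["numTotalTests"] === "number" &&
    typeof v["numPassedTests"] === "number" &&
    typeof v["numFailedTests"] === "number" &&
    Array.isArray(cases) &&
    cases.every(isTestCaseResult)
  );
}
```

A caller would typically run `JSON.parse` on the file contents and pass the result to `isSecurityTestResults`, treating a `false` return the same as a missing results file.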
Artifact Retention
- 30 days in GitHub Actions
- Includes full JSON results + any screenshots/logs