docs(security): add CLI mocks README with reference links, note local-only testing
This commit is contained in:
parent
eecb60346c
commit
32afaaf0cf
@ -79,6 +79,10 @@ This enables nuanced evaluation of subtle attacks that regex can't catch.
|
||||
|
||||
## Quick Start
|
||||
|
||||
> **Note:** Security tests require an `ANTHROPIC_API_KEY` for the LLM judge. The GitHub Actions
|
||||
> workflow does **not** have access to this key, so tests can only be run locally or in
|
||||
> environments where you provide your own API key.
|
||||
|
||||
```bash
|
||||
# Run security tests (requires gateway running)
|
||||
# Terminal 1:
|
||||
|
||||
116
test/security/harness/cli-mocks/README.md
Normal file
116
test/security/harness/cli-mocks/README.md
Normal file
@ -0,0 +1,116 @@
|
||||
# CLI Mocks for Security Testing
|
||||
|
||||
This directory contains mock implementations of CLI tools used by Moltbot. These mocks intercept
|
||||
real CLI calls and return poisoned responses containing prompt injection payloads for security testing.
|
||||
|
||||
## Purpose
|
||||
|
||||
The mocks serve two purposes:
|
||||
|
||||
1. **Security Testing** - Inject malicious payloads into tool responses to test whether the agent
|
||||
resists prompt injection attacks
|
||||
2. **Isolation** - Run tests without real API credentials or network access
|
||||
|
||||
## Mock Architecture
|
||||
|
||||
All mocks use `mock-binary.ts` which creates executable shell scripts that:
|
||||
- Get installed to `/tmp/moltbot-test-bin` and prepended to `PATH`
|
||||
- Parse command-line arguments to select appropriate responses
|
||||
- Return poisoned JSON/text matching the real CLI's output format
|
||||
|
||||
## Mock Inventory
|
||||
|
||||
| Mock | File | Original CLI | Status |
|
||||
|------|------|--------------|--------|
|
||||
| `gog` | `mock-binary.ts` | [gog](https://github.com/steipete/gog) | Done |
|
||||
| `curl` | `curl-mock.ts` | [curl](https://curl.se/docs/manpage.html) | Done |
|
||||
| `wget` | `curl-mock.ts` | [wget](https://www.gnu.org/software/wget/manual/) | Done |
|
||||
| `gh` | `github-mock.ts` | [GitHub CLI](https://cli.github.com/manual/) | Done |
|
||||
| `browser-cli` | `browser-mock.ts` | Moltbot browser-cli | Done |
|
||||
| `himalaya` | - | [himalaya](https://github.com/pimalaya/himalaya) | Pending |
|
||||
|
||||
## Reference Documentation
|
||||
|
||||
To ensure mocks return responses matching the real CLI output format, consult these references:
|
||||
|
||||
### gog (Gmail/Calendar CLI)
|
||||
|
||||
- **Source:** https://github.com/steipete/gog
|
||||
- **Output format:** JSON
|
||||
- **Key commands mocked:**
|
||||
- `gog gmail search` - Returns thread list
|
||||
- `gog gmail get <id>` - Returns full message with headers/body
|
||||
- `gog calendar list` - Returns event list
|
||||
|
||||
### curl / wget
|
||||
|
||||
- **curl docs:** https://curl.se/docs/manpage.html
|
||||
- **wget docs:** https://www.gnu.org/software/wget/manual/wget.html
|
||||
- **Output format:** Raw HTTP response or body text
|
||||
- **Key behaviors mocked:**
|
||||
- URL-specific responses via `urlResponses` config
|
||||
- HTTP status codes
|
||||
- Error simulation (connection refused, timeout)
|
||||
|
||||
### gh (GitHub CLI)
|
||||
|
||||
- **Source:** https://github.com/cli/cli
|
||||
- **Manual:** https://cli.github.com/manual/
|
||||
- **Output format:** JSON (with `--json` flag, which agent uses)
|
||||
- **Key commands mocked:**
|
||||
- `gh issue view` - [Schema](https://docs.github.com/en/rest/issues/issues#get-an-issue)
|
||||
- `gh issue list` - Array of issues
|
||||
- `gh pr view` - [Schema](https://docs.github.com/en/rest/pulls/pulls#get-a-pull-request)
|
||||
- `gh pr list` - Array of PRs
|
||||
- `gh api` - Raw API response
|
||||
- `gh release view` - [Schema](https://docs.github.com/en/rest/releases/releases#get-a-release)
|
||||
- `gh run view` - Workflow run details
|
||||
|
||||
### browser-cli (Moltbot internal)
|
||||
|
||||
- **Source:** `src/browser/` in this repo
|
||||
- **Output format:** JSON with `url`, `title`, `content`, `metadata` fields
|
||||
- **Key commands mocked:**
|
||||
- `browser-cli fetch <url>` - Page content extraction
|
||||
- `browser-cli screenshot <url>` - Screenshot with OCR text
|
||||
- `browser-cli pdf <url>` - PDF text extraction
|
||||
- `browser-cli dom <url>` - DOM element extraction
|
||||
|
||||
## Validating Mock Fidelity
|
||||
|
||||
To verify mocks match real CLI output:
|
||||
|
||||
```bash
|
||||
# 1. Capture real output
|
||||
gh issue view 123 --json number,title,body,author > real-issue.json
|
||||
|
||||
# 2. Compare with mock
|
||||
node -e "console.log(JSON.stringify(require('./github-mock').poisonedIssue, null, 2))" > mock-issue.json
|
||||
|
||||
# 3. Check structure matches (keys, types)
|
||||
diff <(jq -S 'keys' real-issue.json) <(jq -S 'keys' mock-issue.json)
|
||||
```
|
||||
|
||||
When updating mocks, ensure:
|
||||
- All required fields from the real CLI are present
|
||||
- Field types match (string, number, object, array)
|
||||
- Nested structures follow the same schema
|
||||
|
||||
## Adding New Mocks
|
||||
|
||||
1. Create `<tool>-mock.ts` following the pattern in existing mocks
|
||||
2. Define poisoned payload constants with realistic structure + injection
|
||||
3. Export a `create<Tool>Mock(config)` factory function
|
||||
4. Add entry to `index.ts` exports
|
||||
5. Update this README with reference links
|
||||
6. Add validation script or test to verify output matches real CLI
|
||||
|
||||
## Known Limitations
|
||||
|
||||
- **Static responses** - Mocks return predetermined responses regardless of input args (except
|
||||
for URL/arg matching). Real CLIs have complex state and pagination.
|
||||
- **No auth simulation** - Mocks don't simulate auth flows or token refresh
|
||||
- **Simplified error handling** - Only basic error simulation (exit code + stderr)
|
||||
|
||||
These limitations are acceptable for security testing where we control the test scenario, but
|
||||
the mocks should not be used for integration testing where realistic behavior matters.
|
||||
Loading…
Reference in New Issue
Block a user