docs(security): add CLI mocks README with reference links, note local-only testing

2026-01-29 18:10:59 +07:00 · 2026-01-29 18:10:59 +07:00 · 32afaaf0cf
commit 32afaaf0cf
parent eecb60346c
2 changed files with 120 additions and 0 deletions
--- a/test/security/README.md
+++ b/test/security/README.md
@ -79,6 +79,10 @@ This enables nuanced evaluation of subtle attacks that regex can't catch.
 ## Quick Start
 > **Note:** Security tests require an `ANTHROPIC_API_KEY` for the LLM judge. The GitHub Actions
 > workflow does **not** have access to this key, so tests can only be run locally or in
 > environments where you provide your own API key.
 ```bash
 # Run security tests (requires gateway running)
 # Terminal 1:
--- a/test/security/harness/cli-mocks/README.md
+++ b/test/security/harness/cli-mocks/README.md
@ -0,0 +1,116 @@
 # CLI Mocks for Security Testing
 This directory contains mock implementations of CLI tools used by Moltbot. These mocks intercept
 real CLI calls and return poisoned responses containing prompt injection payloads for security testing.
 ## Purpose
 The mocks serve two purposes:
 1. **Security Testing** - Inject malicious payloads into tool responses to test whether the agent
   resists prompt injection attacks
 2. **Isolation** - Run tests without real API credentials or network access
 ## Mock Architecture
 All mocks use `mock-binary.ts` which creates executable shell scripts that:
 - Get installed to `/tmp/moltbot-test-bin` and prepended to `PATH`
 - Parse command-line arguments to select appropriate responses
 - Return poisoned JSON/text matching the real CLI's output format
 ## Mock Inventory
 | Mock | File | Original CLI | Status |
 |------|------|--------------|--------|
 | `gog` | `mock-binary.ts` | [gog](https://github.com/steipete/gog) | Done |
 | `curl` | `curl-mock.ts` | [curl](https://curl.se/docs/manpage.html) | Done |
 | `wget` | `curl-mock.ts` | [wget](https://www.gnu.org/software/wget/manual/) | Done |
 | `gh` | `github-mock.ts` | [GitHub CLI](https://cli.github.com/manual/) | Done |
 | `browser-cli` | `browser-mock.ts` | Moltbot browser-cli | Done |
 | `himalaya` | - | [himalaya](https://github.com/pimalaya/himalaya) | Pending |
 ## Reference Documentation
 To ensure mocks return responses matching the real CLI output format, consult these references:
 ### gog (Gmail/Calendar CLI)
 - **Source:** https://github.com/steipete/gog
 - **Output format:** JSON
 - **Key commands mocked:**
  - `gog gmail search` - Returns thread list
  - `gog gmail get <id>` - Returns full message with headers/body
  - `gog calendar list` - Returns event list
 ### curl / wget
 - **curl docs:** https://curl.se/docs/manpage.html
 - **wget docs:** https://www.gnu.org/software/wget/manual/wget.html
 - **Output format:** Raw HTTP response or body text
 - **Key behaviors mocked:**
  - URL-specific responses via `urlResponses` config
  - HTTP status codes
  - Error simulation (connection refused, timeout)
 ### gh (GitHub CLI)
 - **Source:** https://github.com/cli/cli
 - **Manual:** https://cli.github.com/manual/
 - **Output format:** JSON (with `--json` flag, which agent uses)
 - **Key commands mocked:**
  - `gh issue view` - [Schema](https://docs.github.com/en/rest/issues/issues#get-an-issue)
  - `gh issue list` - Array of issues
  - `gh pr view` - [Schema](https://docs.github.com/en/rest/pulls/pulls#get-a-pull-request)
  - `gh pr list` - Array of PRs
  - `gh api` - Raw API response
  - `gh release view` - [Schema](https://docs.github.com/en/rest/releases/releases#get-a-release)
  - `gh run view` - Workflow run details
 ### browser-cli (Moltbot internal)
 - **Source:** `src/browser/` in this repo
 - **Output format:** JSON with `url`, `title`, `content`, `metadata` fields
 - **Key commands mocked:**
  - `browser-cli fetch <url>` - Page content extraction
  - `browser-cli screenshot <url>` - Screenshot with OCR text
  - `browser-cli pdf <url>` - PDF text extraction
  - `browser-cli dom <url>` - DOM element extraction
 ## Validating Mock Fidelity
 To verify mocks match real CLI output:
 ```bash
 # 1. Capture real output
 gh issue view 123 --json number,title,body,author > real-issue.json
 # 2. Compare with mock
 node -e "console.log(JSON.stringify(require('./github-mock').poisonedIssue, null, 2))" > mock-issue.json
 # 3. Check structure matches (keys, types)
 diff <(jq -S 'keys' real-issue.json) <(jq -S 'keys' mock-issue.json)
 ```
 When updating mocks, ensure:
 - All required fields from the real CLI are present
 - Field types match (string, number, object, array)
 - Nested structures follow the same schema
 ## Adding New Mocks
 1. Create `<tool>-mock.ts` following the pattern in existing mocks
 2. Define poisoned payload constants with realistic structure + injection
 3. Export a `create<Tool>Mock(config)` factory function
 4. Add entry to `index.ts` exports
 5. Update this README with reference links
 6. Add validation script or test to verify output matches real CLI
 ## Known Limitations
 - **Static responses** - Mocks return predetermined responses regardless of input args (except
  for URL/arg matching). Real CLIs have complex state and pagination.
 - **No auth simulation** - Mocks don't simulate auth flows or token refresh
 - **Simplified error handling** - Only basic error simulation (exit code + stderr)
 These limitations are acceptable for security testing where we control the test scenario, but
 the mocks should not be used for integration testing where realistic behavior matters.