openclaw/docs/security/guardrails.md
Sentinel Team 10ed53b6d8 docs(security): add guardrails documentation with @sentinelseed/moltbot
Add comprehensive documentation for integrating security guardrails
using the @sentinelseed/moltbot package:

- Input validation for prompt injection protection
- Tool call validation for dangerous command blocking
- Output validation for credential leak prevention
- Hook integration examples
- Configuration options
2026-01-28 10:45:38 -03:00

title: Guardrails
summary: Input/output validation and tool call security with @sentinelseed/moltbot.
permalink: /security/guardrails/

Guardrails

The @sentinelseed/moltbot package provides security guardrails for Moltbot, including prompt injection detection, tool call validation, and credential leak prevention.

npm install @sentinelseed/moltbot

Usage

The package exposes three main functions: validateInput, validateToolCall, and validateOutput. Each returns a result object with a blocked boolean and, when blocked, a reason string explaining why.
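The result shape described above can be modeled as a discriminated union. This is a minimal sketch inferred from that description: the names `ValidationResult` and `describeResult` are hypothetical, and the package's exported types may differ.

```typescript
// Hypothetical shape inferred from the prose; not the package's own type.
type ValidationResult =
  | { blocked: false }
  | { blocked: true; reason: string };

// Turn a result into a short log line. Narrowing on `blocked` makes
// `reason` available only in the blocked branch.
function describeResult(result: ValidationResult): string {
  return result.blocked ? `blocked: ${result.reason}` : "allowed";
}
```

Modeling `reason` as present only when `blocked` is true lets the compiler reject code that reads `reason` from an allowed result.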

Input validation checks user messages before they reach the agent. It detects prompt injection attempts, jailbreak patterns, and encoded payloads (base64, hex).

import { validateInput } from '@sentinelseed/moltbot';

const result = await validateInput(userMessage);
if (result.blocked) {
  // handle blocked input
}
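To make the idea of an "encoded payload" concrete, here is a rough sketch of one way a hex-encoded injection attempt could be caught: decode any long hex runs, then apply the same phrase check to the decoded text. It is an illustration only, not the package's implementation; `SUSPICIOUS_PHRASES`, `decodeHex`, and `looksLikeInjection` are hypothetical names, and the single phrase pattern stands in for a much larger rule set.

```typescript
// Illustrative phrase list; a real detector would use far more rules.
const SUSPICIOUS_PHRASES: RegExp[] = [/ignore (all )?previous instructions/i];

// Decode a string of hex digit pairs into text. Returns null when the
// input is not an even-length hex string.
function decodeHex(candidate: string): string | null {
  if (candidate.length % 2 !== 0 || !/^[0-9a-fA-F]+$/.test(candidate)) {
    return null;
  }
  let decoded = "";
  for (let i = 0; i < candidate.length; i += 2) {
    decoded += String.fromCharCode(parseInt(candidate.slice(i, i + 2), 16));
  }
  return decoded;
}

// Check the raw message plus the decoded form of any hex run of 16+ digits.
function looksLikeInjection(message: string): boolean {
  const views: string[] = [message];
  const hexRuns = message.match(/[0-9a-fA-F]{16,}/g) ?? [];
  for (const run of hexRuns) {
    const decoded = decodeHex(run);
    if (decoded !== null) {
      views.push(decoded);
    }
  }
  return views.some((text) => SUSPICIOUS_PHRASES.some((p) => p.test(text)));
}
```

The same decode-then-recheck approach would apply to base64, with a base64 decoder in place of `decodeHex`.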

Tool call validation inspects tool invocations before execution. It blocks dangerous shell commands (rm -rf, format), SQL injection patterns, path traversal attempts, and command injection via shell metacharacters.

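import { validateToolCall } from '@sentinelseed/moltbot';

// Validate the invocation before handing it to the tool runner.
const result = await validateToolCall({
  name: 'shell',
  arguments: { command: 'rm -rf /' }
});
if (result.blocked) {
  // skip execution; result.reason explains why
}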
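The categories listed above can be pictured as a small table of rules applied to each string argument. The sketch below is a simplified stand-in, not the package's rule set: the `ToolCall` and `ToolCheck` types, the `RULES` table, and `checkToolCall` are all hypothetical, and each category is reduced to a single example pattern.

```typescript
// Hypothetical shapes for illustration; not the package's actual types.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

type ToolCheck = { blocked: false } | { blocked: true; reason: string };

// One example rule per category; checked in order, first match wins.
const RULES: Array<{ reason: string; pattern: RegExp }> = [
  { reason: "destructive shell command", pattern: /\brm\s+-rf\b/ },
  { reason: "SQL injection pattern", pattern: /\bDROP\s+TABLE\b/i },
  { reason: "path traversal", pattern: /\.\.[\/\\]/ },
  { reason: "shell metacharacter", pattern: /[;&|`]|\$\(/ },
];

// Scan every string-valued argument against the rule table.
function checkToolCall(call: ToolCall): ToolCheck {
  for (const key of Object.keys(call.arguments)) {
    const value = call.arguments[key];
    if (typeof value !== "string") {
      continue;
    }
    for (const rule of RULES) {
      if (rule.pattern.test(value)) {
        return { blocked: true, reason: rule.reason };
      }
    }
  }
  return { blocked: false };
}
```

Pattern matching of this kind is easy to reason about but also easy to evade or over-trigger, so a rule table like this is best treated as one layer among several.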

Output validation scans agent responses for leaked credentials. It catches API keys (OpenAI, GitHub, AWS), passwords, private keys (SSH, PGP), and tokens (JWT, bearer).

import { validateOutput } from '@sentinelseed/moltbot';

const result = await validateOutput(aiResponse);
if (result.blocked) {
  // withhold the response; result.reason explains why
}
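As an illustration of what credential scanning involves, the following sketch matches a response against a few key formats and replaces any hits with a placeholder. It is not the package's detector: `CREDENTIAL_PATTERNS` and `redactCredentials` are hypothetical, the pattern list is deliberately short, and redaction is only one possible way to handle a hit.

```typescript
// Example patterns only; a real scanner covers many more formats.
const CREDENTIAL_PATTERNS: Array<{ kind: string; pattern: RegExp }> = [
  { kind: "OpenAI-style key", pattern: /sk-[a-zA-Z0-9]{48}/g },
  { kind: "GitHub token", pattern: /ghp_[a-zA-Z0-9]{36}/g },
  { kind: "private key block", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/g },
];

// Replace every match with a placeholder and report which kinds were found.
function redactCredentials(text: string): { redacted: string; kinds: string[] } {
  const kinds: string[] = [];
  let redacted = text;
  for (const { kind, pattern } of CREDENTIAL_PATTERNS) {
    const replaced = redacted.replace(pattern, "[REDACTED]");
    if (replaced !== redacted) {
      kinds.push(kind);
    }
    redacted = replaced;
  }
  return { redacted, kinds };
}
```

Returning the matched kinds alongside the redacted text makes it possible to log what was caught without logging the secret itself.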

Hook integration

The recommended approach is to wire validation into Moltbot's hook system. The example below validates both incoming messages and tool calls before they execute.

// hooks/sentinel-guard/handler.ts
import { validateInput, validateToolCall } from '@sentinelseed/moltbot';

export default {
  'message:before': async (ctx) => {
    const result = await validateInput(ctx.message.text);
    if (result.blocked) {
      return { abort: true, reason: result.reason };
    }
  },

  'tool:before': async (ctx) => {
    const result = await validateToolCall(ctx.tool);
    if (result.blocked) {
      return { abort: true, reason: result.reason };
    }
  }
};
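One way to keep a hook like this testable is to build it around an injected validator, so tests can pass a stub and production code can pass `validateInput`. The sketch below assumes the context and return shapes shown in the handler above; `Validator`, `MessageContext`, and `makeMessageGuard` are hypothetical names, not part of Moltbot or the package.

```typescript
// Hypothetical types inferred from the handler above.
type GuardResult = { blocked: boolean; reason?: string };
type Validator<T> = (value: T) => Promise<GuardResult>;

interface MessageContext {
  message: { text: string };
}

// Build a 'message:before' style handler around any validator.
function makeMessageGuard(validate: Validator<string>) {
  return async (ctx: MessageContext) => {
    const result = await validate(ctx.message.text);
    if (result.blocked) {
      return { abort: true, reason: result.reason };
    }
    return undefined; // let the message through
  };
}

// In a handler file this might be wired up as:
//   export default { 'message:before': makeMessageGuard(validateInput) };
```

The same factory pattern would apply to the tool hook, with a context type that exposes the tool call instead of the message text.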

Configuration

You can customize validation behavior by creating a sentinel.config.ts in your workspace. The config accepts custom patterns for dangerous commands and credentials.

import { defineConfig } from '@sentinelseed/moltbot';

export default defineConfig({
  blockDangerousCommands: true,
  dangerousPatterns: [/rm\s+-rf/, /DROP\s+TABLE/i],
  credentialPatterns: [/sk-[a-zA-Z0-9]{48}/, /ghp_[a-zA-Z0-9]{36}/],
});
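Custom patterns are ordinary regular expressions, so it helps to check what they do and do not match before relying on them. The snippet below exercises the four patterns from the config above in isolation; `matchesAny` and `samples` are hypothetical names, and nothing here calls the package.

```typescript
const dangerousPatterns: RegExp[] = [/rm\s+-rf/, /DROP\s+TABLE/i];
const credentialPatterns: RegExp[] = [/sk-[a-zA-Z0-9]{48}/, /ghp_[a-zA-Z0-9]{36}/];

// True when at least one pattern matches somewhere in the text.
function matchesAny(patterns: RegExp[], text: string): boolean {
  return patterns.some((p) => p.test(text));
}

const samples = {
  // \s+ accepts any run of whitespace between "rm" and "-rf".
  extraSpaces: matchesAny(dangerousPatterns, "rm    -rf /tmp/build"), // true
  // The /i flag makes the SQL pattern case-insensitive.
  lowerCaseSql: matchesAny(dangerousPatterns, "drop table users"), // true
  // The rm pattern has no /i flag, so upper case slips through.
  upperCaseRm: matchesAny(dangerousPatterns, "RM -RF /"), // false
  // {48} requires at least 48 characters after the "sk-" prefix.
  shortKey: matchesAny(credentialPatterns, "sk-" + "a".repeat(20)), // false
};
```

Because the patterns are unanchored, they match anywhere in the input, including inside longer strings such as quoted documentation or log lines.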

All validation runs locally, with no external API calls. Typical latency is 2 to 5 ms per validation call.
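Latency depends on hardware and message size, so it is worth measuring in your own environment. A minimal timing helper might look like the sketch below; `averageLatencyMs` is a hypothetical name, and averaging over many iterations compensates for the millisecond resolution of `Date.now()`.

```typescript
// Average wall-clock milliseconds per call of an async function.
async function averageLatencyMs(
  fn: () => Promise<unknown>,
  iterations: number
): Promise<number> {
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    await fn(); // run calls one at a time so they do not overlap
  }
  return (Date.now() - start) / iterations;
}

// Possible usage, assuming validateInput is imported from the package:
//   const ms = await averageLatencyMs(() => validateInput("hello"), 1000);
```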

See the npm package for installation details, the source repository for implementation, and the Sentinel documentation for additional examples.