Commit Graph

8333 Commits

Author SHA1 Message Date
valtterimelkko
cb742ebb27 Chore: Add .claude/settings.json to gitignore and remove from tracking
**Changes:**
- Added `.claude/` directory to .gitignore (Claude Code local settings)
- Removed `.claude/settings.json` from git tracking (local/environment-specific)

**Rationale:**
- `.claude/settings.json` contains user-specific Claude Code permissions and settings
- Should never be committed to git (similar to .vscode/, .idea/)
- Each developer should manage their own local settings
- Prevents merge conflicts from local configuration differences

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 10:36:51 +00:00
valtterimelkko
6a4ae2934b docs: Add deep analysis of bot not replying issue
- Root cause identified: Config file auto-rewriting triggers SIGUSR1 during message processing
- 6 potential solutions documented with implementation details
- Recommended action plan for immediate and long-term fixes

Analysis & Solutions section added to README_Tech.md.
2026-01-30 10:33:44 +00:00
valtterimelkko
5398b85a58 Docs: Add detailed user experience vs technical analysis for current issue
**Added comprehensive section showing:**
- User perspective: What you see when bot doesn't respond (typing indicator appears, then nothing)
- Technical reality: Exact timeline from logs showing message processing interrupted mid-tool-execution
- Visual timeline with precise timestamps (21:20:35 → 21:21:02)
- Actual log evidence from moltbot and PM2 logs
- Root problem chain: config rewrite → file watcher → reload handler → SIGUSR1 → shutdown
- The blocker: config file being automatically restored (unknown mechanism creating reload cycle)
- Verification table ruling out other causes

**Timeline captured:**
- T+0: Message received (21:20:35)
- T+1: Agent processing starts (21:20:36)
- T+14: First tool completes (21:20:49)
- T+27: Second tool starts (21:21:02)
- T+27.007: SIGUSR1 signal received - gateway self-terminates
- T+31: Gateway restarts with new PID (21:21:06)

**Key finding:** This is NOT a crash; it is a controlled, graceful shutdown triggered by SIGUSR1 during message processing, caused by config file rewrites.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 10:14:05 +00:00
valtterimelkko
8035327cf3 Docs: Document PM2 daemon separation, investigation findings, and troubleshooting attempts
**Summary:**
- Documented critical PM2 daemon separation (moltbot isolated from SI Project)
- Added historical context explaining why separation was necessary (prevented 140+ dashboard crashes)
- Documented all three PM2 daemon locations and file paths for easier investigation
- Added comprehensive "Troubleshooting Attempts This Session" section detailing 8 investigation attempts
- Documented root cause of current issue: config auto-rewriting → file watcher → reload handler → SIGUSR1 → gateway shutdown during message processing
- Identified blocker: need to find what mechanism is auto-restoring config file after modifications
- Added "What Still Needs Investigation" with specific next debugging steps

**Technical Details:**
- Moltbot PM2 daemon: /root/.pm2 (isolated)
- SI Project PM2 daemon: /root/.pm2-si-project (completely separate)
- AI Product Visualizer: runs via code-server, not in any PM2 daemon
- Root cause: Gateway receives SIGUSR1 during message processing due to config file rewrites
- Pattern: config change → file watcher → reload handler → SIGUSR1 → graceful shutdown

**Files Changed:**
- README_Tech.md: Added system overview, PM2 paths, investigation details, and troubleshooting timeline

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-30 08:12:34 +00:00
valtterimelkko
db3535bc06 Doc: add Telegram plugin commands overflow fix documentation
Documented the plugin command registration overflow issue that caused
the Telegram bot to crash at startup. The fix disables plugin.entries.telegram.enabled
in the moltbot.json config to prevent extension commands from being
registered on Telegram (which has a 100-command API limit).

Issue occurred when too many installed extensions (Discord, Matrix, Mattermost, etc.)
tried to register their commands for Telegram, exceeding the limit.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 20:34:22 +00:00
valtterimelkko
38367cd87a Fix: disable plugin commands for Telegram to resolve BOT_COMMANDS_TOO_MUCH error
The bot was crashing during Telegram initialization because installed
extensions (Discord, Matrix, Mattermost, etc.) were registering their
plugin commands for Telegram, exceeding the 100-command API limit.

Disabled plugin.entries.telegram.enabled to prevent non-Telegram
extensions from registering commands on Telegram.

This fix allows the Telegram bot to initialize cleanly without
crashing when messages are received.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 20:33:26 +00:00
valtterimelkko
5900a08626 Docs: Document comprehensive gateway stability infrastructure
Added new section "Gateway Stability Infrastructure" covering:
- Multi-layer stability design (system, PM2, startup hooks, health monitoring)
- All monitoring commands with examples
- Recovery scenarios and automated responses
- What problems this prevents

This comprehensive infrastructure ensures:
- No more crashes from Telegram message processing
- Automatic detection and recovery from hangs
- Prevention of inotify exhaustion hangs
- Memory limit protection
- Clean lock file management
- Full visibility into gateway health

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 20:03:31 +00:00
valtterimelkko
dd7f826d0a Add PM2-native health monitoring and startup improvements
- Created scripts/gateway-start.sh: Startup wrapper that cleans stale lock files
  before starting the gateway (prevents "already running" errors)

- Created scripts/pm2-health-monitor.js: Standalone health check process managed by PM2
  * Monitors port 18789 connectivity every 5 minutes
  * Detects unresponsive gateway (process running but port hung)
  * Force-restarts via killall + PM2 auto-recovery
  * Monitors inotify watcher usage (warns at 80% of limit)
  * Logs to /tmp/moltbot/pm2-health-monitor.log

- Updated ecosystem.config.cjs to:
  * Use gateway-start.sh wrapper for lock cleanup
  * Add moltbot-health-monitor as separate PM2 app
  * Health monitor runs alongside gateway (same PM2 daemon, isolated from other daemons)

Key Design Principles:
- PM2 handles process lifecycle (restart, memory limits, crash recovery)
- Health monitor adds responsiveness detection (what PM2 can't do alone)
- No systemd involvement (prevents port conflicts with other PM2 instances)
- Each PM2 daemon isolated: moltbot-gateway, si_project/dashboard, ai_product_visualizer

This ensures gateway remains stable even if it becomes unresponsive to Telegram messages.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 20:03:06 +00:00
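The 80% inotify warning mentioned in dd7f826d0a can be sketched as a small pure function. This is only an illustration: `checkInotifyUsage`, `WatcherStatus`, and `WARN_RATIO` are hypothetical names, and the actual contents of scripts/pm2-health-monitor.js are not shown in this log.

```typescript
// Hypothetical sketch; not the code from scripts/pm2-health-monitor.js.
// The 0.8 ratio comes from the commit message ("warns at 80% of limit").
const WARN_RATIO = 0.8;

interface WatcherStatus {
  used: number;
  limit: number;
  ratio: number;
  warn: boolean;
}

// Compare the number of watches in use against the configured limit
// (for example fs.inotify.max_user_watches) and flag high usage.
function checkInotifyUsage(used: number, limit: number): WatcherStatus {
  if (limit <= 0) {
    throw new RangeError("limit must be positive");
  }
  const ratio = used / limit;
  return { used, limit, ratio, warn: ratio >= WARN_RATIO };
}
```

With the 524288 limit from the earlier inotify fix, `checkInotifyUsage(500000, 524288).warn` would be true, while 100000 watches in use would not trigger a warning.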
valtterimelkko
a37c9cad6d Fix: Remove systemd conflicts, clarify PM2-based process management
- Removed conflicting systemd service files (moltbot-gateway.service, moltbot-health-check.*)
- Removed redundant health-check script (PM2 handles restarts natively)
- Updated README_Tech.md to document PM2 as actual process manager
- Clarified that inotify fix (524288 limit) is permanent solution
- Documented PM2 commands for troubleshooting and monitoring
- Added safety note: Never use systemd for moltbot-gateway (causes port conflicts)
- Fixed architecture documentation to reflect PM2 daemon isolation model

Gateway now running cleanly via PM2 (PID 661291) without systemd interference.
Inotify limit verified at 524288 (prevents file watcher exhaustion).

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 19:51:16 +00:00
valtterimelkko
eec556c71e Fix: Resolve gateway crash loop and inotify exhaustion
Problem: Gateway was stuck in a loop of 1200+ restarts, causing the Telegram bot to stop
responding. Root cause: the system inotify watch limit (fs.inotify.max_user_watches) was
exhausted when monitoring config/skill files.

Solutions implemented:

1. **Inotify limit increase** (/etc/sysctl.d/99-moltbot-inotify.conf)
   - Increased fs.inotify.max_user_watches from 65536 to 524288
   - Prevents "ENOSPC: System limit for number of file watchers reached"
   - Persistent across reboots

2. **Improved systemd service** (/etc/systemd/system/moltbot-gateway.service)
   - Changed Restart=always → Restart=on-failure
   - Increased RestartSec=5 → RestartSec=10 (reduce CPU churn)
   - Reduced StartLimitBurst=10 → StartLimitBurst=5
   - Added ExecStartPre to auto-clean stale locks on startup
   - Service remains isolated from other services (code-server, ssh, etc)

3. **Health check automation** (new files)
   - scripts/health-check-gateway.sh: detects hang/lock issues, auto-recovers
   - /etc/systemd/system/moltbot-health-check.service: runs health checks
   - /etc/systemd/system/moltbot-health-check.timer: runs every 5 minutes
   - Logs to /tmp/moltbot-health-check.log

4. **Documentation** (README_Tech.md)
   - Added section on crash loop root cause and preventative measures
   - Added Architecture section documenting service isolation
   - Updated troubleshooting with health check steps
   - Updated file locations with new monitoring files

Testing: Gateway now starts cleanly, health checks pass, other services
(code-server, ssh) remain unaffected. Timer runs every 5 minutes to prevent
future hangs.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 18:55:41 +00:00
valtterimelkko
c768d26ab3 Fix: properly ignore skills symlinks in gitignore
Changed skills/global-shared/ and skills/global-skills/ to remove trailing
slashes so gitignore properly ignores symlinks (not just directories).
Trailing slashes only match directories, not symlinks.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 15:33:35 +00:00
Valtteri Melkko
ab8540870b Implement task-type router with intelligent model selection and production setup
Major Changes:
- Implement task-type router (src/agents/task-type-router.ts) for intelligent model routing
  * Detects task type from user message (file-analysis, creative, debugging, cli, general)
  * Routes to optimal models: Gemini Flash (file analysis), Llama 3.3 70B (creative),
    Claude Sonnet 4.5 (debugging), Mistral Devstral 2 (CLI/general)
  * Integrated into model selection pipeline for seamless routing

- Integrate task-type routing into model resolution (src/agents/model-selection.ts)
  * Pass userMessage to resolveDefaultModelForAgent for context-aware routing
  * Maintain fallback chain for model availability

- Update attempt runner (src/agents/pi-embedded-runner/run/attempt.ts)
  * Pass prompt context to enable task-type based model selection

- Enhanced security and development (.gitignore)
  * Added comprehensive rules for sensitive files (.env variants, credentials)
  * Excluded API keys, runtime logs, test files, auto-generated skills directories
  * Properly ignored ecosystem.config, build artifacts, package manager locks

- Add technical documentation (README_Tech.md)
  * Process architecture (systemd Gateway, PM2 Dashboard, PM2 AI Product Visualizer)
  * Management commands and troubleshooting guide
  * Configuration summary and deployment checklist
  * Problem log with 6 documented issues and solutions

Result:
- Bot now intelligently routes user requests to optimal models based on message type
- Production-ready with systemd isolation, preventing PM2 conflicts
- Comprehensive documentation for future maintenance and troubleshooting
- Secure version control with quality .gitignore

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 15:27:12 +00:00
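The routing described in ab8540870b (detect a task type from the user message, then pick a model) might look roughly like the sketch below. The task types and model labels are taken from the commit message. The keyword patterns, `detectTaskType`, and `routeModel` are assumptions for illustration and do not reflect the actual src/agents/task-type-router.ts.

```typescript
type TaskType = "file-analysis" | "creative" | "debugging" | "cli" | "general";

// Model labels as named in the commit message; these are not provider model IDs.
const MODEL_BY_TASK: Record<TaskType, string> = {
  "file-analysis": "Gemini Flash",
  creative: "Llama 3.3 70B",
  debugging: "Claude Sonnet 4.5",
  cli: "Mistral Devstral 2",
  general: "Mistral Devstral 2",
};

// Hypothetical keyword heuristics; the real detection logic is not shown in the log.
const RULES: Array<{ type: TaskType; pattern: RegExp }> = [
  { type: "debugging", pattern: /\b(debug|error|exception|stack trace|crash)\b/i },
  { type: "file-analysis", pattern: /\b(file|attachment|pdf|csv|log)\b/i },
  { type: "creative", pattern: /\b(story|poem|brainstorm|slogan)\b/i },
  { type: "cli", pattern: /\b(command|terminal|shell|bash)\b/i },
];

// The first matching rule wins; anything unmatched falls back to "general".
function detectTaskType(userMessage: string): TaskType {
  for (const rule of RULES) {
    if (rule.pattern.test(userMessage)) {
      return rule.type;
    }
  }
  return "general";
}

function routeModel(userMessage: string): string {
  return MODEL_BY_TASK[detectTaskType(userMessage)];
}
```

Because rules are checked in order, a message that mentions both an error and a file is treated as debugging in this sketch. That ordering is a design choice of the example, not something the commit states.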
Josh Palmer
5f4715acfc fix flaky gateway tests in CI
What:
- resolve shell from PATH in bash-tools tests (avoid /bin/bash dependency)
- mock DNS for web-fetch SSRF tests (no real network)
- stub a2ui bundle in canvas-host server test when missing

Why:
- keep gateway test suite deterministic on Nix/Garnix Linux

Tests:
- not run locally (known missing deps in unit test run)
2026-01-29 12:14:27 +01:00
Josh Palmer
c41ea252b0 fix flaky web-fetch tests + lock cleanup
What:
- stub resolvePinnedHostname in web-fetch tests to avoid DNS flake
- close lock file handles via FileHandle.close during cleanup to avoid EBADF

Why:
- make CI deterministic without network/DNS dependence
- prevent double-close errors from GC

Tests:
- pnpm vitest run --config vitest.unit.config.ts src/agents/tools/web-tools.fetch.test.ts src/agents/session-write-lock.test.ts (failed: missing @aws-sdk/client-bedrock)
2026-01-29 11:05:11 +01:00
Tyler Yust
6372242da7
fix(ui): improve chat session dropdown and refresh behavior (#3682)
* refactor(ui): enhance loadSessions function to accept overrides for session loading parameters

- Updated loadSessions to include optional parameters for activeMinutes, limit, includeGlobal, and includeUnknown.
- Modified refreshChat to use the new activeMinutes parameter when loading sessions.
- Removed duplicate applySettingsFromUrl call in handleConnected function.

* feat(ui): implement session refresh functionality after chat

- Added `refreshSessionsAfterChat` property to `ChatHost` and `GatewayHost` types.
- Introduced `isChatResetCommand` function to identify chat reset commands.
- Updated `handleSendChat` to set `refreshSessions` based on chat reset commands.
- Modified `handleGatewayEventUnsafe` to load sessions when chat is finalized and `refreshSessionsAfterChat` is true.
- Enhanced `refreshChat` to load sessions with `activeMinutes` set to 0 for immediate refresh.
2026-01-28 23:24:46 -08:00
Ayaan Zaidi
718bc3f9c8
fix: avoid silent telegram empty replies (#3796) 2026-01-29 11:34:47 +05:30
Conroy Whitney
c20035094d
fix: use & instead of <> in XML escaping test for Windows NTFS compatibility (#3750)
NTFS does not allow < or > in filenames, causing the XML filename
escaping test to fail on Windows CI with ENOENT.

Replace file<test>.txt with file&test.txt — & is valid on all platforms
and still requires XML escaping (&amp;), preserving the test's intent.

Fixes #3748
2026-01-29 05:46:50 +00:00
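To illustrate why `&` still exercises the escaping path in c20035094d, here is a minimal XML attribute escaper. `escapeXmlAttr` is a hypothetical helper written for this example; the project's own escaping function is not shown in the log.

```typescript
// Hypothetical helper: escapes the five characters that are special in XML
// attribute values. "&" must be replaced first so that the entities produced
// by the later replacements are not escaped a second time.
function escapeXmlAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}
```

Under this sketch, `escapeXmlAttr("file&test.txt")` yields `file&amp;test.txt`, so a test built on that filename still checks escaping without needing `<` or `>` in a filename on NTFS.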
kiranjd
0761652701 fix(telegram): handle empty reply array in notifyEmptyResponse
Previous fix only checked skippedEmpty > 0, but when model returns
content: [] no payloads are created at all. Now also checks
replies.length === 0 to catch this case.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:13:39 +05:30
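The condition described in 0761652701 can be sketched as follows. The names `skippedEmpty` and `replies` come from the commit message. `ReplyStats` and `shouldNotifyEmptyResponse` are hypothetical, and the real `notifyEmptyResponse` may apply further conditions that this log does not show.

```typescript
// Hypothetical shape; only the two fields named in the commit message are modeled.
interface ReplyStats {
  replies: unknown[];
  skippedEmpty: number;
}

// Notify when payloads were skipped as empty, or when the model returned
// `content: []` and no payloads were created at all.
function shouldNotifyEmptyResponse(stats: ReplyStats): boolean {
  return stats.skippedEmpty > 0 || stats.replies.length === 0;
}
```

The second clause covers the case the earlier fix missed: with no payloads created, `skippedEmpty` stays at 0, so only the length check can catch it.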
kiranjd
a2d06e75b0 fix(telegram): notify users when agent returns empty response
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 11:13:39 +05:30
Ayaan Zaidi
34291321b4 chore: update clawtributors (add @HirokiKobayashi-R) 2026-01-29 10:33:25 +05:30
Ayaan Zaidi
16a5549ec0 docs: update changelog for mention patterns (#3303) (thanks @HirokiKobayashi-R) 2026-01-29 10:31:47 +05:30
HirokiKobayashi-R
22b59d24ce fix(mentions): check mentionPatterns even when explicit mention is available 2026-01-29 10:31:47 +05:30
Ayaan Zaidi
fcc53bcf1b fix: include AccountId in telegram native command context (#2942) (thanks @Chloe-VP) 2026-01-29 10:17:25 +05:30
Chloe
6132c3d014 fix(telegram): include AccountId in native command context for multi-agent routing
When running multiple Telegram bot accounts bound to different agents,
the /new command (and other slash commands) would send confirmation
messages via the wrong bot because the context was missing AccountId.

The fix adds AccountId: route.accountId to the context payload in
registerTelegramNativeCommands, matching how bot-message-context.ts
handles regular messages.

Fixes #2537
2026-01-29 10:17:25 +05:30
Ayaan Zaidi
4ac7aa4a48 fix: handle telegram video notes (#2905) (thanks @mylukin) 2026-01-29 10:07:21 +05:30
Lukin
78722d0b4f fix(telegram): add video_note support to Telegram channel
- Add msg.video_note to media extraction chain in bot/delivery.ts
- Add placeholder detection for video notes in bot-message-context.ts
- Video notes (rounded square video messages) are now processed and downloaded like regular videos

Fixes issue where video note messages were silently dropped because they weren't in the media handling logic.
2026-01-29 10:07:21 +05:30
Clawdbot
c13c39f121 fix: exclude native slash commands from onToolResult
Native slash commands (e.g. /verbose, /status) should not emit tool
summaries. Gate onToolResult behind CommandSource !== 'native' in
addition to the existing ChatType !== 'group' check.

Add test for native command exclusion.
2026-01-29 09:50:39 +05:30
Clawdbot
e1ecfb25b8 test: add tests for onToolResult in DM vs group sessions
- provides onToolResult in DM sessions (ChatType=direct)
- does not provide onToolResult in group sessions (ChatType=group)
- sends tool results via dispatcher in DM sessions

Replaces the old cross-provider test that expected onToolResult to
always be undefined.
2026-01-29 09:50:39 +05:30
Clawdbot
f27a5030d8 fix: restore verbose tool summaries in DM sessions
875b018ea removed onToolResult from dispatch-from-config.ts to prevent
tool summaries leaking into group channels. However, this also broke
verbose tool summaries in DM/private sessions where they are expected.

This restores onToolResult but gates it behind ChatType !== 'group',
so group channels remain unaffected while DM verbose works again.

mirror=false is passed to sendPayloadAsync to avoid duplicating tool
summaries in the session transcript (matching the block reply behavior).

Fixes #2665
2026-01-29 09:50:39 +05:30
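Taken together, the three onToolResult commits above (f27a5030d8, e1ecfb25b8, c13c39f121) describe a two-part gate. The sketch below assumes a context object with `ChatType` and `CommandSource` fields, as named in the messages. `DispatchContext` and `shouldProvideOnToolResult` are hypothetical names, not the code in dispatch-from-config.ts.

```typescript
// Hypothetical context shape; only the two fields named in the commits are modeled.
interface DispatchContext {
  ChatType: string;       // e.g. "direct" or "group"
  CommandSource?: string; // e.g. "native" for native slash commands
}

// Tool summaries are provided only outside group chats and only when the
// message did not originate from a native slash command.
function shouldProvideOnToolResult(ctx: DispatchContext): boolean {
  return ctx.ChatType !== "group" && ctx.CommandSource !== "native";
}
```

A direct-message context passes the gate, while a group chat or a native command such as /status does not.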
Gustavo Madeira Santana
699784dbee chore: remove stray package-lock.json 2026-01-28 22:00:55 -05:00
Gustavo Madeira Santana
a44da67069 fix: local updates for PR #3600
Co-authored-by: kira-ariaki <kira-ariaki@users.noreply.github.com>
2026-01-28 22:00:11 -05:00
Kira
0fd9d3abd1 feat(memory): add explicit paths config for memory search
Add a `paths` option to `memorySearch` config, allowing users to
explicitly specify additional directories or files to include in
memory search.

Follow-up to #2961 as suggested by @gumadeiras — instead of auto-following
symlinks (which has security implications), users can now explicitly
declare additional search paths.

- Add `memorySearch.paths` config option (array of strings)
- Paths can be absolute or relative (resolved from workspace)
- Directories are recursively scanned for `.md` files
- Single `.md` files can also be specified
- Paths from defaults and agent overrides are merged
- Added 4 test cases for listMemoryFiles
2026-01-28 22:00:11 -05:00
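The merge of `memorySearch.paths` from defaults and agent overrides in 0fd9d3abd1 could be sketched like this. `mergeMemorySearchPaths` is a hypothetical helper, and dropping duplicates is an assumption of the example. Path resolution against the workspace is left out.

```typescript
// Hypothetical helper: combine default paths with per-agent override paths.
// Duplicates are dropped (an assumption) and first-seen order is preserved.
function mergeMemorySearchPaths(
  defaults: string[] = [],
  overrides: string[] = [],
): string[] {
  return Array.from(new Set(defaults.concat(overrides)));
}
```

For example, merging `["notes", "docs/team.md"]` with `["docs/team.md", "/srv/shared"]` gives three entries, with defaults listed first.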
Shakker
b717724275
fix: add security hardening for media text attachments (#3700)
* fix: Prevent XML attribute injection by escaping special characters in file name and MIME type attributes.

* fix: text attachment MIME misclassification with security hardening (#3628)

- Fix CSV/TSV inference from content heuristics
- Add UTF-16 detection and BOM handling
- Add XML attribute escaping for file output (security)
- Add MIME override logging for auditability
- Add comprehensive test coverage for edge cases

Thanks @frankekn
2026-01-29 02:39:01 +00:00
Frank Yang
cb18ce7a85
Fix text attachment MIME misclassification (#3628)
* Fix text file attachment detection

* Add file attachment extraction tests
2026-01-29 02:33:03 +00:00
Gustavo Madeira Santana
a109b7f1a9 Update self message trust policy in WhatsApp docs
Clarified that self messages from the linked WhatsApp number bypass DM policy and allowFrom checks.
2026-01-28 20:31:33 -05:00
tewatia
4f554a1e31 docs(whatsapp): clarify self-message dmPolicy bypass
Self messages from the linked WhatsApp number bypass dmPolicy and allowFrom
checks automatically. Clarified that users don't need to add their own
number to the allowlist.

Self messages from the linked WhatsApp number bypass dmPolicy checks
entirely (via isSamePhone check in access-control.ts)...
2026-01-28 20:31:33 -05:00
jonisjongithub
fdcac0ccf4
fix: correct 'Venius' typo to 'Venice' in provider docs (#3638) - thanks (@jonisjongithub) 2026-01-28 23:51:43 +00:00
Shakker
3a9cfd787d
Merge pull request #3635 from moltbot/fix-token-input-trim
fix: trim whitespace from config input fields on change
2026-01-28 23:46:14 +00:00
Shakker
1c98b9dec8 fix(ui): trim whitespace from config input fields on change 2026-01-28 23:41:33 +00:00
Shakker
67f1402703 fix: tts base url runtime read (#3341) (thanks @hclsys) 2026-01-28 23:30:29 +00:00
Tyler Yust
a7534dc223
fix(ui): gateway URL confirmation modal (based on #2880) (#3578)
* fix: adding confirmation modal to confirm gateway url change

* refactor: added modal instead of confirm prompt

* fix(ui): reconnect after confirming gateway url (#2880) (thanks @0xacb)

---------

Co-authored-by: 0xacb <amccbaptista@gmail.com>
2026-01-28 13:32:10 -08:00
Gustavo Madeira Santana
109ac1c549 fix: banner spacing 2026-01-28 11:39:35 -05:00
Akshay
01e0d3a320
fix(cli): initialize plugins before pairing CLI registration (#3272)
The pairing CLI calls listPairingChannels() at registration time,
which requires the plugin registry to be populated. Without this,
plugin-provided channels like Matrix fail with "does not support
pairing" even though they have pairing adapters defined.

This mirrors the existing pattern used by the plugins CLI entry.

Co-authored-by: Shakker <165377636+shakkernerd@users.noreply.github.com>
2026-01-28 13:26:25 +00:00
Shakker
da421b9ef7
Merge pull request #3316 from bguidolim/fix/mime-types-audio-video
fix(media): add missing MIME type mappings for audio/video files
2026-01-28 12:31:47 +00:00
Bruno Guidolim
57efd8e083 fix(media): add missing MIME type mappings for audio/video files
Add mappings for audio/x-m4a, audio/mp4, and video/quicktime to ensure
media files sent as documents are saved with proper extensions, enabling
automatic transcription/analysis tools to work correctly.

- audio/x-m4a → .m4a
- audio/mp4 → .m4a
- video/quicktime → .mov

Also adds comprehensive test coverage for extensionForMime().
2026-01-28 13:17:50 +01:00
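The three mappings added in 57efd8e083 can be expressed as a lookup table. The function below reuses the name `extensionForMime` from the commit message, but it is a stand-in written for illustration. The real implementation, its signature, and its full table are not shown in this log, and stripping MIME parameters is an assumption.

```typescript
// Only the three mappings listed in the commit message are included here.
const EXTENSION_BY_MIME: Record<string, string> = {
  "audio/x-m4a": ".m4a",
  "audio/mp4": ".m4a",
  "video/quicktime": ".mov",
};

// Stand-in sketch: normalizes case and drops parameters such as "; codecs=..."
// before the lookup (both normalizations are assumptions of this example).
function extensionForMime(mime?: string | null): string | undefined {
  if (!mime) {
    return undefined;
  }
  const normalized = (mime.split(";")[0] ?? "").trim().toLowerCase();
  return EXTENSION_BY_MIME[normalized];
}
```

An unmapped type such as `text/plain` returns `undefined` in this sketch, leaving the caller to decide on a fallback extension.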
Roopak Nijhara
d93f8ffc13 fix: use fileURLToPath for Windows compatibility 2026-01-28 16:42:39 +05:30
Roopak Nijhara
bffcef981d style: run pnpm format 2026-01-28 16:42:39 +05:30
Roopak Nijhara
39b7f9d581 feat(hooks): make session-memory message count configurable (#2681)
Adds `messages` config option to session-memory hook (default: 15).
Fixes filter order bug - now filters user/assistant messages first,
then slices to get exactly N messages. Previously sliced first which
could result in fewer messages when non-message entries were present.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-28 16:42:39 +05:30
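The filter-order bug fixed in 39b7f9d581 is easy to reproduce in isolation. The sketch below assumes entries carry a `role` field and that the hook wants the most recent N messages. `SessionEntry`, `lastMessagesSliceFirst`, and `lastMessages` are hypothetical names, not the hook's actual code.

```typescript
// Hypothetical entry shape; real session entries likely carry more fields.
interface SessionEntry {
  role: string; // "user", "assistant", or a non-message kind such as "tool"
  text: string;
}

const isMessage = (e: SessionEntry): boolean =>
  e.role === "user" || e.role === "assistant";

// Old order: slice first, then filter. Non-message entries inside the slice
// are dropped afterwards, so fewer than n messages may remain.
function lastMessagesSliceFirst(entries: SessionEntry[], n: number): SessionEntry[] {
  if (n <= 0) return [];
  return entries.slice(-n).filter(isMessage);
}

// Fixed order: filter to user/assistant messages first, then take the last n.
// The default of 15 matches the `messages` option described in the commit.
function lastMessages(entries: SessionEntry[], n: number = 15): SessionEntry[] {
  if (n <= 0) return [];
  return entries.filter(isMessage).slice(-n);
}
```

With the entries user, assistant, tool, user, tool and n = 2, the slice-first version keeps only one message, while the filter-first version returns the last two messages as intended.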
Shadow
9688454a30
Accidental inclusion 2026-01-28 01:12:04 -06:00
Shadow
6044bf3637
Discord: fix resolveDiscordTarget parse options 2026-01-28 00:37:21 -06:00