valtterimelkko
dd7f826d0a
Add PM2-native health monitoring and startup improvements
- Created scripts/gateway-start.sh: Startup wrapper that cleans stale lock files
before starting the gateway (prevents "already running" errors)
- Created scripts/pm2-health-monitor.js: Standalone health check process managed by PM2
* Monitors port 18789 connectivity every 5 minutes
* Detects unresponsive gateway (process running but port hung)
* Force-restarts via killall + PM2 auto-recovery
* Monitors inotify watcher usage (warns at 80% of limit)
* Logs to /tmp/moltbot/pm2-health-monitor.log
- Updated ecosystem.config.cjs to:
* Use gateway-start.sh wrapper for lock cleanup
* Add moltbot-health-monitor as separate PM2 app
* Health monitor runs alongside gateway (same PM2 daemon, isolated from other daemons)
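The responsiveness check described above could look roughly like the following. This is a minimal sketch, not the contents of scripts/pm2-health-monitor.js: the function names `probePort` and `decideAction`, the 5-second connect timeout, and the `wait-for-pm2` branch are assumptions made for illustration. Only the port (18789) and the 5-minute interval come from the commit message.

```javascript
// Hypothetical sketch of a port-responsiveness probe (not the real monitor script).
const net = require('node:net');

const GATEWAY_PORT = 18789;              // port named in the commit message
const CHECK_INTERVAL_MS = 5 * 60 * 1000; // "every 5 minutes"

// Resolves true if a TCP connection to host:port opens within timeoutMs,
// false on connection error or timeout. The timeout value is an assumption.
function probePort(port, host = '127.0.0.1', timeoutMs = 5000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    let settled = false;
    const finish = (ok) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      socket.destroy();
      resolve(ok);
    };
    const timer = setTimeout(() => finish(false), timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// Pure decision step: a forced restart is only warranted when the process
// exists but the port does not answer (the "running but hung" case).
// If the process is gone, PM2's own restart handling is assumed to take over.
function decideAction({ processRunning, portOpen }) {
  if (portOpen) return 'ok';
  return processRunning ? 'force-restart' : 'wait-for-pm2';
}
```

A monitor loop would call `probePort(GATEWAY_PORT)` once per `CHECK_INTERVAL_MS` and pass the result to `decideAction`. Keeping the decision separate from the socket code makes the restart policy easy to test without a running gateway.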
Key Design Principles:
- PM2 handles process lifecycle (restart, memory limits, crash recovery)
- Health monitor adds responsiveness detection (what PM2 can't do alone)
- No systemd involvement (prevents port conflicts with other PM2 instances)
- Each PM2 daemon isolated: moltbot-gateway, si_project/dashboard, ai_product_visualizer
This lets the gateway recover automatically if it stops responding to Telegram messages.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 20:03:06 +00:00
valtterimelkko
a37c9cad6d
Fix: Remove systemd conflicts, clarify PM2-based process management
- Removed conflicting systemd service files (moltbot-gateway.service, moltbot-health-check.*)
- Removed redundant health-check script (PM2 handles restarts natively)
- Updated README_Tech.md to document PM2 as actual process manager
- Clarified that the inotify fix (524288 limit) is the permanent solution
- Documented PM2 commands for troubleshooting and monitoring
- Added safety note: Never use systemd for moltbot-gateway (causes port conflicts)
- Fixed architecture documentation to reflect PM2 daemon isolation model
Gateway now running cleanly via PM2 (PID 661291) without systemd interference.
Inotify limit verified at 524288 (prevents file watcher exhaustion).
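One way to verify the inotify limit programmatically is to read the kernel's value from procfs and compare it with the expected number. The sketch below assumes a Linux host; the function names are hypothetical and this is not how the commit itself performed the check.

```javascript
// Hypothetical sketch: verify fs.inotify.max_user_watches on Linux.
const fs = require('node:fs');

const EXPECTED_MAX_WATCHES = 524288; // value stated in the commit message
const LIMIT_PATH = '/proc/sys/fs/inotify/max_user_watches';

// Parses the text content of the procfs file into a non-negative integer.
function parseLimit(text) {
  const value = Number.parseInt(text.trim(), 10);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`unexpected inotify limit value: ${JSON.stringify(text)}`);
  }
  return value;
}

// Reads the current limit from procfs (Linux only).
function readMaxUserWatches(path = LIMIT_PATH) {
  return parseLimit(fs.readFileSync(path, 'utf8'));
}

// True when the configured limit is at least the expected value.
function limitIsSufficient(actual, expected = EXPECTED_MAX_WATCHES) {
  return actual >= expected;
}
```

On a Linux host, `limitIsSufficient(readMaxUserWatches())` reports whether the raised limit is in effect.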
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 19:51:16 +00:00
valtterimelkko
eec556c71e
Fix: Resolve gateway crash loop and inotify exhaustion
Problem: Gateway was stuck in a restart loop (1200+ restarts), causing the Telegram bot to stop
responding. Root cause: the system inotify watch limit was exhausted when
monitoring config/skill files.
Solutions implemented:
1. **Inotify limit increase** (/etc/sysctl.d/99-moltbot-inotify.conf)
- Increased fs.inotify.max_user_watches from 65536 to 524288
- Prevents "ENOSPC: System limit for number of file watchers reached"
- Persistent across reboots
2. **Improved systemd service** (/etc/systemd/system/moltbot-gateway.service)
- Changed Restart=always → Restart=on-failure
- Increased RestartSec=5 → RestartSec=10 (reduce CPU churn)
- Reduced StartLimitBurst=10 → StartLimitBurst=5
- Added ExecStartPre to auto-clean stale locks on startup
- Service remains isolated from other services (code-server, ssh, etc)
3. **Health check automation** (new files)
- scripts/health-check-gateway.sh: detects hang/lock issues, auto-recovers
- /etc/systemd/system/moltbot-health-check.service: runs health checks
- /etc/systemd/system/moltbot-health-check.timer: runs every 5 minutes
- Logs to /tmp/moltbot-health-check.log
4. **Documentation** (README_Tech.md)
- Added section on crash loop root cause and preventative measures
- Added Architecture section documenting service isolation
- Updated troubleshooting with health check steps
- Updated file locations with new monitoring files
Testing: Gateway now starts cleanly, health checks pass, other services
(code-server, ssh) remain unaffected. Timer runs every 5 minutes to detect and
recover from future hangs.
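The stale-lock cleanup mentioned for ExecStartPre and the health-check script is implemented in shell in this commit. The idea can be sketched in JavaScript as follows. The sketch assumes the lock file contains the PID of the process that created it, which the commit message does not state, and all function names are hypothetical.

```javascript
// Hypothetical sketch of stale-lock detection; assumes the lock file holds a PID.
const fs = require('node:fs');

// True if a process with this PID exists. Signal 0 performs the existence
// and permission checks without delivering a signal.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

// Removes lockPath when it does not name a running process.
// Returns true if a stale lock was removed, false otherwise.
function cleanStaleLock(lockPath) {
  let text;
  try {
    text = fs.readFileSync(lockPath, 'utf8');
  } catch (err) {
    if (err.code === 'ENOENT') return false; // no lock file present
    throw err;
  }
  const pid = Number.parseInt(text.trim(), 10);
  if (Number.isInteger(pid) && pid > 0 && isProcessAlive(pid)) {
    return false; // lock is held by a live process, leave it alone
  }
  fs.rmSync(lockPath, { force: true });
  return true;
}
```

Checking the recorded PID before deleting avoids removing a lock that a healthy gateway still holds, which matters if the cleanup runs on every start.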
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-01-29 18:55:41 +00:00
Peter Steinberger
7eb57b691c
chore: prep 2026.1.27-beta.1 release
2026-01-28 01:35:58 +01:00
Pooya Parsa
4a1b6bc008
update refs
2026-01-27 13:50:46 -08:00
Shadow
f7a0b0934d
Branding: update bot.molt bundle IDs + launchd labels
2026-01-27 14:46:50 -06:00
Shadow
cc72498b46
Mac: finish Moltbot rename
2026-01-27 14:12:17 -06:00
Peter Steinberger
640c8d1554
fix: ignore windows vitest worker crashes
2026-01-27 17:37:21 +00:00
Peter Steinberger
4a9c921168
fix: use threads pool for windows ci tests
2026-01-27 17:02:01 +00:00
Peter Steinberger
cf334d3b7d
fix: shard windows ci test runs
2026-01-27 16:39:28 +00:00
Peter Steinberger
240232aed1
fix: run windows ci tests serially
2026-01-27 16:13:02 +00:00
Peter Steinberger
3817e0ce2c
fix: bundle a2ui before tests
2026-01-27 15:38:31 +00:00
Peter Steinberger
e4518d2271
fix: allow docker builds to skip missing a2ui assets
2026-01-27 15:16:20 +00:00
Peter Steinberger
0594ccf92a
fix: skip a2ui bundling when sources are excluded
2026-01-27 15:01:57 +00:00
Peter Steinberger
3015e11fd7
fix: stabilize install smoke against clawdbot installer
2026-01-27 14:58:01 +00:00
Peter Steinberger
6d16a658e5
refactor: rename clawdbot to moltbot with legacy compat
2026-01-27 12:21:02 +00:00
Peter Steinberger
83460df96f
chore: update molt.bot domains
2026-01-27 12:21:01 +00:00
Gustavo Madeira Santana
2044b3ca8d
Build: restore A2UI scaffold assets (#2455) (thanks @0oAstro)
Co-authored-by: 0oAstro <0oAstro@users.noreply.github.com>
2026-01-26 23:08:25 -05:00
Gustavo Madeira Santana
c2a4863b15
Build: stop tracking bundled artifacts (#2455) (thanks @0oAstro)
Co-authored-by: 0oAstro <0oAstro@users.noreply.github.com>
2026-01-26 23:08:25 -05:00
Peter Steinberger
fba7afaa12
chore(scripts): update claude auth status hints
2026-01-26 19:05:00 +00:00
Shadow
e040f6338a
Docs: update clawtributors list
2026-01-25 22:38:04 -06:00
Shadow
b25fcaef0f
CI: parse labeler without deps
2026-01-25 20:38:44 -06:00
Shadow
6b6284c69c
CI: add PR labeler + label sync
2026-01-25 20:37:31 -06:00
Peter Steinberger
71eb6d5dd0
fix(imessage): normalize messaging targets (#1708)
Co-authored-by: Aaron Ng <1653630+aaronn@users.noreply.github.com>
2026-01-25 13:43:32 +00:00
plum-dawg
c96ffa7186
feat: Add Line plugin (#1630)
* feat: add LINE plugin (#1630) (thanks @plum-dawg)
* feat: complete LINE plugin (#1630) (thanks @plum-dawg)
* chore: drop line plugin node_modules (#1630) (thanks @plum-dawg)
* test: mock /context report in commands test (#1630) (thanks @plum-dawg)
* test: limit macOS CI workers to avoid OOM (#1630) (thanks @plum-dawg)
* test: reduce macOS CI vitest workers (#1630) (thanks @plum-dawg)
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-01-25 12:22:36 +00:00
Peter Steinberger
50f233d16d
chore: stabilize prek hooks runner selection (#1720) (thanks @dguido)
2026-01-25 10:55:28 +00:00
Dan Guido
48aea87028
feat: add prek pre-commit hooks and dependabot (#1720)
* feat: add prek pre-commit hooks and dependabot
Pre-commit hooks (via prek):
- Basic hygiene: trailing-whitespace, end-of-file-fixer, check-yaml, check-added-large-files, check-merge-conflict
- Security: detect-secrets, zizmor (GitHub Actions audit)
- Linting: shellcheck, actionlint, oxlint, swiftlint
- Formatting: oxfmt, swiftformat
Dependabot:
- npm and GitHub Actions ecosystems
- Grouped updates (production/development/actions)
- 7-day cooldown for supply chain protection
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* docs: add prek install instruction to AGENTS.md
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-25 10:53:23 +00:00
Rohan Nagpal
06a7e1e8ce
Telegram: threaded conversation support (#1597)
* Telegram: isolate dm topic sessions
* Tests: cap vitest workers
* Tests: cap Vitest workers on CI macOS
* Tests: avoid timer-based pi-ai stream mock
* Tests: increase embedded runner timeout
* fix: harden telegram dm thread handling (#1597) (thanks @rohannagpal)
---------
Co-authored-by: Peter Steinberger <steipete@gmail.com>
2026-01-25 04:48:51 +00:00
Peter Steinberger
6d79c6cd26
fix: clean docker onboarding warnings + preserve agentId casing
2026-01-24 19:07:01 +00:00
Luke
be1cdc9370
fix(agents): treat provider request-aborted as timeout for fallback (#1576)
* fix(agents): treat request-aborted as timeout for fallback
* test(e2e): add provider timeout fallback
2026-01-24 11:27:24 +00:00
Peter Steinberger
4a9123d415
chore: suppress remaining deprecation warnings
2026-01-24 11:16:46 +00:00
Peter Steinberger
8b7b7e154f
chore: speed up tests and update opencode models
2026-01-23 11:36:32 +00:00
Peter Steinberger
bb9bddebb4
fix: stabilize ci tests
2026-01-23 09:52:22 +00:00
Peter Steinberger
3de5ea818d
ci: speed up install smoke on PRs
2026-01-23 09:05:15 +00:00
Peter Steinberger
86e0916fa3
fix: allow windows spawn in test parallel
2026-01-23 07:52:04 +00:00
Peter Steinberger
45ce07a098
test: split vitest into unit and gateway
2026-01-23 07:34:57 +00:00
Peter Steinberger
2c10c601a8
test: harden docker onboarding waits
2026-01-23 05:10:59 +00:00
Tak hoffman
b65916e0d1
CLI: fix Windows gateway startup
2026-01-23 04:47:01 +00:00
Peter Steinberger
7c336588ea
chore: drop tty from install e2e docker
2026-01-22 23:09:28 +00:00
Peter Steinberger
573354f5e4
chore(dev): default restart-mac to attach-only
2026-01-22 23:08:56 +00:00
Peter Steinberger
8a20f44228
fix: improve gateway ssh auth handling
2026-01-22 06:54:08 +00:00
Peter Steinberger
50049fd220
chore(macos): drop time-sensitive notification entitlement toggle
2026-01-22 04:50:03 +00:00
Peter Steinberger
ff3d8cab2b
feat: preflight update runner before rebase
2026-01-22 04:19:33 +00:00
Peter Steinberger
2e1514095d
fix: package Textual resources for mac app
2026-01-22 02:34:27 +00:00
Clawd
429a2d7849
fix(mac): default to universal binary for distribution builds
Closes #1393
The distribution script (package-mac-dist.sh) now defaults BUILD_ARCHS to 'all',
producing universal binaries that run natively on both Apple Silicon and Intel Macs.
Previously, the script inherited the host architecture default from package-mac-app.sh,
which meant release builds done on ARM Macs only included ARM binaries.
2026-01-22 00:29:27 +00:00
Peter Steinberger
6c0a01dc90
fix: bundle mac model catalog
2026-01-21 19:58:19 +00:00
Peter Steinberger
fb47f1cbeb
chore: rename clawlog references
2026-01-21 05:53:32 +00:00
Peter Steinberger
58b131919f
feat: use tsgo for dev/watch builds
2026-01-21 04:06:09 +00:00
Peter Steinberger
aec622fe63
chore: remove fresh dist log
2026-01-21 03:13:50 +00:00
Peter Steinberger
dfbf6ac263
feat: enforce device-bound connect challenge
2026-01-20 13:04:19 +00:00