Safety margin was applied twice:
1. threshold = (context - reserve) / 1.2
2. estimated = (existing + prompt) * 1.2
This caused compaction to trigger too early (~62% instead of ~83%).
Now only applied once in threshold calculation.
Check session tokens before sending API request. If estimated tokens
exceed threshold (default 85% of context), run compaction first to
avoid context overflow errors.
This prevents the scenario where a large new prompt causes overflow
and there's not enough space to even run compaction.
New config options:
- compaction.proactiveEnabled (default: true)
- compaction.proactiveThresholdRatio (default: 0.85)
Co-Authored-By: Claude <noreply@anthropic.com>