| summary | read_when |
|---|---|
| Use LiteLLM as an OpenAI-compatible proxy in OpenClaw | |
LiteLLM
LiteLLM is an OpenAI-compatible proxy that supports 100+ LLM APIs. OpenClaw
registers it as the litellm provider and uses the OpenAI Completions API.
Quick setup
- Set up your LiteLLM proxy (see LiteLLM docs)
- Set environment variables (optional):
  - LITELLM_API_KEY - your LiteLLM API key
  - LITELLM_BASE_URL - your LiteLLM endpoint (default: http://localhost:4000)
  - LITELLM_MODEL - default model name (default: gpt-4)
- Run onboarding:
openclaw onboard --auth-choice litellm-api-key
The wizard will prompt for:
- Base URL (your LiteLLM proxy endpoint)
- API key
- Model name (as configured in your LiteLLM proxy)
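The environment variables and defaults listed above can be sketched as a small resolver. This is only an illustration: `resolveLitellmSettings` and `LitellmSettings` are hypothetical names, not OpenClaw's actual onboarding code, and treating blank values as unset is an assumption.

```typescript
// Hypothetical helper: not OpenClaw's actual onboarding code.
interface LitellmSettings {
  apiKey: string | undefined;
  baseUrl: string;
  model: string;
}

// Defaults as documented in the quick setup list.
const DEFAULT_BASE_URL = "http://localhost:4000";
const DEFAULT_MODEL = "gpt-4";

function resolveLitellmSettings(
  env: Record<string, string | undefined>,
): LitellmSettings {
  // Assumption: empty or whitespace-only values count as unset.
  const read = (name: string): string | undefined => {
    const value = env[name]?.trim();
    return value ? value : undefined;
  };
  return {
    apiKey: read("LITELLM_API_KEY"),
    baseUrl: read("LITELLM_BASE_URL") ?? DEFAULT_BASE_URL,
    model: read("LITELLM_MODEL") ?? DEFAULT_MODEL,
  };
}
```

Passing the environment in as a plain object (for example `process.env` under Node) keeps the sketch easy to test without touching real variables.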
Config example
{
  env: { LITELLM_API_KEY: "sk-..." },
  agents: {
    defaults: {
      model: { primary: "litellm/gpt-4" },
      models: { "litellm/gpt-4": { alias: "GPT-4" } }
    }
  },
  models: {
    mode: "merge",
    providers: {
      litellm: {
        baseUrl: "http://localhost:4000",
        apiKey: "${LITELLM_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "gpt-4",
            name: "GPT-4",
            reasoning: false,
            input: ["text"],
            contextWindow: 128000,
            maxTokens: 8192
          }
        ]
      }
    }
  }
}
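The config above references the key as `"${LITELLM_API_KEY}"` rather than embedding it. A minimal sketch of how such a placeholder could be expanded follows. `expandEnvRefs` is a hypothetical function, and both the accepted name pattern and the throw-on-missing behaviour are assumptions, so OpenClaw's real substitution rules may differ.

```typescript
// Hypothetical sketch of "${NAME}" expansion; OpenClaw's real rules may differ.
function expandEnvRefs(
  value: string,
  env: Record<string, string | undefined>,
): string {
  // Assumption: placeholder names are upper-case letters, digits, and underscores.
  return value.replace(/\$\{([A-Z0-9_]+)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) {
      // Fail loudly rather than sending a literal "${...}" as the API key.
      throw new Error(`Environment variable ${name} is not set`);
    }
    return resolved;
  });
}
```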
Multiple models
Add additional models to your config as needed:
{
  models: {
    providers: {
      litellm: {
        baseUrl: "http://localhost:4000",
        apiKey: "${LITELLM_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "gpt-4", name: "GPT-4", contextWindow: 128000, maxTokens: 8192 },
          { id: "claude-3-opus", name: "Claude Opus", contextWindow: 200000, maxTokens: 4096 },
          { id: "gemini-pro", name: "Gemini Pro", contextWindow: 32000, maxTokens: 8192 }
        ]
      }
    }
  }
}
Then switch models using:
openclaw config set agents.defaults.model.primary litellm/claude-3-opus
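Before switching, it can help to confirm that a ref such as `litellm/claude-3-opus` points at a model listed under the provider. The sketch below is illustrative only: `parseModelRef` and `isConfiguredLitellmModel` are hypothetical helpers, and splitting on the first `/` is an assumption about how refs are parsed.

```typescript
// Hypothetical helpers for checking a model ref against the configured ids.
interface ModelRef {
  provider: string;
  modelId: string;
}

function parseModelRef(ref: string): ModelRef {
  // Assumption: the provider is everything before the first "/".
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`Expected "<provider>/<modelId>", got "${ref}"`);
  }
  return { provider: ref.slice(0, slash), modelId: ref.slice(slash + 1) };
}

function isConfiguredLitellmModel(ref: string, configuredIds: string[]): boolean {
  const { provider, modelId } = parseModelRef(ref);
  return provider === "litellm" && configuredIds.indexOf(modelId) !== -1;
}
```

With the three models from the config above, `isConfiguredLitellmModel("litellm/claude-3-opus", ["gpt-4", "claude-3-opus", "gemini-pro"])` is true, while a ref to an unlisted id is false.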
Prompt caching
When using Anthropic models through LiteLLM (e.g., claude-opus-4-5, claude-sonnet-4-5), OpenClaw automatically enables prompt caching to reduce costs:
{
  agents: {
    defaults: {
      models: {
        "litellm/claude-opus-4-5": {
          params: {
            cacheControlTtl: "1h" // Auto-configured for Claude models
          }
        }
      }
    }
  }
}
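One way to picture the auto-configuration is as a default-params lookup keyed on the model id. This is a sketch under assumptions: `defaultParamsFor` is a hypothetical function, and matching on the `claude-` prefix is inferred from the model names in this section, not taken from OpenClaw's source.

```typescript
// Hypothetical sketch of the auto-configured caching default.
interface ModelParams {
  cacheControlTtl?: string;
}

function defaultParamsFor(modelId: string): ModelParams {
  // Assumption: Claude models are recognised by a "claude-" id prefix.
  return modelId.startsWith("claude-") ? { cacheControlTtl: "1h" } : {};
}
```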
Cost savings with caching
- Without caching: every message pays the full input price for the entire conversation history
- With caching (enabled by default): repeated context costs about 10x less
Example from actual usage:
- Without caching: 93k tokens × $0.000005 per token = $0.47 per message
- With caching: 123k tokens (mostly cached) = $0.05 per message (roughly 90% savings)
Caching is automatically enabled for all claude-* models through LiteLLM.
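The arithmetic behind the example can be sketched as follows. `messageCost` is a hypothetical helper, and the default factor of 0.1 for cached tokens is an assumption taken from the "10x less" figure above. The split between cached and uncached tokens is a parameter because the example does not break it down.

```typescript
// Hypothetical cost model: cached tokens are billed at a fraction of the normal price.
function messageCost(
  totalTokens: number,
  cachedTokens: number,
  pricePerToken: number,
  cachedPriceFactor: number = 0.1, // assumption: "10x less" for cached context
): number {
  if (cachedTokens < 0 || cachedTokens > totalTokens) {
    throw new RangeError("cachedTokens must be between 0 and totalTokens");
  }
  const uncachedTokens = totalTokens - cachedTokens;
  return (
    uncachedTokens * pricePerToken +
    cachedTokens * pricePerToken * cachedPriceFactor
  );
}
```

With no cached tokens, `messageCost(93000, 0, 0.000005)` gives about 0.465, which rounds to the $0.47 figure. The cached figure depends on how many of the 123k tokens were actually cache hits, and cache writes may be billed differently, so the sketch does not try to reproduce $0.05 exactly.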
Notes
- Model refs use litellm/<modelId> where modelId matches your LiteLLM config.
- The base URL should not include /v1 - OpenClaw's OpenAI client appends it.
- Supported LiteLLM models depend on your proxy configuration.
- Prompt caching works automatically when using Claude models through LiteLLM.
- See Model providers for provider rules.
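The note about `/v1` can be illustrated with a small URL builder. `chatCompletionsUrl` is a hypothetical helper, not OpenClaw's client code. It assumes the request targets the standard OpenAI-style `/v1/chat/completions` route, and rejecting a base URL that already ends in `/v1` is one possible design choice, not documented behaviour.

```typescript
// Hypothetical sketch: build the request URL from a base URL without "/v1".
function chatCompletionsUrl(baseUrl: string): string {
  // Drop trailing slashes so "http://localhost:4000/" and
  // "http://localhost:4000" behave the same.
  const trimmed = baseUrl.replace(/\/+$/, "");
  if (/\/v1$/.test(trimmed)) {
    // A base URL ending in /v1 would otherwise produce ".../v1/v1/...".
    throw new Error("baseUrl should not include /v1");
  }
  return `${trimmed}/v1/chat/completions`;
}
```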