romtuck/openclaw

Fork 0

AlexKer 1afcb09d35 baseten provider

2026-01-29 15:59:05 -05:00

6.5 KiB

Raw Blame History

summary

read_when

Use Baseten Model APIs for high-performance LLMs in Moltbot

You want high-performance LLM inference in Moltbot

You want Baseten Model APIs setup guidance

Baseten

Baseten provides Model APIs for instant access to high-performance LLMs through OpenAI-compatible endpoints. Point your existing OpenAI SDK at Baseten's inference endpoint and start making calls—no model deployment required.

Why Baseten in Moltbot

High-performance LLMs with optimized serving infrastructure.
Wide model selection including DeepSeek V3.2, GPT OSS 120B, Kimi K2, Qwen3 Coder, GLM-4.7, and more.
OpenAI-compatible API - standard /v1 endpoints for easy integration.
Serverless - no infrastructure management, pay per token.

Features

Model APIs: Instant access to high-performance LLMs without deployment
OpenAI-compatible API: Standard /v1 endpoints for easy integration
Streaming: Supported on all models
Function calling: Supported on select models (check model capabilities)
Structured outputs: Generate JSON that conforms to a schema
Reasoning: Control extended thinking for reasoning-capable models

Setup

1. Get API Key

Sign up at baseten.co
Go to Settings > API Keys > Create API Key
Copy your API key

2. Configure Moltbot

Option A: Environment Variable

export BASETEN_API_KEY="your-api-key-here"

Option B: Interactive Setup (Recommended)

moltbot onboard --auth-choice baseten-api-key

This will:

Prompt for your API key (or use existing BASETEN_API_KEY)
Configure the Baseten provider with available models
Let you pick your default model
Set up the provider automatically

Option C: Non-interactive

moltbot onboard --non-interactive \
  --auth-choice baseten-api-key \
  --baseten-api-key "your-api-key-here"

3. Verify Setup

moltbot chat --model baseten/deepseek-ai/DeepSeek-V3.2 "Hello, are you working?"

Model Selection

Moltbot includes a curated catalog of popular Baseten Model API models. Pick based on your needs:

Default: deepseek-ai/DeepSeek-V3.2 (DeepSeek V3.2) - general purpose, 131k context.
Best reasoning: openai/gpt-oss-120b or moonshotai/Kimi-K2-Thinking
Coding: Qwen/Qwen3-Coder-480B-A35B-Instruct
Long context: moonshotai/Kimi-K2-Thinking (262k context)

Change your default model anytime:

moltbot models set baseten/deepseek-ai/DeepSeek-V3.2
moltbot models set baseten/openai/gpt-oss-120b

List all available models:

moltbot models list | grep baseten

Which Model Should I Use?

Use Case	Recommended Model	Why
General chat	`deepseek-ai/DeepSeek-V3.2`	Balanced performance, 131k context
Complex reasoning	`openai/gpt-oss-120b`	Best for step-by-step reasoning
Agentic tasks	`openai/gpt-oss-120b`	Designed for reasoning and agentic use
Coding	`Qwen/Qwen3-Coder-480B-A35B-Instruct`	Code-optimized, 262k context
Long context	`moonshotai/Kimi-K2-Thinking`	262k context window
Reasoning	`zai-org/GLM-4.7`	Advanced thinking controls

Available Models (9 Total)

Text Models

Model ID	Name	Context	Features
`openai/gpt-oss-120b`	OpenAI GPT OSS 120B	128k	Reasoning
`deepseek-ai/DeepSeek-V3.2`	DeepSeek V3.2	131k	General
`deepseek-ai/DeepSeek-V3.1`	DeepSeek V3.1	164k	General
`deepseek-ai/DeepSeek-V3-0324`	DeepSeek V3 0324	164k	General
`moonshotai/Kimi-K2-Thinking`	Kimi K2 Thinking	262k	Reasoning
`moonshotai/Kimi-K2-Instruct-0905`	Kimi K2 Instruct 0905	128k	Long context
`Qwen/Qwen3-Coder-480B-A35B-Instruct`	Qwen3 Coder 480B A35B Instruct	262k	Coding
`zai-org/GLM-4.7`	GLM-4.7	200k	Reasoning
`zai-org/GLM-4.6`	GLM-4.6	200k	Reasoning

Model IDs

Baseten model IDs use the format:

<org>/<model-name>

When using models in Moltbot, prefix with the provider:

moltbot chat --model baseten/deepseek-ai/DeepSeek-V3.2

Streaming and Tool Support

Feature	Support
Streaming	All models
Function calling	Select models (check model capabilities)
Structured outputs	Supported via `response_format`
Reasoning	Supported on reasoning-capable models

Pricing

Baseten uses pay-per-token pricing. Check baseten.co for current rates. Generally:

Smaller models: Lower cost, faster
Larger models: Higher quality, higher cost
Reasoning models: May have additional costs for extended thinking

Usage Examples

# Use DeepSeek V3.2 (recommended default)
moltbot chat --model baseten/deepseek-ai/DeepSeek-V3.2

# Use GPT OSS 120B for reasoning
moltbot chat --model baseten/openai/gpt-oss-120b

# Use coding model
moltbot chat --model baseten/Qwen/Qwen3-Coder-480B-A35B-Instruct

# Use reasoning model
moltbot chat --model baseten/moonshotai/Kimi-K2-Thinking

Troubleshooting

API key not recognized

echo $BASETEN_API_KEY
moltbot models list | grep baseten

Ensure the key is valid and has not expired.

Model not available

Run moltbot models list to see currently available models in the catalog. If a model you need is missing, you can add it manually to your config file.

Connection issues

Baseten API is at https://inference.baseten.co. Ensure your network allows HTTPS connections.

Config file example

{
  env: { BASETEN_API_KEY: "..." },
  agents: { defaults: { model: { primary: "baseten/deepseek-ai/DeepSeek-V3.2" } } },
  models: {
    mode: "merge",
    providers: {
      baseten: {
        baseUrl: "https://inference.baseten.co/v1",
        apiKey: "${BASETEN_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "deepseek-ai/DeepSeek-V3.2",
            name: "DeepSeek V3.2",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 131072,
            maxTokens: 8192
          }
        ]
      }
    }
  }
}

6.5 KiB Raw Blame History