romtuck/openclaw

Fork 0

Interpool 17609d3e6e feat: include weekday in system prompt and add zh-CN tool display (#4629 , #4631 )

2026-01-30 19:22:57 +05:30

11 KiB

Raw Blame History

name

description

metadata

streaming-avatars

Real-time interactive avatar sessions for HeyGen

tags
streaming, real-time, interactive, websocket, live

Streaming Avatars

HeyGen's Streaming Avatar feature enables real-time interactive avatar experiences, perfect for live customer service, virtual assistants, and interactive applications.

Overview

Streaming avatars provide:

Real-time rendering - Avatar responds immediately to input
Interactive dialogue - Two-way conversation capability
WebRTC streaming - Low-latency video delivery
Session management - Create and manage live sessions

Creating a Streaming Session

Request Fields

Field	Type	Req	Description
`avatar_id`	string	✓	Avatar to use for streaming
`voice_id`	string	✓	Voice for TTS responses
`quality`	string		`"low"`, `"medium"`, or `"high"`
`video_encoding`	string		`"H264"` or `"VP8"`

curl

curl -X POST "https://api.heygen.com/v1/streaming.new" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "avatar_id": "josh_lite3_20230714",
    "voice_id": "1bd001e7e50f421d891986aad5158bc8",
    "quality": "high"
  }'

TypeScript

interface StreamingSessionRequest {
  avatar_id: string;                           // Required
  voice_id: string;                            // Required
  quality?: "low" | "medium" | "high";
  video_encoding?: "H264" | "VP8";
}

interface StreamingSessionResponse {
  error: null | string;
  data: {
    session_id: string;
    access_token: string;
    url: string;
    ice_servers: IceServer[];
  };
}

interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

async function createStreamingSession(
  config: StreamingSessionRequest
): Promise<StreamingSessionResponse["data"]> {
  const response = await fetch("https://api.heygen.com/v1/streaming.new", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(config),
  });

  const json: StreamingSessionResponse = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }

  return json.data;
}

Python

import requests
import os

def create_streaming_session(config: dict) -> dict:
    response = requests.post(
        "https://api.heygen.com/v1/streaming.new",
        headers={
            "X-Api-Key": os.environ["HEYGEN_API_KEY"],
            "Content-Type": "application/json"
        },
        json=config
    )

    data = response.json()
    if data.get("error"):
        raise Exception(data["error"])

    return data["data"]

Session Quality Options

Quality	Resolution	Bandwidth	Use Case
`low`	480p	~500kbps	Limited bandwidth
`medium`	720p	~1Mbps	Standard applications
`high`	1080p	~2Mbps	Premium experiences

Sending Text to Avatar

Make the avatar speak in real-time:

Request Fields

Field	Type	Req	Description
`session_id`	string	✓	Active streaming session ID
`text`	string	✓	Text for avatar to speak
`task_type`	string	✓	`"talk"` or `"repeat"`
`task_mode`	string		`"sync"` or `"async"`

curl

curl -X POST "https://api.heygen.com/v1/streaming.task" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "your_session_id",
    "text": "Hello! How can I help you today?",
    "task_type": "talk"
  }'

TypeScript

interface StreamingTaskRequest {
  session_id: string;                          // Required
  text: string;                                // Required
  task_type: "talk" | "repeat";                // Required
  task_mode?: "sync" | "async";
}

async function sendTextToAvatar(
  sessionId: string,
  text: string
): Promise<void> {
  const response = await fetch("https://api.heygen.com/v1/streaming.task", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      session_id: sessionId,
      text,
      task_type: "talk",
    }),
  });

  const json = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }
}

Stopping a Session

curl

curl -X POST "https://api.heygen.com/v1/streaming.stop" \
  -H "X-Api-Key: $HEYGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "your_session_id"
  }'

TypeScript

async function stopStreamingSession(sessionId: string): Promise<void> {
  const response = await fetch("https://api.heygen.com/v1/streaming.stop", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ session_id: sessionId }),
  });

  const json = await response.json();

  if (json.error) {
    throw new Error(json.error);
  }
}

WebRTC Integration

Connect to the streaming avatar using WebRTC:

class StreamingAvatarClient {
  private peerConnection: RTCPeerConnection | null = null;
  private sessionId: string | null = null;

  async connect(avatarId: string, voiceId: string): Promise<MediaStream> {
    // 1. Create session
    const session = await createStreamingSession({
      avatar_id: avatarId,
      voice_id: voiceId,
      quality: "high",
    });

    this.sessionId = session.session_id;

    // 2. Set up WebRTC peer connection
    this.peerConnection = new RTCPeerConnection({
      iceServers: session.ice_servers,
    });

    // 3. Handle incoming video stream
    const mediaStream = new MediaStream();

    this.peerConnection.ontrack = (event) => {
      event.streams[0].getTracks().forEach((track) => {
        mediaStream.addTrack(track);
      });
    };

    // 4. Create and set offer
    const offer = await this.peerConnection.createOffer();
    await this.peerConnection.setLocalDescription(offer);

    // 5. Exchange SDP with server (implementation depends on signaling method)
    // ...

    return mediaStream;
  }

  async speak(text: string): Promise<void> {
    if (!this.sessionId) {
      throw new Error("Not connected");
    }
    await sendTextToAvatar(this.sessionId, text);
  }

  async disconnect(): Promise<void> {
    if (this.sessionId) {
      await stopStreamingSession(this.sessionId);
      this.sessionId = null;
    }

    if (this.peerConnection) {
      this.peerConnection.close();
      this.peerConnection = null;
    }
  }
}

React Integration Example

import React, { useRef, useEffect, useState } from "react";

interface StreamingAvatarProps {
  avatarId: string;
  voiceId: string;
}

function StreamingAvatar({ avatarId, voiceId }: StreamingAvatarProps) {
  const videoRef = useRef<HTMLVideoElement>(null);
  const [client] = useState(() => new StreamingAvatarClient());
  const [connected, setConnected] = useState(false);
  const [inputText, setInputText] = useState("");

  useEffect(() => {
    async function connect() {
      try {
        const stream = await client.connect(avatarId, voiceId);
        if (videoRef.current) {
          videoRef.current.srcObject = stream;
        }
        setConnected(true);
      } catch (error) {
        console.error("Failed to connect:", error);
      }
    }

    connect();

    return () => {
      client.disconnect();
    };
  }, [avatarId, voiceId]);

  const handleSend = async () => {
    if (inputText.trim()) {
      await client.speak(inputText);
      setInputText("");
    }
  };

  return (
    <div>
      <video
        ref={videoRef}
        autoPlay
        playsInline
        style={{ width: "100%", maxWidth: 640 }}
      />
      {connected && (
        <div>
          <input
            type="text"
            value={inputText}
            onChange={(e) => setInputText(e.target.value)}
            placeholder="Type message..."
          />
          <button onClick={handleSend}>Send</button>
        </div>
      )}
    </div>
  );
}

Session Management

Listing Active Sessions

async function listActiveSessions(): Promise<string[]> {
  const response = await fetch("https://api.heygen.com/v1/streaming.list", {
    headers: { "X-Api-Key": process.env.HEYGEN_API_KEY! },
  });

  const json = await response.json();
  return json.data.sessions;
}

Session Timeout

Sessions automatically timeout after inactivity. Send periodic keep-alive messages:

class StreamingAvatarClient {
  private keepAliveInterval: NodeJS.Timer | null = null;

  startKeepAlive() {
    this.keepAliveInterval = setInterval(async () => {
      if (this.sessionId) {
        await this.ping();
      }
    }, 30000); // Every 30 seconds
  }

  private async ping(): Promise<void> {
    // Implementation depends on API
  }

  stopKeepAlive() {
    if (this.keepAliveInterval) {
      clearInterval(this.keepAliveInterval);
      this.keepAliveInterval = null;
    }
  }
}

Interrupting Speech

Interrupt current speech to start new content:

async function interruptAndSpeak(
  sessionId: string,
  newText: string
): Promise<void> {
  // Interrupt current speech
  await fetch("https://api.heygen.com/v1/streaming.interrupt", {
    method: "POST",
    headers: {
      "X-Api-Key": process.env.HEYGEN_API_KEY!,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ session_id: sessionId }),
  });

  // Send new text
  await sendTextToAvatar(sessionId, newText);
}

Use Cases

Virtual Customer Service

async function handleCustomerQuery(
  sessionId: string,
  query: string
): Promise<void> {
  // Process query with your AI/logic
  const response = await processCustomerQuery(query);

  // Have avatar speak the response
  await sendTextToAvatar(sessionId, response);
}

Interactive Training

async function conductTrainingSession(
  sessionId: string,
  trainingScript: string[]
): Promise<void> {
  for (const segment of trainingScript) {
    await sendTextToAvatar(sessionId, segment);

    // Wait for avatar to finish speaking
    await waitForSpeechComplete(sessionId);

    // Pause between segments
    await new Promise((r) => setTimeout(r, 1000));
  }
}

Best Practices

Handle disconnections - Implement reconnection logic
Manage bandwidth - Adjust quality based on connection
Limit session duration - Close unused sessions to save credits
Test latency - Ensure acceptable response times
Provide fallback - Have backup for streaming failures
Monitor usage - Track session duration and costs

Limitations

Session duration limits vary by plan
Concurrent session limits apply
Credits consumed based on session time
WebRTC requires modern browser support
Network quality affects experience

11 KiB Raw Blame History

Streaming Avatars

Overview

Creating a Streaming Session

Request Fields

curl

TypeScript

Python

Session Quality Options

Sending Text to Avatar

Request Fields

curl

TypeScript

Stopping a Session

curl

TypeScript

WebRTC Integration

React Integration Example

Session Management

Listing Active Sessions

Session Timeout

Interrupting Speech

Use Cases

Virtual Customer Service

Interactive Training

Best Practices

Limitations

11 KiB

Raw Blame History