How It Works
The AI Avatar tile generates a virtual presenter based on the script and settings you provide. You choose the avatar’s appearance, voice, and delivery style, and Mosaic produces a video of the avatar speaking your script with natural lip sync and gestures.
Input & Settings
Avatar Selection
Choose from a library of AI-generated presenters with different appearances, genders, and styles.
Use cases:
- Corporate training and onboarding videos
- Product explainers and demos
- Educational content and tutorials
- Social media content where you don’t want to be on camera
Script
Provide the text for the avatar to speak.
Best practices:
- Keep sentences concise and conversational
- Use punctuation for natural pauses
- Break long scripts into logical sections
- Write as if speaking to someone directly
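One way to follow the "break long scripts into logical sections" advice is to split on sentence boundaries before pasting text into the Script field. The sketch below is only an illustration: `split_script` and its `max_chars` limit are hypothetical names and values, not part of Mosaic.

```python
import re


def split_script(script: str, max_chars: int = 400) -> list[str]:
    """Group whole sentences into sections of at most max_chars characters.

    max_chars is an arbitrary illustrative limit, not a Mosaic constraint.
    A single sentence longer than max_chars is kept whole in its own section.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    pieces = re.split(r"(?<=[.!?])\s+", script.strip())
    sentences = [p.strip() for p in pieces if p.strip()]

    sections: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the section.
            sections.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections
```

Each returned section ends on a complete sentence, so punctuation-based pauses stay intact within a section.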
Voice
Select a voice style that matches the avatar and your content’s tone. Options vary by avatar, but typically include:
- Professional: clear and authoritative
- Friendly: warm and approachable
- Energetic: upbeat and engaging
Language
Choose the language for the avatar’s speech. Multiple languages are supported for global content creation.
Usage Recommendations
Use AI Avatar to:
- Create presenter-led videos without filming
- Produce multilingual content at scale
- Add a human touch to explainer or tutorial videos
- Build consistent brand spokesperson content
Pairs well with:
- AI B-Roll (add visual variety behind the avatar)
- AI Music (add background score)
- Captions (add subtitles for accessibility)
- Reframe (adapt to different platforms)
API Info
Node Params & API Details
- Node ID: b3b4c9e2-2a47-4fa9-8ce8-0c1fa1d7b6ef
Node params
| Param | Type | Required | Default | Notes |
|---|---|---|---|---|
| brief | string | Yes | "" | High-level intent/context (validated, ~1-1000 chars). |
| script | string | Conditional | "" | Spoken script. Required unless using Fabric 1 with uploaded voiceover_id. |
| video_model | "kling-2.6-pro" \| "kling-3-standard" \| "kling-3-pro" \| "fabric-1" | Yes | "kling-2.6-pro" | Generation model choice. |
| single_take | boolean | No | false | Enables single-take rendering path and script length changes. |
| aspect_ratio | "9:16" \| "16:9" \| "auto" | No | "9:16" | Output framing mode. |
| creation_mode | "manual" \| "video_reference" | No | "manual" | Avatar generation flow selection. |
| reference_video_id | string (UUID) | Conditional | unset | Used when creation_mode="video_reference". |
| reference_change_request | string | No | "" | Optional instructions when using video-reference mode. |
| product_image_id | string (UUID) | No | unset | Product reference image. |
| character_image_id | string (UUID) | No | unset | Avatar/character image override. |
| voice_reference_id | string (UUID) | No | unset | Voice reference asset ID for cloning. |
| voice_reference_type | "audio" | Conditional | unset | Set to "audio" when voice_reference_id is provided. |
| voiceover_id | string (UUID) | Conditional | unset | Optional voiceover upload for Fabric 1 lip-sync path. |
| voiceover_type | "audio" | Conditional | unset | Set to "audio" when voiceover_id is provided. |
| elevenlabs_model_id | string | No | unset | Explicit ElevenLabs model override. |
| elevenlabs_voice_settings | {stability?:number,similarity_boost?:number,style?:number,use_speaker_boost?:boolean,speed?:number} | No | unset | Fine-grained TTS tuning object. |
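The conditional rules in the table can be checked before a node is submitted. The following is a minimal sketch of such a pre-check, assuming params are passed as a plain dictionary keyed by the names above; `check_avatar_params` is a hypothetical helper, not a Mosaic API, and the brief length bound is treated as exact even though the table only gives it as approximate.

```python
from typing import Any

# Model names copied from the video_model row of the params table.
VIDEO_MODELS = {"kling-2.6-pro", "kling-3-standard", "kling-3-pro", "fabric-1"}


def check_avatar_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems: list[str] = []

    # The table describes brief as validated at roughly 1-1000 characters.
    if not 1 <= len(params.get("brief", "")) <= 1000:
        problems.append("brief should be roughly 1-1000 characters")

    model = params.get("video_model")
    if model not in VIDEO_MODELS:
        problems.append(f"unknown video_model: {model!r}")

    # script is required unless Fabric 1 is used with an uploaded voiceover.
    uses_voiceover = model == "fabric-1" and bool(params.get("voiceover_id"))
    if not params.get("script") and not uses_voiceover:
        problems.append("script is required unless fabric-1 has a voiceover_id")

    # reference_video_id is used when creation_mode is "video_reference".
    if (params.get("creation_mode", "manual") == "video_reference"
            and not params.get("reference_video_id")):
        problems.append("reference_video_id is needed for video_reference mode")

    # Each asset ID should be accompanied by its "audio" type marker.
    for id_key, type_key in (("voice_reference_id", "voice_reference_type"),
                             ("voiceover_id", "voiceover_type")):
        if params.get(id_key) and params.get(type_key) != "audio":
            problems.append(f'{type_key} should be "audio" when {id_key} is set')

    return problems
```

A caller would typically run this on the params dictionary and only proceed when the returned list is empty; the server-side validation may differ from these checks.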
Parameter groups
- Core generation: brief, script, video_model, single_take, aspect_ratio
- Creation flow: creation_mode, reference_video_id, reference_change_request
- Visual references: product_image_id, character_image_id
- Voice references: voice_reference_id, voice_reference_type, voiceover_id, voiceover_type
- Single-take voice tuning: elevenlabs_model_id, elevenlabs_voice_settings
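When reviewing or logging a node configuration, it can help to view a flat params dictionary organized by the groups listed above. This is a small sketch under that assumption; `PARAM_GROUPS` and `group_params` are illustrative names, not part of Mosaic.

```python
from typing import Any

# Group names and members copied from the parameter groups list.
PARAM_GROUPS: dict[str, tuple[str, ...]] = {
    "Core generation": ("brief", "script", "video_model", "single_take", "aspect_ratio"),
    "Creation flow": ("creation_mode", "reference_video_id", "reference_change_request"),
    "Visual references": ("product_image_id", "character_image_id"),
    "Voice references": ("voice_reference_id", "voice_reference_type",
                         "voiceover_id", "voiceover_type"),
    "Single-take voice tuning": ("elevenlabs_model_id", "elevenlabs_voice_settings"),
}


def group_params(params: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Arrange the params that are present under their documented group."""
    grouped: dict[str, dict[str, Any]] = {}
    for group, names in PARAM_GROUPS.items():
        present = {name: params[name] for name in names if name in params}
        if present:  # skip groups with no params set
            grouped[group] = present
    return grouped
```

Keys that do not appear in any group are simply left out of the result, which makes unexpected params easy to spot by comparing against the input.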