Modify spoken audio in your video using AI, including dubbing into other languages, swapping specific words, and generating lip-synced speech. This is useful for distributing videos to a global audience, correcting misspoken words, or adjusting brand or script terminology without re-recording.
How It Works
The Voice tile analyzes your video’s speech, generates replacement audio based on your selections, and merges it back into the timeline. If Lip Sync is enabled, it also adjusts lip movements to match the new audio.
Target Language
Choose the output language for the voice.
Use cases:
- Global distribution (English ➜ Spanish ➜ French, etc.)
- Multilingual short-form content
- Translating educational/tutorial content
Preserve Background Audio
Choose whether to keep original ambient sounds or music.
- Yes: Keeps natural room tone, gameplay audio, and background music
- No: Outputs clean isolated speech for controlled mixing
Lip Sync
Aligns mouth movements to match the new speech.
Use when:
- Dubbing to another language
- Replacing sections of spoken script
- Creating seamless speech edits
Safewords (Limited provider support)
Enter words you do not want replaced or modified (e.g., brand names, product names, proper nouns).
Examples:
Mosaic
OpenAI
iPhone
Coffee
Useful for:
- Protecting brand accuracy
- Preventing mistranslations
- Maintaining product terminology
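This page does not describe how safewords are enforced internally. As a rough illustration only, one common way to keep protected terms out of a translation step is to swap them for placeholder tokens beforehand and restore them afterwards. The sketch below assumes that approach; the function names `mask_safewords` and `unmask_safewords` and the `[[n]]` placeholder format are hypothetical and are not part of the Voice tile.

```python
import re


def mask_safewords(text, safewords):
    """Swap each safeword occurrence for a numbered placeholder.

    Returns the masked text and the list of original matches, in order.
    Hypothetical helper for illustration; not a Voice tile API.
    """
    originals = []
    if not safewords:
        return text, originals

    # Longest terms first so a short safeword cannot split a longer one.
    ordered = sorted(safewords, key=len, reverse=True)
    alternation = "|".join(re.escape(word) for word in ordered)
    # Whole-word, case-insensitive matching is an assumption of this sketch.
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

    def stash(match):
        # Remember the exact text that was matched, including its casing.
        originals.append(match.group(0))
        return f"[[{len(originals) - 1}]]"

    return pattern.sub(stash, text), originals


def unmask_safewords(text, originals):
    """Put the stashed safewords back in place of their placeholders."""
    return re.sub(r"\[\[(\d+)\]\]", lambda m: originals[int(m.group(1))], text)


text = "Open Mosaic on your iPhone."
masked, originals = mask_safewords(text, ["Mosaic", "iPhone"])
# masked is now "Open [[0]] on your [[1]]."
# A translation step would run on `masked` here.
restored = unmask_safewords(masked, originals)
```

The masked text is what a translator would see, so the protected terms cannot be altered; whether a given provider leaves such placeholders untouched would need to be checked separately.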
Translation Dictionary (Limited provider support)
Define custom word/phrase replacements.
Example:
| Original | New |
|---|---|
| "AI editor" | "AI assistant" |
| "coffee" | "latte" |
Use cases:
- Localizing slang
- Controlling brand wording
- Fine-tuning translations
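To make the replacement behavior concrete, here is a minimal sketch of how a word/phrase dictionary like the one in the example table could be applied to a transcript. It is an assumption-laden illustration, not the Voice tile's actual logic: the function name `apply_dictionary` is hypothetical, and whole-word, case-insensitive matching is a design choice of this sketch.

```python
import re


def apply_dictionary(text, replacements):
    """Replace each dictionary phrase in `text` with its mapped value.

    Hypothetical helper for illustration; not a Voice tile API.
    """
    if not replacements:
        return text

    # Lowercased keys give a case-insensitive lookup (an assumption here).
    lookup = {source.lower(): target for source, target in replacements.items()}
    # Longest phrases first so "AI editor" wins over a shorter overlapping entry.
    ordered = sorted(lookup, key=len, reverse=True)
    alternation = "|".join(re.escape(phrase) for phrase in ordered)
    pattern = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)

    # A single pass means a replacement is never itself replaced again.
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


dictionary = {"AI editor": "AI assistant", "coffee": "latte"}
result = apply_dictionary("Our AI editor pairs well with coffee.", dictionary)
# result is "Our AI assistant pairs well with latte."
```

Matching on word boundaries keeps "coffee" from being rewritten inside a longer word such as "coffeehouse".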