Modify spoken audio in your video using AI — including dubbing into different languages, swapping specific words, and generating lip-synced speech. This is valuable for repurposing videos globally, correcting misspoken words, or adjusting terminology without re-recording.

How It Works

The Voice tile analyzes your video’s speech, generates replacement audio based on your selections, and merges it back into the timeline. If enabled, it also adjusts lip movements to match the new audio.

Input and Settings

Target Language

Choose the output language for the voice. Use cases:
  • Global distribution (for example, English to Spanish or French)
  • Multilingual short-form content
  • Translating educational or tutorial content

Preserve Background Audio

Choose whether to keep original ambient sounds or music.
  • Yes — keeps natural room tone, gameplay audio, background music
  • No — outputs clean isolated speech for controlled mixing

Lip Sync

Aligns mouth movements to match the new speech. Essential when dubbing to another language, replacing script sections, or creating seamless speech edits.

Advanced features (limited provider support): The Voice tile also supports Safewords (words that should never be modified, like brand names) and a Translation Dictionary (custom word/phrase replacements for fine-tuned localization). Availability depends on your selected voice provider.

Voice pairs well with Captions (add subtitles in the new language), Reframe (adapt for different platforms), and Audio Enhance (polish the final audio).