
How It Works
The Captions tile listens to your video's audio, transcribes the speech, and overlays timed captions. You control how they look, from fonts and stroke to placement and coloring. Captions are timed to the speech automatically and update in real time as you change styling options.
Input and Settings
Caption Style
Controls how captions are visually rendered. Common styles:
- Colored Words → Highlights key words for engagement
- Stroke Text → Outlined text for readability
- Full Highlight → Blocks/background for high contrast
Typical uses:
- Retention editing
- Educational clips
- Talking-head explanatory content
Colors
You can customize three visual layers:
- Base Color → Default text color (e.g., #FFFFFF for white)
- Highlight Color → Used to emphasize specific words; increases engagement and readability
- Stroke Color → Outline around text for contrast on busy footage
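If you need these colors outside the tile (for example, to match a thumbnail or overlay), a hex value such as #FFFFFF can be converted to RGB components. The following is a minimal sketch: `hex_to_rgb` and the `caption_colors` dictionary are hypothetical names, and the highlight and stroke values are placeholders, not defaults of the Captions tile.

```python
def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Convert a 6-digit hex color like '#FFFFFF' to an (R, G, B) tuple."""
    value = color.removeprefix("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {color!r}")
    # Each pair of hex digits is one 0-255 channel.
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


# Hypothetical example values for the three layers described above.
caption_colors = {
    "base": "#FFFFFF",       # white text, as in the example above
    "highlight": "#FFD400",  # placeholder highlight color (assumption)
    "stroke": "#000000",     # placeholder outline color (assumption)
}

rgb_layers = {name: hex_to_rgb(code) for name, code in caption_colors.items()}
```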
Font Options
Set the typography style to match your brand. Controls include:
- Font Family (e.g., Montserrat)
- Font Weight (e.g., 400 / 700)
- Font Size (slider)
Suggestions:
- Bold fonts for TikTok/Shorts
- Light fonts for cinematic edits
Vertical Position
Adjust how high or low captions sit in the frame (via percentage slider). Examples:
- 90% → Just above bottom edge (common for shorts)
- 50% → Centered (cinematic)
- 20% → Top aligned (when lower third is busy)
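To get a feel for what a slider value means in pixels, the percentage can be mapped onto the frame height. This sketch assumes the percentage is measured from the top of the frame, which matches the examples above (90% near the bottom, 20% near the top). `caption_y` is a hypothetical helper, and which point of the caption the tile anchors to is not specified here.

```python
def caption_y(frame_height: int, position_percent: float) -> int:
    """Return the vertical pixel offset from the top of the frame.

    Assumes 0% is the top edge and 100% is the bottom edge.
    """
    if not 0 <= position_percent <= 100:
        raise ValueError("position_percent must be between 0 and 100")
    return round(frame_height * position_percent / 100)


# A 1080x1920 vertical video (typical for shorts):
bottom = caption_y(1920, 90)    # 1728 px from the top
centered = caption_y(1920, 50)  # 960 px from the top
top = caption_y(1920, 20)       # 384 px from the top
```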
Words per Caption
Controls pacing and readability. Two sliders:
- Minimum words
- Maximum words
Examples:
- Min 1 / Max 3 → Fast TikTok pacing
- Min 3 / Max 7 → YouTube educational pacing
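The effect of the two sliders can be illustrated by splitting a transcript into groups of words. This is only a sketch of one possible grouping rule, not the Captions tile's actual algorithm, which also accounts for speech timing. `group_words` is a hypothetical function: it fills each caption up to the maximum, then moves words from the previous caption into the last one if the last one would otherwise fall below the minimum.

```python
def group_words(words: list[str], min_words: int, max_words: int) -> list[list[str]]:
    """Split words into captions of at most max_words each.

    If the final caption is shorter than min_words, borrow words from the
    previous caption when that caption can spare them; otherwise the final
    caption is left short.
    """
    if not 1 <= min_words <= max_words:
        raise ValueError("require 1 <= min_words <= max_words")

    # Greedy pass: fill every caption up to the maximum.
    chunks = [words[i:i + max_words] for i in range(0, len(words), max_words)]

    # Rebalance a too-short final caption.
    if len(chunks) >= 2 and len(chunks[-1]) < min_words:
        need = min_words - len(chunks[-1])
        if len(chunks[-2]) - need >= min_words:
            chunks[-1] = chunks[-2][-need:] + chunks[-1]
            chunks[-2] = chunks[-2][:-need]
    return chunks


sentence = "captions keep viewers watching until the very end".split()

fast = group_words(sentence, min_words=1, max_words=3)
slow = group_words(sentence, min_words=3, max_words=7)
```

With the eight-word sentence above, the Min 1 / Max 3 setting yields three short captions, while Min 3 / Max 7 yields two longer ones (five words, then three), which is why the lower setting reads as faster pacing.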