Audio Generation creates new audio assets. Use voiceover mode for spoken narration and music mode for background tracks.

How It Works

For voiceover, provide a script and optionally attach a voice reference. For music, describe the mood, genre, instrumentation, and target length. The output can feed video-generation, AI Avatar, captions, or publishing workflows.
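
As a rough illustration of the two modes, the sketch below builds the node params for each one. The helper names `voiceover_params` and `music_params` are hypothetical and not part of the Mosaic API; the field names and values follow the Node params table and the examples on this page, and the seconds-to-milliseconds conversion assumes `music_length_ms` is expressed in milliseconds, as its name suggests.

```python
from typing import Optional


def voiceover_params(script: str, voice_id: Optional[str] = None) -> dict:
    """Build node params for voiceover mode without an upstream voice reference."""
    if not script.strip():
        raise ValueError("voiceover mode needs a non-empty script")
    params = {
        "mode": "voiceover",
        "model": "eleven_v3",
        "script": script,
        "use_upstream_voice_reference": False,
    }
    # voice_id is optional; when omitted, the documented default voice applies.
    if voice_id is not None:
        params["voice_id"] = voice_id
    return params


def music_params(prompt: str, length_seconds: float = 30.0) -> dict:
    """Build node params for music mode; the length is given in seconds."""
    if not prompt.strip():
        raise ValueError("music mode needs a non-empty music_prompt")
    return {
        "mode": "music",
        "model": "music",
        "music_prompt": prompt,
        # Assumption: the API expects milliseconds, so convert here.
        "music_length_ms": int(round(length_seconds * 1000)),
    }
```

For example, `music_params("Warm upbeat electronic background music", 45)` yields the same shape as the music example at the end of this page.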

API Usage

Create an agent shell (an empty agent with no nodes yet):
curl -X POST "https://api.mosaic.so/agent/create" \
  -H "Authorization: Bearer mk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Launch voiceover",
    "visibility": "private"
  }'

Add an Audio Generation node, replacing AGENT_ID with the ID of the agent you just created:
curl -X POST "https://api.mosaic.so/agent/AGENT_ID/update" \
  -H "Authorization: Bearer mk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "operations": [
      {
        "op": "create_node",
        "node_type_id": "14687f30-5fd0-468f-8239-2784d83df95b",
        "params_used": {
          "mode": "voiceover",
          "model": "eleven_v3",
          "script": "Introducing the fastest way to turn ideas into publish-ready videos.",
          "use_upstream_voice_reference": false
        }
      }
    ]
  }'

Run the agent. No video inputs are needed, so the request body carries only a callback_url:
curl -X POST "https://api.mosaic.so/agent/AGENT_ID/run" \
  -H "Authorization: Bearer mk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "callback_url": "https://your-webhook.com/mosaic"
  }'
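
The three curl calls above could be prepared from Python along these lines. This is a sketch using only the standard library: the function names are hypothetical, the endpoints, headers, and body shapes are copied from the curl examples, and nothing is sent until you pass a request to `urllib.request.urlopen`. The response format is not described on this page, so reading the new agent's ID out of the create response is left to the caller.

```python
import json
import urllib.request

API_BASE = "https://api.mosaic.so"
# Audio Generation node type ID, as listed under "API Info".
NODE_TYPE_ID = "14687f30-5fd0-468f-8239-2784d83df95b"


def build_request(path: str, api_key: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{API_BASE}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_agent_request(api_key: str, name: str) -> urllib.request.Request:
    """Step 1: create a private agent shell."""
    return build_request("/agent/create", api_key,
                         {"name": name, "visibility": "private"})


def add_audio_node_request(api_key: str, agent_id: str,
                           params: dict) -> urllib.request.Request:
    """Step 2: add an Audio Generation node with the given params."""
    operation = {
        "op": "create_node",
        "node_type_id": NODE_TYPE_ID,
        "params_used": params,
    }
    return build_request(f"/agent/{agent_id}/update", api_key,
                         {"operations": [operation]})


def run_agent_request(api_key: str, agent_id: str,
                      callback_url: str) -> urllib.request.Request:
    """Step 3: run the agent; only a callback URL is supplied."""
    return build_request(f"/agent/{agent_id}/run", api_key,
                         {"callback_url": callback_url})
```

Keeping request construction separate from sending makes each step easy to inspect or log before it touches the API.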

API Info

  • Node ID: 14687f30-5fd0-468f-8239-2784d83df95b (pass this as node_type_id in a create_node operation)

Node params

| Param | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| mode | "voiceover" or "music" | No | "voiceover" | Audio generation mode. |
| model | "eleven_v3" or "music" | No | "eleven_v3" | Audio model for the selected mode. |
| script | string | Conditional | "" | Required for voiceover mode. |
| voice_id | string | No | curated default | Voice ID when no upstream voice reference is used. |
| use_upstream_voice_reference | boolean | No | true | Use selected upstream audio/video as a voice reference. |
| selected_context | {audio: string[], videos: string[], voice_reference?: string} | No | empty lists | Attached references. |
| music_prompt | string | Conditional | "" | Required for music mode. |
| music_length_ms | number | No | 30000 | Music duration in milliseconds. Allowed range is 5-300 seconds (5000-300000 ms). |
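
The conditional rules in the table can be checked on the client before a request is sent. The sketch below is one possible pre-flight check, not Mosaic's own validation: `validate_audio_params` is a hypothetical helper, and it assumes the documented 5-300 second range maps to 5000-300000 for `music_length_ms`. The server may apply additional or different rules.

```python
def validate_audio_params(params: dict) -> list:
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    # mode defaults to "voiceover" when omitted, per the table.
    mode = params.get("mode", "voiceover")
    if mode not in ("voiceover", "music"):
        problems.append(f"unknown mode: {mode!r}")
        return problems

    if mode == "voiceover":
        # script is required (and defaults to an empty string) in voiceover mode.
        if not params.get("script"):
            problems.append("script is required for voiceover mode")
    else:
        if not params.get("music_prompt"):
            problems.append("music_prompt is required for music mode")
        # Assumption: 5-300 seconds corresponds to 5000-300000 milliseconds.
        length_ms = params.get("music_length_ms", 30000)
        if not 5000 <= length_ms <= 300000:
            problems.append("music_length_ms must be between 5000 and 300000")
    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem in one pass.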

Voiceover example

{
  "mode": "voiceover",
  "model": "eleven_v3",
  "script": "This is the voiceover text.",
  "use_upstream_voice_reference": false
}

Music example

{
  "mode": "music",
  "model": "music",
  "music_prompt": "Warm upbeat electronic background music for a product launch",
  "music_length_ms": 45000
}