Prompt lab
A great prompt is not a sentence — it is a creative brief for sound. The more clearly you describe the scene, voices, dialogue, music, sound effects and emotion, the easier it is to generate audio that feels polished, cinematic and ready to use. This guide shows you the exact structure, formulas and prompt examples that consistently produce broadcast-ready output from Seed Audio 1.0.
[Audio #1: Hero opening demo. Cinematic radio drama opening, ~20 seconds, generated from the prompt below in one pass with no post-production.]
Prompt used: "Create a cinematic radio drama scene opening. Setting: a stormy night inside an old coastal lighthouse. Heavy rain hits the windows, distant thunder rolls over the ocean, and the lighthouse lamp rotates with a low mechanical hum. Music: subtle cinematic score with deep strings, soft piano, low ambient drones. Mood: mysterious, emotional, suspenseful. Narrator: clear audiobook narration, calm but tense: 'On the night the lighthouse went dark, Clara found the letter her father had hidden for twenty years.' End with rising strings and a single foghorn."
Prompt system
Create a [type of audio] for [scenario]. Setting: [place + atmosphere]. Mood: [emotion]. Music: [style + instruments + intensity]. SFX: [key sounds]. Characters: [roles + performance direction]. Dialogue: [natural lines]. Ending: [final sound or emotional beat].
Structure: 9 prompt elements
References: @Audio 1, @Audio 2...
Examples: 6 ready-to-use prompts
Output: cinematic audio
Most audio AI tools treat your prompt as a single line of text that needs to be voiced. Seed Audio 1.0 is different — it reads your prompt the way a film director reads a treatment. That means your prompt should describe a scene, not just a sentence.
A strong Seed Audio 1.0 prompt has nine layered elements that work together to give the model creative direction without overloading it: audio type, setting, mood, music, sound effects, characters, dialogue, pacing and ending. Master these nine elements and you can move from "AI voice generator" output to genuinely broadcast-grade productions.
Weak prompt: Make an audio story.
Vague. No scene. No characters. No mood.
Strong prompt: Create a cinematic radio drama scene for a serialized audiobook. Setting: a stormy night inside an old coastal lighthouse. Music: deep strings, soft piano, low ambient drones. Narrator, calm but tense: "On the night the lighthouse went dark, Clara found the letter her father had hidden for twenty years." End with rising strings and a foghorn.
Specific scene. Layered direction. Cinematic outcome.
Whenever you don't know where to start, use this formula. It works for every audio type — radio drama, ad, podcast, video dubbing, voice companion, game audio.
Create a [type of audio] for [scenario]. Setting: [place + atmosphere]. Mood: [emotion]. Music: [style + instruments + intensity]. SFX: [key sounds]. Characters: [roles + performance direction]. Dialogue: [natural lines]. Ending: [final sound or emotional beat].
That's it. Fill each line. Skip what you don't need. The model handles timing, transitions and mixing automatically.
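As a rough illustration of filling in the formula, the sketch below assembles a prompt string from named fields and skips any that are left empty. The `build_prompt` helper and its keyword names are hypothetical conveniences, not part of Seed Audio 1.0; it only produces text you would then paste into the prompt box. The sample values are borrowed from the coffee ad example in this guide, with SFX and dialogue left out to show skipping.

```python
# Hypothetical helper: nothing here is a Seed Audio 1.0 API.
# Field order mirrors the formula above.
FIELDS = ["Setting", "Mood", "Music", "SFX", "Characters", "Dialogue", "Ending"]


def build_prompt(audio_type, scenario, **elements):
    """Assemble a prompt from the formula, skipping elements that are empty."""
    parts = [f"Create a {audio_type} for {scenario}."]
    for field in FIELDS:
        value = elements.get(field.lower(), "").strip()
        if not value:
            continue  # "Skip what you don't need."
        # Close each element with a full stop unless it already ends a sentence or quote.
        if value[-1] not in ".!?\"'":
            value += "."
        parts.append(f"{field}: {value}")
    return " ".join(parts)


prompt = build_prompt(
    "polished 30-second audio ad",
    "Golden Hour Coffee",
    setting="early morning in a bright city apartment",
    mood="fresh, optimistic, premium",
    music="warm upbeat with soft guitar, light piano, subtle percussion",
    ending="soft chime, gentle bass hit, fading coffee shop ambience",
)
print(prompt)
```

Keeping the elements as separate arguments makes it easy to swap one layer (for example the mood) while leaving the rest of the brief unchanged.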
[Audio #2: Formula in action. The formula above filled in line by line for a 30-second coffee brand ad.]
Below is the full structure that consistently produces high-quality output. You don't need to use every element — but knowing what each one controls makes your prompts dramatically better.
Start by stating exactly what kind of audio you want. Common Seed Audio 1.0 audio formats include radio drama, ad, podcast, video dubbing, voice companion and game audio.
The setting defines the listener's space. Include location, time of day, weather, room tone, background ambience, distance and movement.
[Audio #3: Setting, before. ~15-second clip from a weak setting prompt: flat, unspecific.]
[Audio #4: Setting, after. ~15-second clip from a strong setting prompt: spatial, cinematic, specific.]
Mood tells Seed Audio 1.0 how the whole scene should feel, and how that feeling should change.
Mood: calm and intimate at first, then gradually tense and dangerous as the hidden door opens.
Music works best when you describe style, instruments, intensity and when it should rise or fall.
Music: subtle cinematic score with deep strings, soft piano and low ambient drones. Keep it quiet under dialogue, then rise before the final reveal.
[Audio #5: Music direction, before. ~15 seconds. Weak prompt: "Add music".]
[Audio #6: Music direction, after. ~15 seconds. Strong prompt: cinematic score with clear instruments and timing.]
Sound effects should support narrative beats. Place them where they happen instead of listing them randomly.
SFX: rain on glass throughout. Thunder crack as Clara opens the letter. Foghorn after Elias says the final line.
Define characters by role, personality and performance direction. Do not ask for exact imitation of real people.
Narrator: clear audiobook narration, calm but tense. Detective Ray: tired, sharp, low voice. Maya: whispering, afraid, trying to stay composed.
Natural spoken dialogue beats stiff written dialogue. Use contractions, short sentences, pauses and emotional beats.
[Audio #7: Dialogue, before. ~15 seconds. Weak dialogue: stiff and written.]
[Audio #8: Dialogue, after. ~15 seconds. Strong dialogue: spoken and natural.]
Pacing controls how quickly the scene moves. Tell the model when to pause, when music should rise, and when a beat should land.
Pacing: slow and tense for the first 10 seconds, then dialogue becomes urgent. Pause before the final reveal.
A strong ending tells the model how to close the audio emotionally and sonically.
Seed Audio 1.0 supports multiple reference audios in a single prompt — so you can assign different cloned voices to different speakers in one generation. This is what turns a single-narrator output into a full-cast radio drama, podcast or commercial.
When writing your Seed Audio 1.0 prompt, clearly state which character should use which reference audio. The recommended syntax:
Character Name: [role, personality, speaking style], performed by @Audio 1: "Dialogue line here."
Host A, performed by @Audio 1, speaks in a calm and curious tone: "Today we're talking about whether household robots would actually make life better." Host B, performed by @Audio 2, replies playfully: "Helpful? Sure. But I don't need a robot judging my midnight snacks."
Once you bind a reference audio to a role, do not switch it mid-prompt unless the story requires a clear character transformation (e.g. possession, disguise, flashback). Switching the same character between @Audio 1 and @Audio 2 will produce inconsistent voice identity.
Reference audio controls who is speaking. Prompt text controls what happens around the voice: music, ambience, room tone, SFX and emotional direction.
Create a [audio format] with multiple speakers. Use @Audio 1 as [Character A]. Use @Audio 2 as [Character B]. Use @Audio 3 as [Narrator]. [Character A], performed by @Audio 1, [emotion]: "[dialogue]" [Character B], performed by @Audio 2, [emotion]: "[dialogue]"
[Audio #9: Multi-reference demonstration. Multi-character scene with 3 reference voices, ~45 seconds.]
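The consistency rule above (one reference audio per character) can be checked mechanically before you submit a prompt. The sketch below is a hypothetical pre-flight check, not a Seed Audio 1.0 feature: it assumes bindings are written in the `Name, performed by @Audio N` form used in this guide, and its regular expression is a simplification that will miss other phrasings.

```python
import re

# Assumed binding pattern: "<Name>, performed by @Audio <N>".
# This is a simplification for illustration, not the model's own parser.
BINDING = re.compile(r"([A-Z][\w ]*?), performed by (@Audio \d+)")


def conflicting_bindings(prompt):
    """Return {character: references} for characters bound to more than one reference."""
    seen = {}
    for name, ref in BINDING.findall(prompt):
        seen.setdefault(name.strip(), set()).add(ref)
    return {name: refs for name, refs in seen.items() if len(refs) > 1}


draft = 'Host A, performed by @Audio 1: "Hi." Host A, performed by @Audio 2: "Bye."'
print(conflicting_bindings(draft))  # Host A is flagged with both references
```

An empty result means every matched character keeps a single reference throughout the prompt. Intentional switches (disguise, flashback) would still be flagged, so treat the output as a reminder rather than an error.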
Each example below is a full Seed Audio 1.0 prompt you can copy, paste and modify. Listen to the real generated output, then adapt the structure to your own scene.
Create a cinematic radio drama scene for a serialized audiobook. Setting: a stormy night inside an old coastal lighthouse. Heavy rain hits the windows, distant thunder rolls over the ocean, and the lighthouse lamp rotates with a low mechanical hum. Music: subtle cinematic score with deep strings, soft piano, low ambient drones, and light percussion. Mood: mysterious, emotional, and suspenseful. Narrator: clear audiobook narration, calm but tense: "On the night the lighthouse went dark, Clara found the letter her father had hidden for twenty years." SFX: paper envelope opening, wind pushing against a wooden door. Clara, anxious but determined: "This can't be real. He said the island was abandoned." Elias, quiet and protective: "Your father lied to keep you alive. Some stories are buried for a reason." SFX: thunder crack, glass rattling, distant foghorn. Clara: "Then tell me the truth. What's under the lighthouse?" Elias pauses. Music drops lower. Elias: "Not under it. Inside it." SFX: metal gears turning, hidden stone door opening, deep underground air rushing out. End with rising strings, ocean waves, fading rain, and one final lighthouse bell.
[Audio #10: Radio drama example]
Create a polished 30-second audio ad for Golden Hour Coffee. Setting: early morning in a bright city apartment. Mood: fresh, optimistic, premium. Music: warm upbeat with soft guitar, light piano, subtle percussion. SFX: coffee beans pouring, espresso machine steaming, ceramic cup on counter. Narrator, confident commercial voice: "Every morning starts with a choice. Rush through the day, or take one perfect moment for yourself." Ending: soft chime, gentle bass hit, fading coffee shop ambience.
[Audio #11: Advertising example]
Create a 60-second podcast segment with two hosts discussing whether household robots will make life better. Use natural reactions, light laughs, warm intro music, subtle room tone and clear turn-taking between Host A and Host B.
[Audio #12: Podcast example]
Create a dubbing track for a sci-fi rescue video. Setting: damaged spacecraft corridor with alarms, steam vents and radio static. Two characters speak urgently while trying to restore power before oxygen runs out. Music: tense synth drones rising under dialogue.
[Audio #13: Video dubbing example]
Create an evening AI companion conversation. Setting: quiet apartment at night, soft rain outside, warm room tone. Mood: supportive, calm, reflective. Character speaks gently, asks one thoughtful question, and ends with a reassuring line.
[Audio #14: Voice companion example]
Create a fantasy forest ruin exploration soundscape. Setting: ancient stone archway in a moonlit forest. SFX: distant owls, leaves, footsteps, magic shimmer, low creature rumble. Music: subtle orchestral pads and hand percussion. End with a hidden door opening.
[Audio #15: Game soundscape example]
Weak: Make a cool audio.
Better: Create a tense sci-fi radio drama scene inside a damaged spacecraft. Add low synth drones, warning alarms, radio static, and two characters speaking urgently as they try to restore power before oxygen runs out.
Weak: Add beeps, then beeps, then more beeps, then another beep.
Better: Add a short sequence of UI beeps, screen taps, and data-loading sounds.
Weak: Make the voice sound exactly like [real person].
Better: Use a fictional public-speaker style: confident, energetic, and theatrical. Do not imitate any real voice exactly.
Weak: @Audio 1 says something, then @Audio 2 talks, then the narrator speaks.
Better: Host A, performed by @Audio 1: "…" Host B, performed by @Audio 2: "…" Narrator, performed by @Audio 3: "…"
Weak: Host A, performed by @Audio 1, cheerful, bright, friendly, warm, casual, energetic, natural, realistic, speaks…
Better: Host A, performed by @Audio 1, cheerful and natural.
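Two of the mistakes above, repeated effects and overloaded adjectives, are simple enough to spot with a small script. The sketch below is only an illustration: `count_descriptors` and `repeated_words` are hypothetical helpers, and the thresholds (words of five or more letters, three or more repetitions) are arbitrary assumptions rather than anything Seed Audio 1.0 defines.

```python
import re
from collections import Counter


def count_descriptors(direction):
    """Count comma- or 'and'-separated descriptors in a performance direction."""
    parts = re.split(r",|\band\b", direction)
    return len([p for p in parts if p.strip()])


def repeated_words(text, min_count=3):
    """Return words of 5+ letters that occur at least min_count times."""
    words = re.findall(r"[a-z]{5,}", text.lower())
    return {word for word, n in Counter(words).items() if n >= min_count}


overloaded = "cheerful, bright, friendly, warm, casual, energetic, natural, realistic"
print(count_descriptors(overloaded))              # 8
print(count_descriptors("cheerful and natural"))  # 2
print(repeated_words("Add beeps, then beeps, then more beeps, then another beep."))  # {'beeps'}
```

A high descriptor count or a non-empty set of repeated words is a hint to trim the direction down to the two or three words that matter most.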
The best Seed Audio 1.0 prompt follows a 9-element structure: audio type, setting, mood, music, sound effects, characters, dialogue, pacing and ending. Filling in each layer gives the model a complete creative brief instead of a single sentence.
For a short demo, 800–1,500 characters is ideal. For a detailed scene, 1,500–2,000 characters works best. Stay within the model's character limit: longer prompts often produce worse results, not better.
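The ranges above are easy to check with a character count. The sketch below is a hypothetical helper, not a Seed Audio 1.0 feature; because the two ranges share the 1,500 boundary, it assumes exactly 1,500 characters counts as a short demo.

```python
def length_band(prompt):
    """Classify a prompt by character count using the ranges given in this guide."""
    n = len(prompt)
    if n < 800:
        return "below the short-demo range"
    if n <= 1500:  # assumption: the shared 1,500 boundary counts as a short demo
        return "short demo"
    if n <= 2000:
        return "detailed scene"
    return "above the detailed-scene range"


print(length_band("Make an audio story."))  # below the short-demo range
```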
Use the 9-element structure plus character role descriptions. Define each character by function and emotion (not by celebrity name), write natural spoken dialogue, layer in SFX at key moments, and direct the music to rise and fall around the dialogue. See the radio drama example above for a complete prompt template.
Yes. Seed Audio 1.0 supports multiple reference audios in one prompt. Use the syntax: `Character Name, performed by @Audio 1, [emotion]: "dialogue line"`. Keep each character’s reference audio consistent throughout the prompt — do not switch the same role between different @Audio IDs unless the story requires a clear transformation.
You can assign up to 8 distinct reference audios in a single generation on the Pro plan (more on the Business plan). Each one is bound to a named character role. The model then performs all characters together inside one generated scene.
A weak prompt gives only a vague instruction. A strong prompt defines audio type, setting, mood, music, sound effects, characters, dialogue, pacing and ending. Strong prompts sound more cinematic because the model has enough creative direction to arrange the scene.
Yes. Emotion words guide both vocal performance and music direction. Use emotional arcs when possible: calm then tense, playful then sincere, intimate then dramatic.
Usually the dialogue is too written, the character role is unclear, or the prompt lacks emotion and pacing. Rewrite lines in natural speech, add pauses or reactions, and use reference audio if you need a specific voice identity.
Avoid generic prompts, repeated effects, exact imitation requests, unclear reference assignments and overloaded adjectives. Keep each instruction specific and useful.
Yes. You can copy and adapt the prompts in this guide. Commercial rights for generated output depend on your Seed Audio 1.0 plan. Paid plans include commercial usage rights.
You've got the structure, the formula, the templates and the examples. Open Seed Audio 1.0 and put one of these prompts to work.