Everything you need to write perfect prompts for ByteDance's most powerful AI video model. From first prompt to cinematic output.
Seedance 2.0 is ByteDance's next-generation AI video model, built on a dual-branch diffusion transformer architecture. Unlike other video models that rely mostly on text prompts, Seedance 2.0 takes a "reference-first" approach — it excels when you provide visual, motion, and audio references alongside your text prompt.
Think of it like directing a film: you're not just describing what you want, you're showing examples of the look, movement, and sound you're after. The text prompt then fine-tunes the details.
Every Seedance 2.0 prompt follows the same core structure. Think of it as filling in a template, not writing a poem. The model responds best to clear, structured prompts in this exact order:
Subject + Action + Scene + Camera + Style + Constraints
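As a rough illustration, the six-part order above can be captured in a small helper. This is a minimal sketch in Python: the `ShotPrompt` class and its `render` method are hypothetical names used for illustration, not part of any Seedance tooling, and merging subject, action, and scene into one opening sentence is an assumption about what reads naturally.

```python
from dataclasses import dataclass, field


@dataclass
class ShotPrompt:
    """Hypothetical container for the six prompt parts (not a Seedance API)."""

    subject: str
    action: str
    scene: str
    camera: str
    style: str
    constraints: list = field(default_factory=list)

    def render(self) -> str:
        # Subject, action, and scene are merged into one opening sentence.
        opening = f"{self.subject} {self.action} {self.scene}"
        sentences = [opening, self.camera, self.style]
        # Constraints come last, as a short comma-separated ban list.
        if self.constraints:
            sentences.append(", ".join(self.constraints))
        # Normalize each part so it ends with exactly one period.
        return " ".join(s.strip().rstrip(".") + "." for s in sentences if s.strip())


prompt = ShotPrompt(
    subject="A 40-year-old man in a navy suit",
    action="walks toward the camera",
    scene="down an empty hallway",
    camera="Medium shot, slow dolly-in",
    style="Overcast natural light",
    constraints=["No text overlays", "no extra characters"],
)
print(prompt.render())
```

Keeping each part in its own field makes it easy to change one part at a time while leaving the rest of the prompt untouched.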
Each part has a job. Let's break them down one by one.
The subject is who or what is in your video. Be specific but concise: one clear noun with a few concrete descriptors beats a paragraph of vague adjectives.
| Weak | Strong | Why |
|---|---|---|
| "A person" | "A 30-year-old woman with short black hair, wearing a linen jacket" | Specific age, features, and wardrobe prevent identity drift |
| "A car" | "A matte black 1970s muscle car with chrome bumpers" | Era, color, material, and details anchor the visual |
| "A product" | "A minimalist white ceramic mug on a dark oak workbench" | Material + color + placement = stable frame |
The action is the single movement happening in your shot. This is where most beginners go wrong — they pack in too many actions. Seedance works best with one clear action per shot.
Write actions in present tense with intensity descriptors:
| Weak | Strong |
|---|---|
| "Man roaring" | "Man roaring madly, veins visible on neck" |
| "Car turns the corner" | "Tires smoke as the car drifts 90 degrees around the corner" |
| "She walks and turns and waves and smiles" | "She walks slowly toward the camera, then stops and smiles" |
Timing tip: You can add duration cues like "walks for 3 seconds, stops, turns around for 2 seconds" — Seedance respects these fairly well.
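If you build prompts programmatically, duration cues like the one in the timing tip can be generated from a list of beats. The sketch below is only an illustration: `timed_action` and `total_seconds` are hypothetical helper names, and the phrasing "for N seconds" simply mirrors the example above.

```python
def timed_action(steps):
    """Join (action, seconds) pairs into one duration-cued phrase.

    Use None as the duration for an instantaneous beat such as "stops".
    """
    phrases = []
    for action, seconds in steps:
        if seconds is None:
            phrases.append(action)
        else:
            unit = "second" if seconds == 1 else "seconds"
            phrases.append(f"{action} for {seconds} {unit}")
    return ", ".join(phrases)


def total_seconds(steps):
    """Sum the timed beats so the total can be compared with the clip length."""
    return sum(seconds for _, seconds in steps if seconds is not None)


beats = [("walks", 3), ("stops", None), ("turns around", 2)]
print(timed_action(beats))
print(total_seconds(beats))
```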
Camera direction is one of Seedance 2.0's strongest features. Use standard film terminology — the model understands it natively. The golden rule: one camera move per shot. Compound movements cause chaos.
| Shot Type | Best For | Notes |
|---|---|---|
| Wide shot | Establishing scenes, landscapes | Pair with slow dolly or locked-off camera |
| Medium shot | Subject + context, dialogue | Handheld feels personal; gimbal feels polished |
| Medium close-up | Talking heads, product hero shots | Most versatile shot size |
| Close-up | Emotion, detail | Tiny push-ins work great; pans feel jarring |
| Macro / Extreme close-up | Product textures, food, skin detail | Keep camera very still |
| Movement | What It Does | Example Prompt Phrase |
|---|---|---|
| Dolly in/out | Camera physically moves toward/away from subject | "Slow dolly-in over 4 seconds" |
| Pan left/right | Camera rotates horizontally to reveal scene | "Gentle pan right revealing the cityscape" |
| Tilt up/down | Camera rotates vertically | "Slow tilt up from boots to face" |
| Tracking shot | Camera follows subject laterally | "Camera tracks alongside as she runs" |
| Orbit / 360 | Camera circles around subject | "Camera orbits the statue at eye level" |
| Handheld | Slight natural sway, authentic feel | "Handheld, slight sway, eye level" |
| Gimbal | Smooth, stabilized movement | "Gimbal-smooth follow shot from behind" |
| Locked-off / Tripod | Completely static camera | "Locked-off tripod, no camera movement" |
| Crane shot | Vertical camera lift | "Crane rises to reveal the valley below" |
| Aerial / Drone | Overhead or high-angle perspective | "Aerial shot descending toward the subject" |
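The "one camera move per shot" rule can be checked before you submit a prompt. The following is a minimal sketch, not a Seedance feature: the keyword patterns are drawn from the movement table and are illustrative rather than exhaustive, and treating handheld, gimbal, and locked-off as stabilization styles (not counted as moves) is an assumption.

```python
import re

# Illustrative patterns based on the movement table; not an exhaustive list.
CAMERA_MOVES = {
    "dolly": r"\bdolly\b",
    "pan": r"\bpan(s|ning)?\b",
    "tilt": r"\btilt(s|ing)?\b",
    "tracking": r"\btrack(s|ing)?\b",
    "orbit": r"\borbit(s|ing)?\b",
    "crane": r"\bcrane\b",
    "aerial": r"\b(aerial|drone)\b",
}


def find_camera_moves(camera_line: str) -> list:
    """Return the names of the camera moves mentioned in a camera line."""
    text = camera_line.lower()
    return [name for name, pattern in CAMERA_MOVES.items() if re.search(pattern, text)]


def has_compound_move(camera_line: str) -> bool:
    """True when more than one distinct move appears in the same shot."""
    return len(find_camera_moves(camera_line)) > 1


print(find_camera_moves("Slow dolly-in over 4 seconds"))
print(has_compound_move("Dolly in, then pan right and tilt up"))
```

A keyword check like this only catches obvious cases; it will not notice moves described in plain language such as "the camera rises".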
Pick one strong style anchor rather than stacking adjectives. The model handles a single clear direction much better than five competing aesthetics.
Add these at the end of your prompt to maximize output quality:
Unlike Stable Diffusion or Midjourney, Seedance 2.0 does not support negative prompts. Instead, you use a "constraints" section at the end of your prompt — a short ban list of things you don't want.
| Category | Constraint Examples |
|---|---|
| Visual artifacts | "No text overlays, no watermarks, no lens flares" |
| Identity issues | "No extra characters, no face morphing, no crowds" |
| Camera problems | "No snap zooms, no whip pans, no Dutch angles" |
| Body artifacts | "No extra fingers, no deformed hands" |
| Branding | "No logos, no labels, no recognizable brands" |
| Environmental | "No rain, no fog, no dust particles" |
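If you are used to comma-separated negative prompts from image models, one way to adapt them is to rewrite each term as a "no ..." ban and append the result to the end of the prompt. The helper below is a hedged sketch: `negatives_to_constraints` is a hypothetical name, and the output wording simply follows the constraint examples in the table above.

```python
def negatives_to_constraints(negative_prompt: str) -> str:
    """Turn 'a, b, c' into the sentence 'No a, no b, no c.'"""
    terms = [t.strip() for t in negative_prompt.split(",") if t.strip()]
    if not terms:
        return ""
    sentence = ", ".join(f"no {term}" for term in terms)
    # Capitalize only the first letter so the terms keep their original casing.
    return sentence[0].upper() + sentence[1:] + "."


print(negatives_to_constraints("text overlays, watermarks, lens flares"))
```

Keep the resulting list short; a long ban list works against the "short ban list" advice above.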
"A dynamic, cinematic, powerful video of a man"
"A 40-year-old man in a navy suit walks toward the camera down an empty hallway. Medium shot, slow dolly-in, overcast natural light."
"The car moves energetically around the track"
"Tires smoke as the car drifts 90 degrees, gravel sprays from the rear wheels"
"She walks in, sits down, picks up the phone, talks, then hangs up and stands"
"She sits at the desk and slowly picks up the ringing phone. Close-up on her hand, then medium shot of her face as she answers."
"Energetic camera movement"
"Handheld tracking shot with slight sway, eye level"
The reference system is Seedance 2.0's defining feature. When you upload reference files (images, videos, audio), the platform assigns labels such as @Image1, @Video1, and @Audio1. Reference them directly in your prompt to tell the model exactly what role each file plays.
| Reference Type | Use It For | Example |
|---|---|---|
| @Image1 | Character appearance, first/last frame, style anchor | "@Image1 as the main character's face and outfit" |
| @Image2 | Setting, environment, second character | "Use @Image2 as the background environment" |
| @Video1 | Camera movement, motion pattern, choreography | "Replicate @Video1's camera movements exactly" |
| @Audio1 | Music timing, beat alignment, rhythm | "Cut transitions on @Audio1's beat drops" |
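Before submitting, it can help to confirm that every @ label in the prompt matches an uploaded file and that every upload has been given a role. This is a minimal sketch under stated assumptions: the label format follows the @Image1, @Video1, @Audio1 pattern described above, and `check_references` plus the `uploaded` set are hypothetical and do not query the platform.

```python
import re

# Matches labels of the form @Image1, @Video2, @Audio1.
TAG_PATTERN = re.compile(r"@(Image|Video|Audio)(\d+)")


def check_references(prompt: str, uploaded: set) -> dict:
    """Compare labels used in the prompt with the labels of uploaded files."""
    used = {f"@{kind}{number}" for kind, number in TAG_PATTERN.findall(prompt)}
    return {
        # Labels mentioned in the prompt that have no matching upload.
        "missing_upload": sorted(used - uploaded),
        # Uploaded files that the prompt never assigns a role to.
        "unused_upload": sorted(uploaded - used),
    }


prompt = (
    "@Image1 as the main character's face and outfit. "
    "Replicate @Video1's camera movements exactly."
)
print(check_references(prompt, {"@Image1", "@Audio1"}))
```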
Seedance 2.0 generates audio simultaneously with the video — it's not post-processing. You can trigger specific sound types by using descriptive audio keywords in your prompt.
| Keyword | Effect | Good For |
|---|---|---|
| reverb | Spatial acoustics, large halls | Cathedral scenes, concert halls |
| muffled | Underwater or enclosed spaces | Underwater shots, phone conversations |
| echoing | Large reverberant spaces | Caves, empty warehouses |
| crunchy | Rough texture sounds | Gravel paths, snow walking |
| metallic clink | Sharp precise metal sounds | Sword fights, keys, tools |
How to trigger sound: Simply describe the sound naturally in your prompt. For example: "Sound of loud, aggressive sizzling" or "Sound of high-RPM engine and splashing water". The model's audio branch picks up on these cues and generates matching audio.
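For scripted prompt building, the "Sound of ..." phrasing above can be generated from a list of sound descriptions. The snippet is a small sketch: `sound_cue` is a hypothetical helper, and placing an audio keyword directly before each sound is an assumption about phrasing, not documented model behavior.

```python
def sound_cue(sounds, keyword=None):
    """Build a 'Sound of ...' sentence from one or more sound descriptions.

    `keyword` is an optional audio descriptor (for example "echoing")
    that is placed in front of every sound.
    """
    described = [f"{keyword} {s}" if keyword else s for s in sounds]
    if not described:
        return ""
    if len(described) == 1:
        body = described[0]
    else:
        body = ", ".join(described[:-1]) + " and " + described[-1]
    return f"Sound of {body}."


print(sound_cue(["high-RPM engine", "splashing water"]))
print(sound_cue(["footsteps"], keyword="echoing"))
```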
Face or identity keeps changing between frames
Fix: Add "keep same face and clothing throughout" explicitly. Use a clean, frontal, high-res reference image. Shorten the clip to 4–6 seconds. Avoid busy backgrounds that confuse the model.
Jittery or rubbery motion
Fix: Simplify to ONE movement per shot. Specify "locked-off" or "slow smooth dolly-in". Use a 2–4 second reference clip with a single clear motion.
Distorted or impossible hands
Fix: Keep hands larger in frame (close-up rather than wide). Avoid fast finger actions. Reduce motion speed. Hands remain a common weak spot for current video models, so expect to iterate.
Camera ignoring my instructions
Fix: Put camera direction on its own line/sentence. Use standard film language ("slow dolly-in" not "the camera energetically moves forward"). Make sure "unfixed camera" is selected in platform settings if using camera movement.
Text or logos getting distorted
Fix: Make text/logos bigger and centered in frame. Add "text remains sharp and readable" to constraints. Reduce camera motion near text elements.
Color drift in extended videos
Fix: Repeat your style and color constraints in EVERY extension prompt, even if it feels redundant. Quality holds well through 2–3 extensions; the 4th typically shows noticeable drift.
Style or color drifts from what I described
Fix: Replace your style line with a single strong anchor keyword (e.g., just "cinematic realism" instead of "a beautiful cinematic warm dreamy ethereal look"). Remove competing adjectives.
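The color-drift fix above, repeating style and constraints in every extension prompt, is easy to automate. The sketch below assumes you write one new action per extension and reuse the same style and constraint text each time; `extension_prompts` is a hypothetical helper, and the style text in the example is made up for illustration.

```python
def extension_prompts(actions, style, constraints):
    """Return one prompt per extension, each repeating the same style and constraints."""
    suffix = f"{style.rstrip('.')}. {constraints.rstrip('.')}."
    return [f"{action.rstrip('.')}. {suffix}" for action in actions]


prompts = extension_prompts(
    actions=["She turns toward the window", "She opens the window"],
    style="Cinematic realism, warm tungsten light",
    constraints="No text overlays, no extra characters",
)
for p in prompts:
    print(p)
```

Given the note above that quality usually holds for 2–3 extensions, it is reasonable to keep the `actions` list to three entries or fewer.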
| Problem | What to Change |
|---|---|
| Framing wrong, action is right | Retighten only the Camera line |
| Motion feels off | Swap handheld ↔ gimbal; set explicit speed |
| Style or color drifts | Replace Style line with single strong anchor; remove extra adjectives |
| Subject mutates mid-clip | Simplify Subject to one noun + one descriptor |
| Same artifact appears 3+ tries | Change the shot plan or constraints entirely rather than stacking bans |
Replace the bracketed sections with your specifics. These are ready to paste into Seedance 2.0.
Upload any AI video and our tool will reverse-engineer it into a clean, copy-paste-ready Seedance 2.0 prompt — complete with scene breakdown, camera direction, and style notes.