Using Video to Prompt for Midjourney: Tips & Tricks

Midjourney is the AI image generator of choice for millions of creative professionals, photographers, and artists. Its combination of beautiful default aesthetics and a powerful parameter system makes it well suited to video-extracted prompts. When you analyze a video with VideoToPrompt.org and then paste that prompt into Midjourney with the right parameters, the results can closely match the look of the source footage. This guide covers using video-extracted prompts specifically with Midjourney v6.

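Midjourney Quick Facts: Commands in this guide are written as Discord slash commands; Midjourney's web app accepts the same prompt text and parameters. You need an active Midjourney subscription to generate images. The default model in early 2025 was Midjourney v6.1, which understands prompts significantly better than v5. Defaults change over time, so check /settings for the model your account currently uses.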

Midjourney v6 Prompt Structure

Midjourney v6 represents a major leap in natural language understanding. Unlike earlier versions that preferred comma-separated keywords, v6 interprets conversational, descriptive sentences with nuanced accuracy. This is excellent news for video-extracted prompts, which tend to be written in rich descriptive language.

Anatomy of a v6 Prompt

A well-structured Midjourney v6 prompt has four parts: the content description, the style qualifiers and mood, the technical aesthetic details, and the parameters.

Format: [Content description]. [Style qualifiers and mood]. [Technical aesthetic details] --[parameters]

Example: A woman stands at the edge of a cliff overlooking a misty valley at dawn. Cinematic lighting with soft pink and gold tones, shot from below looking upward. Epic scale, atmospheric haze, photorealistic --ar 16:9 --v 6 --style raw --q 2
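
The format above can be assembled programmatically when you process many extracted prompts. The following is a minimal sketch, not an official tool: build_prompt is a hypothetical helper that only produces text to paste into /imagine, and it assumes every parameter is written as "--name value".

```python
def build_prompt(content, style, technical, params):
    """Join the three text parts with '. ' and append '--name value' flags.

    Hypothetical helper: it only builds a string, it does not talk to Midjourney.
    """
    parts = [p.strip().rstrip(".") for p in (content, style, technical) if p and p.strip()]
    text = ". ".join(parts)
    flags = " ".join(f"--{name} {value}" for name, value in params.items())
    return f"{text} {flags}".strip()


prompt = build_prompt(
    "A woman stands at the edge of a cliff overlooking a misty valley at dawn",
    "Cinematic lighting with soft pink and gold tones, shot from below looking upward",
    "Epic scale, atmospheric haze, photorealistic",
    {"ar": "16:9", "v": 6, "style": "raw"},
)
print(prompt)
```

Keeping the text parts separate from the parameters makes it easy to reuse one extracted description with different flag sets.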

v6 vs v5 Prompt Differences

If you've been using Midjourney for a while, understand these key changes in v6:

v6 understands complex sentences and relationships between elements
Parenthetical emphasis like (keyword) is no longer needed or recommended
You must explicitly request stylistic qualities rather than relying on Midjourney's defaults
Negation words like "without" and "no" are still unreliable in v6 and can cause the unwanted element to appear; use the --no parameter to exclude elements instead
Artist and filmmaker references are interpreted more accurately

Essential Parameter Flags

Parameters are appended to the end of your prompt and control specific technical aspects of generation. Understanding which parameters to apply to your video-extracted prompts is the difference between good and great results.

--ar (Aspect Ratio)

The aspect ratio parameter should always match the format of your source video. This is arguably the most important parameter for video-extracted prompts.

--ar 16:9 — Standard widescreen video (YouTube, film)
--ar 9:16 — Vertical video (TikTok, Instagram Reels, Shorts)
--ar 4:3 — Classic TV/film format
--ar 21:9 — Cinematic ultrawide (anamorphic film)
--ar 1:1 — Square format (Instagram posts)
--ar 3:2 — Standard photography ratio
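
If you know the pixel dimensions of the source video, the matching flag can be derived rather than guessed. This is a rough sketch under stated assumptions: aspect_flag and the 2% tolerance are illustrative choices, and the list of common ratios is simply the one given above.

```python
from math import gcd

# The ratios listed in this guide, checked in order.
COMMON_RATIOS = ["16:9", "9:16", "4:3", "21:9", "1:1", "3:2"]


def aspect_flag(width, height, tolerance=0.02):
    """Return an --ar flag for a video frame of the given pixel size.

    Snaps to a common ratio when within `tolerance` (relative error),
    otherwise falls back to the exact reduced fraction.
    """
    actual = width / height
    for ratio in COMMON_RATIOS:
        w, h = map(int, ratio.split(":"))
        if abs(actual - w / h) / (w / h) <= tolerance:
            return f"--ar {ratio}"
    d = gcd(width, height)
    return f"--ar {width // d}:{height // d}"


print(aspect_flag(1920, 1080))  # --ar 16:9
print(aspect_flag(1080, 1920))  # --ar 9:16
print(aspect_flag(2560, 1080))  # --ar 21:9 (snapped from 64:27)
```

The snapping step exists because ultrawide footage such as 2560x1080 is marketed as 21:9 even though its exact ratio is 64:27.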

--v and --quality

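--v 6: Use Midjourney v6 (recommended for all video-extracted prompts); --v 6.1 selects the 6.1 release explicitly
--q 0.5: Half quality, faster generation for testing prompts
--q 1: Standard quality (default)
--q 2: More GPU time per image, slower, and can add texture and detail; supported on v6.1, and versions that do not support it cap the value at 1. Use for final renders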

--style and --chaos

These parameters control the aesthetic interpretation of your prompt:

--style raw: Less opinionated, closer to the literal prompt; best for photorealistic video sources
--style cute and --style expressive: Softer or more artistic looks, but available only with the older Niji 5 model (--niji 5); with --v 6, raw is the only --style value
--chaos 0: Consistent, predictable results (default)
--chaos 50: Moderate variety across the 4-image grid
--chaos 100: Maximum variation, useful for exploring visual possibilities from a single video prompt

--weird and --stylize

--stylize 100 (default) — Balanced between prompt accuracy and Midjourney's aesthetic
--stylize 50 — Closer to the prompt, less stylistic interpretation
--stylize 750 — High stylization, more beautiful but may drift from prompt
--weird 250 — Adds unusual visual elements for creative experimentation
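
Out-of-range values are an easy mistake when tuning these flags by hand. The sketch below checks values before they reach a prompt. The function name aesthetic_flags is hypothetical, and the ranges (0 to 100 for --chaos, 0 to 1000 for --stylize, 0 to 3000 for --weird) reflect my understanding of Midjourney's documentation, so verify them against the current docs.

```python
# Assumed valid ranges; confirm against Midjourney's parameter documentation.
RANGES = {"chaos": (0, 100), "stylize": (0, 1000), "weird": (0, 3000)}


def aesthetic_flags(**values):
    """Validate chaos/stylize/weird values and return them as a flag string."""
    flags = []
    for name, value in values.items():
        if name not in RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"--{name} must be between {low} and {high}, got {value}")
        flags.append(f"--{name} {value}")
    return " ".join(flags)


# Exploration pass: wide variety, closer to the literal prompt.
print(aesthetic_flags(chaos=50, stylize=50))
```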

The /describe Command vs Video Extraction

Midjourney has its own /describe command that generates prompts from images. Understanding how it compares to VideoToPrompt.org's video extraction helps you choose the right tool.

Feature	Midjourney /describe	VideoToPrompt.org
Input type	Single image	Video (multiple frames)
Temporal analysis	No	Yes — captures motion and continuity
Output count	4 prompt variants	Single comprehensive prompt
Style specificity	Midjourney-optimized	Platform-agnostic, adaptable
Technical settings	No	Yes — includes lighting, camera, mood analysis
Best use	Quick image replication	Video style extraction and reuse

Pro Tip: Use VideoToPrompt.org to extract the core visual description from your video, then use Midjourney's /describe on a specific frame to get Midjourney-flavored vocabulary for the same scene. Combining both gives you the best of both approaches.

Multi-Prompt Syntax

Midjourney's multi-prompt syntax lets you specify the relative importance of different elements in your video-extracted prompt. This is especially powerful when your video has a clear subject and background with distinct styles.

Use the double colon :: separator with optional numeric weights:

Multi-prompt example: ancient temple ruins::2 covered in dense jungle vegetation::1 golden sunlight filtering through canopy::1.5 --ar 16:9 --v 6

This makes the temple ruins (::2) the dominant element while balancing vegetation and lighting.
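
Because the weights are relative, it can help to see each segment's share of the total. The following sketch uses two hypothetical helpers, multi_prompt and relative_weights. It writes the weight directly after the double colon with no space, and it assumes Midjourney normalizes weights so that only their ratios matter.

```python
def multi_prompt(segments):
    """Format (text, weight) pairs as 'text::weight' segments."""
    return " ".join(f"{text}::{weight:g}" for text, weight in segments)


def relative_weights(segments):
    """Return each segment's share of the total weight, rounded to 3 places."""
    total = sum(weight for _, weight in segments)
    return {text: round(weight / total, 3) for text, weight in segments}


segments = [
    ("ancient temple ruins", 2),
    ("covered in dense jungle vegetation", 1),
    ("golden sunlight filtering through canopy", 1.5),
]
print(multi_prompt(segments) + " --ar 16:9 --v 6")
print(relative_weights(segments))
```

With these weights the ruins receive roughly 44% of the emphasis, the lighting 33%, and the vegetation 22%.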

Style Reference and Character Reference

Two of Midjourney's most powerful features for video-extracted prompt workflows are --sref (style reference) and --cref (character reference), introduced in 2024.

--sref (Style Reference)

Upload a video frame as a style reference image and Midjourney will capture the visual aesthetic of that frame while applying your text prompt's content:

Usage: /imagine [your extracted prompt text] --sref [image URL] [second image URL] --sw 100

--sw (style weight) controls how strongly the reference style is applied, from 0-1000. Default is 100.

This workflow is exceptionally powerful: extract a frame from your video, upload it, and use it as a --sref while applying your extracted text prompt. The result combines the semantic description with the exact visual style of your source material.
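
The workflow can be sketched as a small string builder. This is an illustration only: with_style_reference is a hypothetical helper, the example.com URL is a placeholder for wherever you host the extracted frame, and the sketch assumes multiple reference URLs are listed after a single --sref flag.

```python
def with_style_reference(prompt, image_urls, style_weight=100):
    """Append --sref and --sw to a prompt, checking the 0-1000 weight range."""
    if not image_urls:
        raise ValueError("at least one reference image URL is required")
    if not 0 <= style_weight <= 1000:
        raise ValueError("--sw must be between 0 and 1000")
    urls = " ".join(image_urls)
    return f"{prompt} --sref {urls} --sw {style_weight}"


print(with_style_reference(
    "A lighthouse on a rocky coast at dusk --ar 16:9",
    ["https://example.com/frame_001.png"],  # placeholder URL
    style_weight=200,
))
```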

--cref (Character Reference)

For video footage that features a specific person or character, --cref lets you maintain that character's appearance across multiple generations:

Usage: /imagine [character in new scene description] --cref [character image URL] --cw 100

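--cw (character weight) ranges from 0 to 100 and controls how strictly the character's appearance is maintained. The default of 100 carries over face, hair, and clothing; lower values focus on the face, which helps when you want the same person in a different outfit.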
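
To keep one character consistent across several generations, the same reference and weight can be applied to a batch of scene descriptions. A minimal sketch, assuming a hypothetical helper named character_scenes and a placeholder image URL:

```python
def character_scenes(scenes, character_url, character_weight=100):
    """Return one prompt per scene, each carrying the same --cref and --cw."""
    if not 0 <= character_weight <= 100:
        raise ValueError("--cw must be between 0 and 100")
    return [
        f"{scene} --cref {character_url} --cw {character_weight}"
        for scene in scenes
    ]


prompts = character_scenes(
    ["The fisherman mending nets on a pier", "The fisherman reading in a cafe"],
    "https://example.com/fisherman_frame.png",  # placeholder URL
    character_weight=80,
)
for p in prompts:
    print(p)
```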

Niji Mode for Anime-Style Video Sources

If your source video has an anime, illustrated, or manga aesthetic, use Niji mode instead of the standard Midjourney model. Add --niji 6 to your prompt.

--niji 6: Excellent anime aesthetics with strong prompt understanding
--style cute, --style scenic, and --style expressive: Niji 5 styles (kawaii soft characters, Studio Ghibli-inspired landscapes, and dynamic dramatic scenes, respectively); use them with --niji 5 rather than --niji 6

Example Prompts by Video Genre

Here are complete, ready-to-use Midjourney prompts extracted from different video genres:

Cinematic Film Style

A solitary figure walks through a rain-soaked alley at night, neon signs reflected in puddles, steam rising from grates, long shadows stretching across wet pavement. Film noir aesthetic, cinematic color grading with deep blues and crimson accents, anamorphic lens flare, documentary realism. Roger Deakins cinematography style --ar 21:9 --v 6 --style raw --q 2

Nature Documentary Style

Aerial view of an ancient forest at golden hour, mist rolling through valleys between towering redwood trees, shafts of warm light cutting through the canopy, birds in silhouette against a gradient sky. BBC Earth nature documentary photography, ultra-high definition, breathtaking scale --ar 16:9 --v 6 --q 2 --stylize 200

Urban Street Style

Busy Tokyo intersection at dusk, hundreds of pedestrians crossing under neon advertisements, motion blur on the crowd, a lone street photographer in focus mid-frame. Vivid urban colors, hyperreal photography, shallow depth of field, shot with a 50mm lens at f/1.8 --ar 4:3 --v 6 --style raw

Portrait Video Style

Close-up portrait of an elderly fisherman, deeply weathered face, laugh lines, piercing blue eyes that reflect the sea, wearing a worn yellow slicker. Golden hour backlight, subtle bokeh, intimate documentary photography, shot on medium format film --ar 4:5 --v 6 --style raw --q 2

Advanced Tips and Common Mistakes

Pro Tips for Video-Extracted Prompts in Midjourney

Start with --style raw: For video-extracted prompts, raw mode hews closer to your description without Midjourney's default beautification
Use Vary Region: After generating, use the Vary (Region) feature to fix specific areas while keeping the rest — invaluable for correcting faces or backgrounds
Remix mode: Enable Remix mode to adjust the prompt or its parameters when making variations of a generated image
Use --no for unwanted elements: If the video prompt describes something you don't want reproduced, add --no [element]
Seed locking: Use --seed [number] to reproduce consistent results across variations of the same extracted prompt
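
The --no and --seed tips combine naturally when you iterate on one extracted prompt. The sketch below uses a hypothetical helper, refine, and assumes --no accepts a comma-separated list of elements.

```python
def refine(prompt, exclude=(), seed=None):
    """Append an optional --no list and an optional --seed to a prompt."""
    parts = [prompt]
    if exclude:
        parts.append("--no " + ", ".join(exclude))
    if seed is not None:
        parts.append(f"--seed {seed}")
    return " ".join(parts)


print(refine(
    "A quiet harbor at dawn --ar 16:9 --style raw",
    exclude=["text", "watermark"],
    seed=1234,
))
```

Reusing the same seed while changing one element at a time makes it easier to see which change caused a difference in the output.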

Common Mistakes to Avoid

Pasting excessively long prompts without testing shorter versions first
Forgetting to set --ar to match your video's aspect ratio
Using v5-style keyword stuffing in v6, which can confuse the model
Ignoring the --style raw option for photorealistic video sources
Not using --sref when you have a reference frame available
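
Several of these mistakes can be caught with a quick check before you submit a prompt. This is a rough sketch rather than a complete linter: check_prompt is hypothetical, and the 60-word threshold is an arbitrary illustration of "excessively long", not a Midjourney limit.

```python
def check_prompt(prompt, has_reference_frame=False, max_words=60):
    """Return a list of warnings for the common mistakes listed above."""
    warnings = []
    # Everything before the first ' --' is treated as the description text.
    description = prompt.split(" --", 1)[0]
    if len(description.split()) > max_words:
        warnings.append(f"description is over {max_words} words; test a shorter version first")
    if "--ar " not in prompt:
        warnings.append("no --ar flag; match the source video's aspect ratio")
    if "--style raw" not in prompt:
        warnings.append("no --style raw; consider it for photorealistic sources")
    if has_reference_frame and "--sref " not in prompt:
        warnings.append("reference frame available but no --sref flag")
    return warnings


for warning in check_prompt("Busy Tokyo intersection at dusk --v 6", has_reference_frame=True):
    print(warning)
```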

Midjourney's strength is its combination of artistic sensibility with powerful parameter control. Video-extracted prompts give you the rich descriptive foundation, and these parameters give you the precision control. Together, they enable replication and re-imagination of real-world video aesthetics at a professional level.