Modify Video – Generative Video Editing Model
What is Modify Video?
Modify Video is Luma AI’s state-of-the-art generative video editing model, designed for high-fidelity transformations without reshooting. By leveraging advanced pose, facial, and lip-sync tracking, it maintains full-body motion consistency while enabling scene swaps, environment retexturing, and creative restyling. Creators and developers can generate multiple variants from a single clip, each preserving natural temporal flow and actor performance signals.
Key Features
- Full-Body & Facial Tracking: Retains original actor movements, expressions and lip-sync for seamless edits.
- Temporal Consistency: Keeps frame-to-frame continuity, minimizing flicker and jitter in output.
- Multiple Modes: Choose how far the output departs from the source (a request sketch follows this list):
• adhere_1/adhere_2/adhere_3 – faithful, subtle edits
• flex_1/flex_2/flex_3 – balanced transformations
• reimagine_1/reimagine_2/reimagine_3 – creative overhauls
- Prompt-Driven Styles: Use natural-language instructions (e.g., “make it vintage”) to guide restyling.
- First-Frame Guidance: Supply a custom first frame image URL to anchor the visual style.
- Native Resolution Support: Outputs at the same resolution as the source clip (MP4).
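To make the parameters above concrete, here is a minimal sketch of submitting an edit job. The endpoint URL, authorization header, video_url field, and response shape are assumptions for illustration only; the mode values, prompt, and first_frame_url follow the options described in this document.

```python
import requests

# Hypothetical endpoint and auth header -- substitute the real API details.
API_URL = "https://api.example.com/v1/modify-video"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

payload = {
    "video_url": "https://example.com/clips/source.mp4",  # MP4, under 30 seconds
    "mode": "flex_1",  # adhere_* / flex_* / reimagine_*, per the list above
    "prompt": "cinematic teal-orange grade",
    # Optional: anchor the overall look with a stylized first frame.
    "first_frame_url": "https://example.com/frames/styled.png",
}

response = requests.post(API_URL, json=payload, headers=headers)
response.raise_for_status()
print(response.json())  # e.g., a job ID or the output video URL
```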
Best Use Cases
- Scene Replacement: Swap backgrounds or entire sets in short clips under 30 seconds.
- Creative Restyling: Transform modern footage into vintage, cinematic or genre-specific looks.
- Lip-Sync Corrections: Fix dialogue sync issues while preserving actor performance.
- Color Grading & Lighting Adjustments: Apply consistent color palettes and dynamic lighting effects.
- Director-Grade Edits: Iterate on multiple stylistic variants quickly for pre-visualization and dailies.
Prompt Tips and Output Quality
- Start Simple: Begin with clear prompts like “add neon lights” or “cinematic teal-orange grade.”
- Adjust Mode: For minimal adjustments, choose adhere_1; for bold, artistic changes, use reimagine_1.
- Leverage First Frame: Supply a stylized image via first_frame_url to guide the overall look.
- Clip Length: Keep source videos under 30 seconds for optimal processing speed.
- Iterate Variants: Run the same prompt with different modes to compare subtle vs. dramatic effects.
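One way to iterate variants, reusing the hypothetical endpoint from the earlier sketch: submit the same clip and prompt once per mode and collect the results for side-by-side comparison.

```python
import requests

API_URL = "https://api.example.com/v1/modify-video"  # hypothetical, as above
headers = {"Authorization": "Bearer YOUR_API_KEY"}

modes = ["adhere_1", "flex_1", "reimagine_1"]  # subtle -> dramatic
results = {}

for mode in modes:
    payload = {
        "video_url": "https://example.com/clips/source.mp4",
        "mode": mode,
        "prompt": "make it vintage",
    }
    response = requests.post(API_URL, json=payload, headers=headers)
    response.raise_for_status()
    results[mode] = response.json()

# Inspect the three outputs to pick the right edit intensity.
```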
FAQs
Q: What video formats are supported?
MP4 clips (under 30 seconds) are recommended for reliable ingestion and native resolution outputs.
Q: How do I choose between adhere, flex, and reimagine modes?
Use adhere modes for faithful edits, flex for moderate stylizations, and reimagine for full creative overhauls.
Q: Can I preserve lip-sync accuracy?
Yes—Modify Video’s facial tracking maintains lip-sync motion even when retexturing environments.
Q: Is a custom first frame required?
No. It’s optional but highly effective for guiding strong stylistic changes.
Q: How do I guide specific edits?
Set the prompt parameter with natural-language instructions (e.g., “make it look like film noir”) to direct the AI.
Other Popular Models
sdxl-img2img
SDXL Img2Img is used for text-guided image-to-image translation. This model uses the weights from Stable Diffusion to generate new images from an input image via the StableDiffusionImg2ImgPipeline from diffusers.

faceswap-v2
Take a picture or GIF and replace the face in it with a face of your choice. You only need one image of the desired face: no dataset, no training.

sdxl1.0-txt2img
The SDXL model is the official upgrade to the v1.5 model and is released as open-source software.

sd1.5-majicmix
A highly versatile photorealistic model that blends various models to produce amazingly realistic images.
