Minimax Hailuo 2

Generate breathtaking 1080P cinematic videos from text or images with ultra-realistic motion and physics.

API

The model is available through an HTTP API that you can call from the programming language of your choice.

POST

import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/minimax-hailuo-2"

# Request payload
data = {
  "prompt": "Man wandered into a desert canyon with a lion at his side.",
  "prompt_optimizer": True,
  "image_url": "https://segmind-resources.s3.amazonaws.com/input/db787c04-b073-4e08-ba52-6cee341c7310-7f1198b9-2961-4dc1-ad67-e9c60013b44a.jpeg",
  "mode": "standard"
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # Raw bytes of the generated output

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image Generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
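The status codes above can be turned into a small lookup so that a caller reports a readable message instead of printing the raw body. This is a minimal sketch: the helper name `check_status` and the fallback message for unlisted codes are assumptions made for illustration, not part of the API.

```python
# Descriptions copied from the HTTP Response Codes list above.
STATUS_MESSAGES = {
    200: "OK",
    401: "Unauthorized: user authentication failed",
    404: "Not Found: the requested URL does not exist",
    405: "Method Not Allowed: the requested HTTP method is not allowed",
    406: "Not Acceptable: not enough credits",
    500: "Server Error: server had some issue with processing",
}


def check_status(status_code):
    """Return (ok, message) for a status code (hypothetical helper)."""
    message = STATUS_MESSAGES.get(
        status_code, f"Unexpected status code {status_code}"
    )
    return status_code == 200, message
```

With the request from the example above, `ok, message = check_status(response.status_code)` lets you write `response.content` to a file only when `ok` is true and log `message` otherwise.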

Attributes

prompt str *

Describe the video content, in as much or as little detail as the scene needs. For example, use nautical themes for oceanic scenes or fictional themes for a more creative feel.

prompt_optimizer boolean ( default: true )

Enhance prompt for video quality. Set to true for optimized storytelling or false to keep original tone.

image_url str ( default: https://segmind-resources.s3.amazonaws.com/input/db787c04-b073-4e08-ba52-6cee341c7310-7f1198b9-2961-4dc1-ad67-e9c60013b44a.jpeg )

Provide an image URL for relighting. Use a high-resolution image for more detail or a standard-resolution image for faster processing.

mode enum:str *

Select video generation mode. Choose 'standard' for efficiency or 'pro' for advanced features when available.

Allowed values: standard, pro
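The attributes above can be checked on the client before a request is sent. The sketch below is illustrative only: `build_payload` is a hypothetical helper, the allowed modes are taken from the mode description ('standard' and 'pro'), and defaulting `mode` to "standard" is a convenience choice here, since the API marks it as required. Leaving `image_url` out of the payload may cause the service to fall back to its documented default image; that behavior is not confirmed by this page.

```python
# Mode values named in the attribute description above.
ALLOWED_MODES = ("standard", "pro")


def build_payload(prompt, mode="standard", prompt_optimizer=True, image_url=None):
    """Build a request payload after basic validation (hypothetical helper)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required and must be a non-empty string")
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")

    payload = {
        "prompt": prompt,
        "prompt_optimizer": bool(prompt_optimizer),
        "mode": mode,
    }
    # Only include image_url when the caller supplies one.
    if image_url:
        payload["image_url"] = image_url
    return payload
```

The returned dictionary can be passed as `json=` to `requests.post`, in place of the hand-written `data` dictionary in the example above.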

To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
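As a rough sketch of the credit check described above, the helper below reads `x-remaining-credits` from a headers mapping. The function name `remaining_credits` is hypothetical, and treating the value as a number is an assumption; the page does not specify the header's format.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits value as a float, or None.

    Assumes the header holds a numeric string; returns None when the
    header is missing or cannot be parsed.
    """
    value = headers.get("x-remaining-credits")
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None
```

Calling `remaining_credits(response.headers)` after each request gives a value you can compare against a threshold of your choosing. The `headers` object returned by `requests` is case-insensitive, so the lowercase key matches regardless of how the server capitalizes it.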

Hailuo 02 – AI Text-to-Video & Image-to-Video Model

What is Hailuo 02?

Hailuo 02, developed by MiniMax, is a state-of-the-art generative AI video model designed for developers, creators, and product managers. Ranked #2 globally on the Artificial Analysis benchmark, it produces professional-grade, 1080P cinematic videos up to 10 seconds long at smooth 24–30 FPS. Leveraging advanced physics simulation, facial recognition, and body tracking, Hailuo 02 delivers ultra-realistic motion, fluid dynamics, and consistent character performance—ideal for marketing, filmmaking, education, and social media.

Key Features

1080P Cinematic Quality: Generate short videos in full HD with film-style clarity.
24–30 FPS Smooth Playback: Maintain natural motion even in fast-paced scenes.
Ultra-Realistic Physics: Simulate fluid dynamics, gravity, and complex object interactions (e.g., acrobatics, water splashes) with photoreal accuracy.
Character Consistency: Advanced facial recognition and body tracking ensure actor likeness and posture remain coherent across frames.
Dual Input Modes:
- Text-to-Video: Convert detailed prompts (e.g., “In an underwater city, bioluminescent seahorses swim by colorful coral skyscrapers…”) into vivid video narratives.
- Image-to-Video: Provide an image URL for relighting, scene extension, or dynamic camera panning.
Fast Inference: Optimized for ‘standard’ and ‘pro’ modes—balance speed and advanced features based on project needs.
Prompt Optimizer (advanced): Toggle enhancement to refine storytelling and scene coherence.

Best Use Cases

Film & Advertising: Create short cinematic teasers, product showcases, and dynamic trailers.
Social Media Content: Produce eye-catching 10-second clips for Instagram, TikTok, and YouTube Shorts.
E-Learning & Education: Simulate scientific experiments, historical reenactments, or language immersion scenes with accurate physics and consistent narration.
Concept Art & Pitch Decks: Visualize prototypes, architectural fly-throughs, or game cinematics before committing to full-scale production.

Prompt Tips and Output Quality

Define Scene Elements Clearly
- Use adjectives like “high-contrast,” “soft lighting,” or “underwater bioluminescence” to guide mood.
Set Frame Rate Expectations
- Default outputs at 24 FPS; specify 30 FPS in your prompt for smoother motion.
Leverage Image URLs
- Supply a high-resolution image to refine lighting and textures in image-to-video tasks.
Toggle Prompt Optimizer
- Enable for richer detail and narrative cohesion; disable to preserve your original tone.
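The tips above amount to assembling a prompt from a subject, a few mood descriptors, and an optional frame-rate hint. A minimal sketch of that assembly follows; `compose_prompt` is a hypothetical helper, and the comma-separated layout is only one possible way to phrase a prompt, not a format the model requires.

```python
def compose_prompt(subject, descriptors=(), fps=None):
    """Join a subject, mood descriptors, and an optional FPS hint."""
    parts = [subject.strip()]
    # Skip empty descriptors so the prompt has no stray commas.
    parts.extend(d.strip() for d in descriptors if d.strip())
    if fps is not None:
        parts.append(f"{fps} FPS")
    return ", ".join(parts)


prompt = compose_prompt(
    "A diver explores a reef",
    ["soft lighting", "underwater bioluminescence"],
    fps=30,
)
```

The resulting string can be used as the `prompt` value in the request payload.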

FAQs

Q: What’s the maximum video length?
A: Up to 10 seconds per generation.

Q: Can I switch between standard and pro modes?
A: Yes. Use mode: "standard" for faster renders and mode: "pro" for advanced physics and detail.

Q: How do I ensure consistent character appearances?
A: Include descriptive facial and body attributes in your prompt; Hailuo 02’s body tracking will maintain consistency.

Q: Does Hailuo 02 support fluid simulations?
A: Absolutely. It excels at water, smoke, and particle effects with real-world physics.

Q: What resolution does Hailuo 02 output?
A: Full HD 1080P by default; specify resolution in your API call if custom sizing is required.
