API
The API can be called from the programming language of your choice; the sample below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/vidu-q1-reference-to-video"

# Request payload
data = {
    "prompt": "A middle-aged man sits on a wooden bench in a sunny park, reading an open hardcover book titled 'Vidu Q1 Reference'. The camera gently zooms in on the book’s cover and pages showing technical content, then back to the man turning pages thoughtfully. Soft sunlight filters through the trees, blending the three reference images naturally into one smooth scene.",
    "image_urls": [
        "https://segmind-resources.s3.amazonaws.com/input/62d98e27-8538-4953-95e6-e93d4a544fc1-39a80c5a-c4c4-405f-88f4-30fd1a45d6c9.jpeg",
        "https://segmind-resources.s3.amazonaws.com/input/ac3266d0-271e-4e31-96a9-d31f26d8c22f-a052d794-4300-478d-8d99-5a10daf5200c.png",
        "https://segmind-resources.s3.amazonaws.com/input/777c247c-8073-4d47-ac64-53b5fa088aae-d9afc6d7-d3ca-4baa-8920-cb41b56ff0b8.jpeg"
    ],
    "aspect_ratio": "16:9",
    "duration": 5,
    "seed": 42,
    "bgm": True,
    "movement_amplitude": "auto"
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response body is the generated video
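The sample above only prints the raw response body. A minimal sketch of saving the result to disk instead, assuming a successful call returns the video bytes directly (as the sample's final print suggests; check the response format documentation before relying on this):

```python
def save_video(response, out_path):
    """Write a successful response body to a file and return the path.

    Assumption: a 200 response carries raw video bytes; anything else
    is treated as an error and surfaced with its body text.
    """
    if response.status_code != 200:
        raise RuntimeError(
            f"Request failed ({response.status_code}): {response.text}")
    with open(out_path, 'wb') as f:
        f.write(response.content)
    return out_path

# Usage with the `response` object from the request above:
# save_video(response, "vidu_output.mp4")
```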
Attributes

prompt
Defines the visual concept for the video. Use creative prompts to guide the design style.

image_urls
URLs of source images for reference. Use a single URL to create a video from one image.

aspect_ratio
Sets the video aspect ratio. Choose 1:1 for Instagram or 16:9 for widescreen.
Allowed values include: 16:9, 1:1, 9:16

duration
Controls the video length in seconds. Use 5 for short clips or 10 for a more detailed presentation.
Allowed values: 5, 10

seed
Determines randomization for generation. Set a fixed value such as 42 for reproducible results, or change it for variation.
min: 0, max: 999999

bgm
Adds background music to the video. Enable it for dynamic content or disable it for a silent showcase.

movement_amplitude
Configures movement strength in the video. Use auto for the default behavior or specify a value for custom control.
Default: auto
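Pulling the attributes above together, a hypothetical helper that assembles and validates a request payload on the client side (the parameter names and ranges come from this page; the validation logic itself is illustrative, not part of the API):

```python
# Values documented on this page; the API remains the source of truth.
ALLOWED_ASPECT_RATIOS = {"16:9", "1:1", "9:16"}
ALLOWED_DURATIONS = {5, 10}

def build_payload(prompt, image_urls, aspect_ratio="16:9", duration=5,
                  seed=42, bgm=True, movement_amplitude="auto"):
    """Assemble a vidu-q1-reference-to-video payload with basic checks."""
    if not image_urls:
        raise ValueError("At least one reference image URL is required")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5 or 10")
    if not (0 <= seed <= 999999):
        raise ValueError("seed must be between 0 and 999999")
    return {
        "prompt": prompt,
        "image_urls": list(image_urls),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "seed": seed,
        "bgm": bgm,
        "movement_amplitude": movement_amplitude,
    }
```

The returned dict can be passed directly as the `json=` argument of `requests.post`, as in the sample above.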
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
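For example, a small helper to read that header from a requests response object (the header name is documented above; this sketch assumes its value is a plain numeric string):

```python
def remaining_credits(response_headers):
    """Return the remaining credit count from a Segmind response, or None.

    Works with any dict-like headers mapping, including
    requests' case-insensitive response.headers.
    """
    value = response_headers.get("x-remaining-credits")
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None

# Usage: remaining_credits(response.headers)
```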
Vidu AI – Reference-to-Video Model
What is Vidu AI Reference to Video?
Vidu AI Reference to Video is a powerful generative platform that transforms one or more reference images into high-quality, multi-shot animated videos. It ensures visual consistency of characters, objects, or environments throughout the sequence, guided by detailed text prompts for style, motion, and transitions. Ideal for creators, marketers, and storytellers who want precise control over animation from their reference imagery.
Key Features
- Upload single or multiple reference images for consistent multi-shot videos
- Maintain character, object, and environment consistency across shots
- Customize animation style, aspect ratio, and duration
- Advanced seed control (seed: 0–999999) for reproducible outputs
- Optional background music toggle (bgm: true/false)
- Adjustable movement amplitude (movement_amplitude: auto)
Best Use Cases
- Character Animations: Animate concept art or photos with consistent motion
- Marketing Clips: Generate product demos using reference images
- Storytelling: Create cinematic scenes with unified visual elements
- Social Media: Produce vertical reels or square videos from reference art
- E-Learning: Animate diagrams or instructional visuals into engaging videos
Prompt Tips and Output Quality
- Provide clear, descriptive prompts to guide style and animation
- Use high-resolution reference images for better texture and detail
- Choose appropriate aspect ratio for target platform (e.g., 16:9, 9:16)
- Fix the seed parameter to reproduce exact animation sequences
- Enable or disable background music as needed (bgm)
- Use movement_amplitude: auto for natural motion balance
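As a quick illustration of matching aspect ratio to target platform, a hypothetical lookup based on the pairings mentioned on this page (Instagram square posts, widescreen, vertical reels):

```python
# Illustrative platform-to-ratio mapping drawn from the tips above
PLATFORM_RATIOS = {
    "widescreen": "16:9",  # landscape video
    "instagram": "1:1",    # square feed posts
    "reels": "9:16",       # vertical short-form video
}

def ratio_for(platform):
    """Suggest an aspect_ratio value for a platform, defaulting to 16:9."""
    return PLATFORM_RATIOS.get(platform.lower(), "16:9")
```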
FAQs
Q: How many reference images can I use?
A: You can upload one or multiple reference images to maintain consistency across different shots.
Q: Can I reproduce the same animation reliably?
A: Yes, by setting a fixed seed value, you get consistent animations every time.
Q: Is background music added automatically?
A: No, background music is optional and can be toggled on or off via the bgm parameter.
Q: What aspect ratios are supported?
A: Common presets like 16:9, 1:1, and 9:16 are supported for various platforms.