API
The API can be called from the programming language of your choice; the sample below uses Python.
import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/vidu-q1-reference-to-video"

# Request payload
data = {
    "prompt": "A middle-aged man sits on a wooden bench in a sunny park, reading an open hardcover book titled 'Vidu Q1 Reference'. The camera gently zooms in on the book’s cover and pages showing technical content, then back to the man turning pages thoughtfully. Soft sunlight filters through the trees, blending the three reference images naturally into one smooth scene.",
    "image_urls": [
        "https://segmind-resources.s3.amazonaws.com/input/62d98e27-8538-4953-95e6-e93d4a544fc1-39a80c5a-c4c4-405f-88f4-30fd1a45d6c9.jpeg",
        "https://segmind-resources.s3.amazonaws.com/input/ac3266d0-271e-4e31-96a9-d31f26d8c22f-a052d794-4300-478d-8d99-5a10daf5200c.png",
        "https://segmind-resources.s3.amazonaws.com/input/777c247c-8073-4d47-ac64-53b5fa088aae-d9afc6d7-d3ca-4baa-8920-cb41b56ff0b8.jpeg"
    ],
    "aspect_ratio": "16:9",
    "duration": 5,
    "seed": 42,
    "bgm": True,
    "movement_amplitude": "auto"
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response body is the generated video
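The sample above only prints the raw response body. A minimal sketch of saving the result to disk instead, assuming a successful call returns the video bytes directly (as the sample's final print suggests; check the response format documentation before relying on this):

```python
def save_video(response, out_path):
    """Write a successful response body to a file and return the path.

    Assumption: a 200 response carries raw video bytes; anything else
    is treated as an error and surfaced with its body text.
    """
    if response.status_code != 200:
        raise RuntimeError(
            f"Request failed ({response.status_code}): {response.text}")
    with open(out_path, 'wb') as f:
        f.write(response.content)
    return out_path

# Usage with the `response` object from the request above:
# save_video(response, "vidu_output.mp4")
```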
Attributes

prompt
Defines the visual concept for the video. Use creative prompts to guide the design style.

image_urls
URLs of source images for reference. Use a single URL to create a video from one image.

aspect_ratio
Sets the video aspect ratio. Choose 1:1 for Instagram or 16:9 for widescreen.
Allowed values include: 16:9, 1:1, 9:16

duration
Controls the video length in seconds. Use 5 for short clips or 10 for a more detailed presentation.
Allowed values: 5, 10

seed
Determines randomization for generation. Set a fixed value such as 42 for reproducible results, or change it for variation.
min: 0, max: 999999

bgm
Adds background music to the video. Enable it for dynamic content or disable it for a silent showcase.

movement_amplitude
Configures movement strength in the video. Use auto for the default behavior or specify a value for custom control.
Default: auto
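Pulling the attributes above together, a hypothetical helper that assembles and validates a request payload on the client side (the parameter names and ranges come from this page; the validation logic itself is illustrative, not part of the API):

```python
# Values documented on this page; the API remains the source of truth.
ALLOWED_ASPECT_RATIOS = {"16:9", "1:1", "9:16"}
ALLOWED_DURATIONS = {5, 10}

def build_payload(prompt, image_urls, aspect_ratio="16:9", duration=5,
                  seed=42, bgm=True, movement_amplitude="auto"):
    """Assemble a vidu-q1-reference-to-video payload with basic checks."""
    if not image_urls:
        raise ValueError("At least one reference image URL is required")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 5 or 10")
    if not (0 <= seed <= 999999):
        raise ValueError("seed must be between 0 and 999999")
    return {
        "prompt": prompt,
        "image_urls": list(image_urls),
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "seed": seed,
        "bgm": bgm,
        "movement_amplitude": movement_amplitude,
    }
```

The returned dict can be passed directly as the `json=` argument of `requests.post`, as in the sample above.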
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
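For example, a small helper to read that header from a requests response object (the header name is documented above; this sketch assumes its value is a plain numeric string):

```python
def remaining_credits(response_headers):
    """Return the remaining credit count from a Segmind response, or None.

    Works with any dict-like headers mapping, including
    requests' case-insensitive response.headers.
    """
    value = response_headers.get("x-remaining-credits")
    if value is None:
        return None
    try:
        return float(value)
    except ValueError:
        return None

# Usage: remaining_credits(response.headers)
```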
Vidu AI – Reference-to-Video Model
What is Vidu AI Reference to Video?
Vidu AI Reference to Video is a powerful generative platform that transforms one or more reference images into high-quality, multi-shot animated videos. It ensures visual consistency of characters, objects, or environments throughout the sequence, guided by detailed text prompts for style, motion, and transitions. Ideal for creators, marketers, and storytellers who want precise control over animation from their reference imagery.
Key Features
- Upload single or multiple reference images for consistent multi-shot videos
- Maintain character, object, and environment consistency across shots
- Customize animation style, aspect ratio, and duration
- Advanced seed control (seed: 0–999999) for reproducible outputs
- Optional background music toggle (bgm: true/false)
- Adjustable movement amplitude (movement_amplitude: auto)
Best Use Cases
- Character Animations: Animate concept art or photos with consistent motion
- Marketing Clips: Generate product demos using reference images
- Storytelling: Create cinematic scenes with unified visual elements
- Social Media: Produce vertical reels or square videos from reference art
- E-Learning: Animate diagrams or instructional visuals into engaging videos
Prompt Tips and Output Quality
- Provide clear, descriptive prompts to guide style and animation
- Use high-resolution reference images for better texture and detail
- Choose appropriate aspect ratio for target platform (e.g., 16:9, 9:16)
- Fix the seed parameter to reproduce exact animation sequences
- Enable or disable background music as needed (bgm)
- Use movement_amplitude: auto for natural motion balance
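As a quick illustration of matching aspect ratio to target platform, a hypothetical lookup based on the pairings mentioned on this page (Instagram square posts, widescreen, vertical reels):

```python
# Illustrative platform-to-ratio mapping drawn from the tips above
PLATFORM_RATIOS = {
    "widescreen": "16:9",  # landscape video
    "instagram": "1:1",    # square feed posts
    "reels": "9:16",       # vertical short-form video
}

def ratio_for(platform):
    """Suggest an aspect_ratio value for a platform, defaulting to 16:9."""
    return PLATFORM_RATIOS.get(platform.lower(), "16:9")
```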
FAQs
Q: How many reference images can I use?
A: You can upload one or multiple reference images to maintain consistency across different shots.
Q: Can I reproduce the same animation reliably?
A: Yes, by setting a fixed seed value, you get consistent animations every time.
Q: Is background music added automatically?
A: No, background music is optional and can be toggled on or off via the bgm parameter.
Q: What aspect ratios are supported?
A: Common presets like 16:9, 1:1, and 9:16 are supported for various platforms.