66 models
Motion Control Models
AI video generation models with fine-grained control over camera movement, subject motion, and scene dynamics, giving you directorial control over AI-generated video. This collection includes Kling 2.6 Pro Motion Control, Kling 2.6 Standard Motion Control, and MotionCtrl-SVD: models that accept explicit motion control signals to direct how the camera moves, how subjects animate, and how the scene evolves over time.

Traditional AI video models generate motion stochastically, making it hard to get consistent camera behavior. Motion control models solve this by letting you specify trajectory paths, camera angles (pan, tilt, zoom, dolly), subject motion direction, and motion intensity, producing videos where the motion matches your creative intent precisely. Use cases include: product showcase videos with controlled 360° rotation, cinematic camera moves for storytelling, real estate walkthrough videos with directed camera paths, and any application where predictable, repeatable camera behavior is required. Kling's motion control models are particularly advanced, supporting per-frame motion brush inputs and trajectory-based animation.

On Segmind, all motion control models are available via API: specify your motion parameters and receive a controlled, professional video output.
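As a rough sketch of the API flow described above (the endpoint slug, header name, and request fields here are assumptions based on Segmind's general v1 REST pattern; consult the individual model's API page for the exact schema):

```python
import json
import urllib.request

# Hypothetical endpoint: slug and field names are assumptions,
# not the confirmed schema for this model.
API_URL = "https://api.segmind.com/v1/kling-2.6-pro-motion-control"

def build_payload(prompt, motion_video_url, character_image_url):
    """Assemble a motion-control request body (field names assumed)."""
    return {
        "prompt": prompt,
        "motion_video": motion_video_url,        # reference video to transfer motion from
        "character_image": character_image_url,  # subject image to animate
    }

def generate(api_key, payload):
    """POST the payload and return the raw response bytes."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

payload = build_payload(
    "a knight walking through fog",
    "https://example.com/reference-motion.mp4",
    "https://example.com/character.png",
)
print(sorted(payload))
```

The same request shape applies to the other motion control models; only the endpoint slug and the motion parameters (trajectory points, camera angle, motion intensity) change per model.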
Wan 2.7 Text to Video
1080p cinematic videos with audio sync and multi-shot control.
Pixverse V6
15-second AI videos with native audio and cinematic controls.
Kling O3 Image To Video
Images to cinematic videos with precise motion control.
Kling 3.0 Pro Image-to-Video
Animated 1080p videos from images with dynamic motion.
Kling 3.0 Standard Image-to-Video
Controlled cinematic 1080p videos from starting images.
Kling O1
Text-to-video creation with precise AI-driven motion control.
Kling 2.6 Pro Motion Control
Transfer motion from videos to animate custom characters.
Kling 2.6 Standard Motion Control
Precise motion transfer from reference videos to characters.
Qwen Image Edit Plus Multi Lora
Multi-image editing with superior identity and style control.
Pruna P Image Edit
Multi-image editing with AI-guided precision and control.
Bria Fibo
Photorealistic images from structured prompts with brand control.
LTX 2 Pro
High-quality video generation with advanced motion control.
Hailuo 2.3
Hyper-realistic videos from text with fluid character motion.
Video Frame Interpolation
FILM synthesizes smooth, high-quality intermediate frames for fluid motion in videos with significant movement.
Bytedance HuMo: Human-Centric Video Generation
HuMo generates high-quality, human-centric videos from text, images, and audio with unparalleled control and precision.
Video Tryon
Video Tryon is Segmind’s next-generation AI video model for instant virtual try-on, letting users visualize any outfit on any person in high quality, with fully preserved motion, for clips up to 50 seconds.
Higgsfield Text 2 Image Soul
SOUL AI transforms text into stunning, customizable visuals with unparalleled style control and precision.
Higgsfield Image 2 Video
Transform static images into dynamic, motion-rich videos with unparalleled control and creative depth.
Flux Krea Dev
FLUX.1 Krea generates stunning, photorealistic images with fine-tuned aesthetic control for diverse creative applications.
Minimax Hailuo 2
Generate breathtaking 1080p cinematic videos from text or images with ultra-realistic motion and physics.
Vidu Template
Transform static images into captivating videos using diverse motion templates effortlessly.
Kling 2.1 AI Video Generator
Kling 2.1 offers hyper-realistic video generation with improved motion, sharper 1080p visuals, and instant restyling capabilities. Its cost-effective pricing and faster rendering times make it a game-changer for creators seeking cinematic-quality AI videos.
Lyria 2
Lyria 2 by Google DeepMind is an advanced model that generates high-fidelity 48kHz stereo instrumental music from text prompts or lyrics, offering precise control over tempo, key, mood, and structure.
Segmind FaceSwap Comic v1
FaceSwap Comic v1 is an AI-powered face swapping model designed to blend real faces into illustrated or cartoon-style images while preserving the target’s artistic look. Ideal for personalized children’s storybooks and stylized content, it offers fine control over facial expression, realism, and stylistic adaptation.
Kling bloombloom
Kling AI transforms text and images into dynamic, high-quality video content with realistic motion and sound.
Kling 2
Kling 2.0 is an advanced AI video generator (5- and 10-second clips) that creates cinematic, dynamic videos from text or images with lifelike motion and precise prompt control at 720p resolution.
Dia (Text to Speech)
Dia by Nari Labs is an advanced open-weights TTS model that brings scripts to life with natural speech, emotions, and nonverbal cues. Easily control tone, voice, and delivery. Great alternative to ElevenLabs.
Pixverse Image to Video
Animate your photos effortlessly with Pixverse Image to Video AI! Upload a photo, then add motion prompts and styles.
Google Veo 2 Image To Video
Discover Google Veo 2, an AI-powered image-to-video model with 4K resolution, realistic motion, and cinematic effects for creators and developers.
Segmind Faceswap v4
Segmind FaceSwap v4 enables fast and precise face or head swapping between images with customizable options for style, output format, and image quality. Designed for creators and designers, it ensures natural-looking results with reproducibility through seed control for consistent outputs.
Hunyuan-3d 2mv
Hunyuan3D-2mv is fine-tuned from Hunyuan3D-2 to support multiview-controlled shape generation.
Luma Ray Flash 2 (720p)
Generate stunning 720p videos from text with the Luma ray-flash-2-720p model. Faster & cheaper than Ray 2, offering realistic motion & detail.
Google Veo 2
Create stunning, realistic videos with Veo 2, Google's state-of-the-art AI video generation model. Experience enhanced quality & cinematic control.
Minimax-image-01
Generate high-fidelity images from text with precise control & stunning quality with Minimax Image-01.
Wan_2.1 Text to Video
Create visually impressive videos featuring varied, lifelike motion with Wan 2.1 using text prompts.
Wan 2.1 480p image to video
Create high-quality 480p videos with excellent visual quality and a broad spectrum of motion from static images.
Wan 2.1 720p image to video
Create high-quality 720p videos with excellent visual quality and a broad spectrum of motion from static images.
Minimax AI Director
Minimax video-01-director: Create high-quality videos with precise camera movement control using text prompts.
AI Face Swap (image and video)
AI Face Swap: Effortlessly replace faces online. Fine-tune swaps with advanced controls for age, gender, and resolution.
Minimax (Hailuo) Video-01-live
Create stunning animations with Minimax (Hailuo) video-01-live, an AI image-to-video model perfect for Live2D, anime, and more. Transform static images into dynamic videos with smooth motion, facial control, and style support for diverse use cases like art, character animation, and e-commerce.
Omini Control
OminiControl is an innovative framework that optimizes Diffusion Transformer models for versatile image generation tasks.
Flux Canny Pro
Professional edge-guided image generation. Control structure and composition using Canny edge detection.
Flux Canny Dev
Open-weight edge-guided image generation. Control structure and composition using Canny edge detection.
Mochi 1
Mochi 1 is a cutting-edge, open-source AI model that transforms text prompts into stunning, high-fidelity videos. Create captivating videos from simple text prompts with unparalleled quality and realism. Experience high-fidelity motion, strong prompt adherence, and limitless creative possibilities.
Runway Gen-3 Alpha Turbo Image to Video
Runway Gen-3 Alpha Turbo is a cutting-edge AI tool that transforms static images into dynamic videos with exceptional fidelity and motion.
Openvoice
OpenVoice is a versatile voice cloning model that supports multiple languages and offers precise tone replication, flexible style control, and zero-shot cross-lingual capabilities.
Flux Controlnets
Flux ControlNets is a collection of models that gives you precise control over image generation. By integrating ControlNet with Flux.1, these models enable you to create highly detailed and customized images with unprecedented accuracy.
SD3 Medium Tile Controlnet
SD3 Medium Tile ControlNet is a large generative image model designed for generating detailed images based on textual prompts and tile-based input images.
SD3 Medium Canny Controlnet
Stable Diffusion 3 (SD3) Medium Canny ControlNet uses Canny edge detection to provide fine-grained control over the generated outputs.
SD3 Medium Pose Controlnet
Stable Diffusion 3 (SD3) Pose ControlNet is a large generative image model tailored for generating images based on text prompts while using pose information as guidance.