Collections [52]

Browse curated collections of AI models — organized by use case, model family, and provider. Each collection groups the best models for a specific task or category.

Use Cases & Model Families

Best Video Models

Explore the best text-to-video and image-to-video AI models on Segmind. Compare Kling, Veo, Wan, LTX, and more via a single pay-per-use API.

video

Best Image Models

Compare the best AI image generation models — FLUX, SD3, Ideogram, Recraft, and more — via a single pay-per-use API on Segmind.

text-to-image

LoRA Generation

Generate stylized images with LoRA models on Segmind. Flux Realism LoRA, Vector LoRA, Multi-LoRA, and more via a simple API.

lora

Speech Generation

Access the best TTS and voice synthesis models on Segmind — ElevenLabs, Gemini TTS, Chatterbox, and more via a single API.

text-to-audio

Lipsync Avatar Models

Create talking avatar videos with AI lipsync models on Segmind. Kling Avatar, HeyGen, Sync.so, and more via a simple API.

avatar lipsync

Image Editing Models

Transform and edit images with AI on Segmind. FLUX Kontext, Qwen Image Edit, BRIA, and 30+ models via a single API.

image-to-image inpainting

Ultra Models

Segmind's Ultra Selection: the highest-quality AI models across image, video, audio, and more. Access via pay-per-use API.

ultra

First and Last Frame Video

Generate AI videos with precise first and last frame control. Kling, Wan, LTX, and more image-to-video models on Segmind.

image-to-video

Remove Anything

Remove backgrounds, objects, watermarks, and anything else with AI on Segmind. BRIA, Qwen Eraser, and more via a simple API.

remove erase background

Swap Anything

Swap faces, outfits, and styles with AI on Segmind. FaceSwap, Virtual Try-On, Style Transfer, and more via a single API.

swap

Video Edit

Edit, restyle, and transform videos with AI on Segmind. Kling O3 Video Edit, LTX Retake, BRIA Video, and more via API.

video-to-video

3D Creation

Generate 3D objects and meshes from images with AI on Segmind. Hunyuan3D, SAM 3D, and more via a simple API.

image-to-3d

Upscale Image

Upscale images to 4x or 8x resolution with AI on Segmind. ClarityAI, Topaz, Nomos, and ESRGAN models via a simple API.

upscale

Generate Music

Generate original music with AI on Segmind. ACE-Step, MiniMax Music, MusicGen, and more via a simple pay-per-use API.

music

Enhance Videos

Enhance, upscale, and restore video quality with AI on Segmind. FlashVSR, frame interpolation, Topaz Video, and more via API.

enhance video upscale interpolation

Object Detection & Segmentation

Detect and segment objects with AI on Segmind. SAM, SAM3 Video, SAM V2, and more segmentation models via a simple API.

detection segmentation

Motion Control Models

Generate AI videos with precise camera and motion control on Segmind. Kling Motion Control, MotionCtrl SVD, and more.

motion control

Audio for Video

Generate audio, sound effects, and music for your videos with AI on Segmind. TTS, audio isolation, and sound generation via API.

text-to-audio

Content Detection Models

Detect NSFW content, watermarks, deepfakes, and more with AI on Segmind. Content detection models via a simple pay-per-use API.

detection

Faceswap Models

Swap faces in images and videos with AI on Segmind. HyperSwap, FaceSwap V5, multi-face swap, and more via a simple API.

faceswap

Qwen Image 2 Models

Access Qwen Image 2 generation and editing models on Segmind. Qwen Image Edit Plus, multi-LoRA, relighting, and more via a single API.

qwen image

Wan 2.6 Models

Generate high-quality AI video with Wan 2.6 on Segmind. Text-to-video, image-to-video, and first/last-frame control via API.

wan 2.6

Kling O3 Models

Generate and edit videos with Kling O3 on Segmind. Text-to-video, image-to-video, video edit, and reference modes via API.

kling o3

Wan 2.5 Models

Access Wan 2.5 text-to-video and image-to-video models on Segmind. High-quality, fast video generation via a single API.

wan 2.5

Seedream AI Models

Generate photorealistic images with multilingual text using ByteDance Seedream on Segmind. 4K output, image editing, and more via API.

seedream

Wan 2.2 Models

Access Wan 2.2 Fast and Flash video generation models on Segmind. Text-to-video and image-to-video via a simple API.

wan 2.2

Dreamina AI Models

Access CapCut Dreamina AI image generation and editing models on Segmind. High-quality visuals for content creators via API.

dreamina

Flux Image Tools

Access the complete FLUX model lineup on Segmind — Flux Dev, Pro, Ultra, Fill, Canny, Depth, Redux, and more via a single API.

flux

Flux Kontext Models

Edit images with natural language using FLUX Kontext on Segmind. Kontext Dev, Pro, Max, and Multi-Image variants via a single API.

kontext

Wan 2.1 Video Models

Access Wan 2.1 text-to-video and image-to-video models in 480p and 720p on Segmind. Open-source video generation via API.

wan 2.1

Hunyuan Models

Generate high-quality video and 3D assets with Tencent Hunyuan models on Segmind. HunyuanVideo, Hunyuan3D-2, and more via API.

hunyuan

Vidu Models

Generate reference-consistent AI video with Vidu on Segmind. Vidu Q1 Reference to Video and Vidu Template via a simple API.

vidu

Qwen AI Models

Access Alibaba Qwen image generation, editing, and vision-language models on Segmind. Qwen Image, Edit Plus, Qwen2-VL, and more via API.

qwen

PixVerse AI Models

Create dynamic AI videos with PixVerse on Segmind. PixVerse 5, effects, transitions, lipsync, and more via a pay-per-use API.

pixverse

Veo Models

Generate cinematic videos with Google Veo models on Segmind. Compare Veo 2, Veo 3, and more.

veo

Nano Banana

Try Nano Banana ultra-fast image generation models on Segmind. Optimized for speed and efficiency.

nano banana

Seedance Video Models

Create professional AI videos with ByteDance Seedance models on Segmind.

seedance

Providers

Alibaba Models

Alibaba delivers a comprehensive multimodal AI suite — Qwen for image generation and editing, Wan for state-of-the-art video synthesis, and vision-language models for intelligent visual understanding. The Qwen Image Edit Plus series offers surgical image editing with multi-LoRA, relighting, product photography, and more. Wan 2.6 produces high-resolution video with smooth motion and keyframe control. Access every Alibaba model as a pay-per-use API on Segmind, or chain them in Segmind Workflows for automated content pipelines.

alibaba

Black Forest Labs Models

Black Forest Labs created FLUX — one of the most capable and widely adopted text-to-image systems in the world. The FLUX lineup spans Flux Dev and Schnell for development, Flux Pro and Ultra for production, Flux 2 for the latest generation, plus specialized tools: Fill for inpainting, Canny and Depth for guided generation, Redux for style conditioning, and Kontext for instruction-following image editing. Access every FLUX model via Segmind APIs, or chain them in Segmind Workflows for automated image production.

black-forest-labs

Meta Models

Meta leads open-source AI with the Llama model family, SAM for universal image segmentation, and MusicGen for AI music creation. Llama models deliver frontier language capabilities with full open-weight access. SAM and SAM3 segment any object in images and video with remarkable accuracy. MusicGen composes original tracks from text descriptions. Access every Meta model via Segmind APIs with a single endpoint call.

meta

ByteDance Models

ByteDance offers a full creative AI suite: Seedream for photorealistic image generation with multilingual text rendering, Seedance for cinematic video synthesis, and SeedEdit for precise image editing without full regeneration. Together they cover the entire visual pipeline from concept to polished output. Access every model as a pay-per-use API on Segmind, or chain them in Segmind Workflows to generate, animate, and edit in one automated pipeline with zero infrastructure.

bytedance

Bria Models

Bria delivers enterprise-grade visual AI trained exclusively on licensed data — full IP safety for commercial use. The lineup spans text-to-image, background removal and replacement, object erasing, upscaling, lifestyle product shots, and vector graphics. Ideal for e-commerce and marketing where compliance is non-negotiable. Use Segmind APIs to integrate any Bria model with a single endpoint call, or build Segmind Workflows that chain background removal, scene generation, and upscaling into one automated pipeline.

bria

Ideogram Models

Ideogram leads in AI text rendering — logos, posters, and typography come out crisp every time. Beyond text, it excels at photorealism, illustration, and consistent character generation. Key models include Ideogram 3 for sharp text-to-image, remix and editing, background replacement, reframing, and upscaling. Use Segmind APIs to add accurate text-in-image generation to your product with one endpoint, or build Segmind Workflows that generate, reframe for multiple formats, and upscale — all in a single automated pipeline.

ideogram

MiniMax Models

MiniMax spans video, live avatars, and music generation in one multimodal suite. Hailuo 2 produces cinematic video with smooth motion, Director offers scene-level camera and lighting control, Live creates real-time speaking avatars, and Music 01 composes original tracks from text. Perfect for content creators and media teams needing multiple creative modalities. Access all models via Segmind APIs, or use Segmind Workflows to generate video, add avatar narration, and compose a soundtrack in one automated pipeline.

minimax

ElevenLabs Models

ElevenLabs produces the most lifelike synthetic speech available — natural intonation and emotion across dozens of languages. Beyond TTS, it offers instant voice cloning, speech-to-speech conversion, multi-speaker dialogue with timestamps, audio isolation, and voice design from text descriptions. Ideal for podcasts, games, audiobooks, and accessibility. Integrate any ElevenLabs model via Segmind APIs with a single call, or build Segmind Workflows that clone voices, generate dialogue, and produce timestamped transcripts in one automated pipeline.

elevenlabs

OpenAI Models

OpenAI powers the world's most adopted AI models. GPT-5 and GPT-4o deliver frontier language, reasoning, and multimodal capabilities. The o1 series handles complex step-by-step reasoning, while GPT Image 1 generates and edits photorealistic images with native text rendering. Fast variants like GPT-5 Mini serve high-throughput use cases. Access every model via Segmind APIs — no separate OpenAI account needed. Use Segmind Workflows to chain GPT for planning with GPT Image for visuals in one automated pipeline.

openai

Google Models

Google offers the broadest multimodal AI portfolio — language, image, and video from one provider. Gemini 2.5 Pro and Flash deliver frontier reasoning with massive context windows. Imagen 4 produces photorealistic images with accurate text rendering. The Veo family (Veo 2, 3, 3.1) generates cinematic video with realistic motion and natural audio. Access Gemini, Imagen, and Veo via Segmind APIs — no Google Cloud credentials needed. Chain them in Segmind Workflows for end-to-end content pipelines from strategy to video, fully automated.

google

Kling Models

Kling leads in AI video generation with cinematic quality and creative special effects. Kling 2.5 Turbo delivers fast video from text or images; Kling 2.1 offers higher fidelity and longer durations. Image-to-video models animate stills into dynamic scenes. Kling Elements adds viral-ready effects — hugs, bloom, heart gestures — perfect for social media and marketing. Generate videos, apply effects, or create AI avatars via Segmind APIs, or use Segmind Workflows to automate full video pipelines in one sequence.

kling

Luma Models

Luma Labs creates AI with deep physical-world understanding — exceptional realism in lighting, reflections, and motion. Ray 2 generates cinematic video with physically accurate dynamics. Photon produces photographic-quality stills with natural depth of field, ideal for product visualization and editorial content. Photon Flash offers the same quality at faster speeds. Integrate any Luma model via Segmind APIs with a single call, or chain Photon and Ray in Segmind Workflows to generate, animate, and resize — all in one automated pipeline.

luma

Runway Models

Runway is the professional standard for AI video, trusted by Hollywood studios and agencies. Gen-4 Aleph delivers cinematic video with precise camera and scene control. Gen-4 Turbo enables near-real-time generation for iterative workflows. Gen-4 Image maintains style and character consistency across scenes using reference images. All models are tuned for narrative content and cinematic storytelling. Access every Runway model via Segmind APIs — no subscription needed — or combine them in Segmind Workflows for automated production pipelines.

runway

Grok Models

Grok is xAI's flagship model built for truthfulness, real-time knowledge, and unfiltered reasoning. Grok 2 delivers sharp responses with strong coding, analysis, and creative performance. Grok 2 Vision adds multimodal understanding for image analysis and document comprehension. Both models access real-time information, making them ideal for current events and fact-checking. Add Grok to your app via Segmind APIs with a single call, or integrate it into Segmind Workflows alongside image and video models for intelligent multimodal automation.

grok

Runway ML AI Models

Access Runway Gen-4 Aleph, Gen-4 Turbo, Gen-4 Image, and Gen-3 Alpha Turbo on Segmind. Professional AI video via pay-per-use API.

runway ml