Collections [52]
Browse curated collections of AI models — organized by use case, model family, and provider. Each collection groups the best models for a specific task or category.
Use Cases & Model Families
Best Video Models
Explore the best text-to-video and image-to-video AI models on Segmind. Compare Kling, Veo, Wan, LTX, and more via a single pay-per-use API.
Best Image Models
Compare the best AI image generation models — FLUX, SD3, Ideogram, Recraft, and more — via a single pay-per-use API on Segmind.
LoRA Generation
Generate stylized images with LoRA models on Segmind. Flux Realism LoRA, Vector LoRA, Multi-LoRA, and more via a simple API.
Speech Generation
Access the best TTS and voice synthesis models on Segmind — ElevenLabs, Gemini TTS, Chatterbox, and more via a single API.
Lipsync Avatar Models
Create talking avatar videos with AI lipsync models on Segmind. Kling Avatar, HeyGen, Sync.so, and more via a simple API.
Image Editing Models
Transform and edit images with AI on Segmind. FLUX Kontext, Qwen Image Edit, BRIA, and 30+ models via a single API.
Ultra Models
Segmind's Ultra Selection: the highest-quality AI models across image, video, audio, and more. Access via pay-per-use API.
First and Last Frame Video
Generate AI videos with precise first and last frame control. Kling, Wan, LTX, and more image-to-video models on Segmind.
Remove Anything
Remove backgrounds, objects, watermarks, and anything else with AI on Segmind. BRIA, Qwen Eraser, and more via a simple API.
Swap Anything
Swap faces, outfits, and styles with AI on Segmind. FaceSwap, Virtual Try-On, Style Transfer, and more via a single API.
Video Edit
Edit, restyle, and transform videos with AI on Segmind. Kling O3 Video Edit, LTX Retake, BRIA Video, and more via API.
3D Creation
Generate 3D objects and meshes from images with AI on Segmind. Hunyuan3D, SAM 3D, and more via a simple API.
Upscale Image
Upscale images to 4x or 8x resolution with AI on Segmind. ClarityAI, Topaz, Nomos, and ESRGAN models via a simple API.
Generate Music
Generate original music with AI on Segmind. ACE-Step, MiniMax Music, MusicGen, and more via a simple pay-per-use API.
Enhance Videos
Enhance, upscale, and restore video quality with AI on Segmind. FlashVSR, frame interpolation, Topaz Video, and more via API.
Object Detection & Segmentation
Detect and segment objects with AI on Segmind. SAM, SAM3 Video, SAM V2, and more segmentation models via a simple API.
Motion Control Models
Generate AI videos with precise camera and motion control on Segmind. Kling Motion Control, MotionCtrl SVD, and more.
Audio for Video
Generate audio, sound effects, and music for your videos with AI on Segmind. TTS, audio isolation, and sound generation via API.
Content Detection Models
Detect NSFW content, watermarks, deepfakes, and more with AI on Segmind. Content detection models via a simple pay-per-use API.
Faceswap Models
Swap faces in images and videos with AI on Segmind. HyperSwap, FaceSwap V5, multi-face swap, and more via a simple API.
Qwen Image 2 Models
Access Qwen Image 2 generation and editing models on Segmind. Qwen Image Edit Plus, multi-LoRA, relighting, and more via a single API.
Wan 2.6 Models
Generate high-quality AI video with Wan 2.6 on Segmind. Text-to-video, image-to-video, and first/last-frame control via API.
Kling O3 Models
Generate and edit videos with Kling O3 on Segmind. Text-to-video, image-to-video, video edit, and reference modes via API.
Wan 2.5 Models
Access Wan 2.5 text-to-video and image-to-video models on Segmind. High-quality, fast video generation via a single API.
Seedream AI Models
Generate photorealistic images with multilingual text using ByteDance Seedream on Segmind. 4K output, image editing, and more via API.
Wan 2.2 Models
Access Wan 2.2 Fast and Flash video generation models on Segmind. Text-to-video and image-to-video via a simple API.
Dreamina AI Models
Access CapCut Dreamina AI image generation and editing models on Segmind. High-quality visuals for content creators via API.
Flux Image Tools
Access the complete FLUX model lineup on Segmind — Flux Dev, Pro, Ultra, Fill, Canny, Depth, Redux, and more via a single API.
Flux Kontext Models
Edit images with natural language using FLUX Kontext on Segmind. Kontext Dev, Pro, Max, and Multi-Image variants via a single API.
Wan 2.1 Video Models
Access Wan 2.1 text-to-video and image-to-video models in 480p and 720p on Segmind. Open-source video generation via API.
Hunyuan Models
Generate high-quality video and 3D assets with Tencent Hunyuan models on Segmind. HunyuanVideo, Hunyuan3D-2, and more via API.
Vidu Models
Generate reference-consistent AI video with Vidu on Segmind. Vidu Q1 Reference to Video and Vidu Template via a simple API.
Qwen AI Models
Access Alibaba Qwen image generation, editing, and vision-language models on Segmind. Qwen Image, Edit Plus, Qwen2-VL, and more via API.
PixVerse AI Models
Create dynamic AI videos with PixVerse on Segmind. PixVerse 5, effects, transitions, lipsync, and more via a pay-per-use API.

Veo Models
Generate cinematic videos with Google Veo models on Segmind. Compare Veo 2, Veo 3, and more.
Nano Banana
Try Nano Banana ultra-fast image generation models on Segmind. Optimized for speed and efficiency.
Seedance Video Models
Create professional AI videos with ByteDance Seedance models on Segmind.
Providers
Alibaba Models
Alibaba delivers a comprehensive multimodal AI suite — Qwen for image generation and editing, Wan for state-of-the-art video synthesis, and vision-language models for intelligent visual understanding. The Qwen Image Edit Plus series offers surgical image editing with multi-LoRA, relighting, product photography, and more. Wan 2.6 produces high-resolution video with smooth motion and keyframe control. Access every Alibaba model as a pay-per-use API on Segmind, or chain them in Segmind Workflows for automated content pipelines.
Black Forest Labs Models
Black Forest Labs created FLUX — one of the most capable and widely adopted text-to-image systems in the world. The FLUX lineup spans Flux Dev and Schnell for development, Flux Pro and Ultra for production, Flux 2 for the latest generation, plus specialized tools: Fill for inpainting, Canny and Depth for guided generation, Redux for style conditioning, and Kontext for instruction-following image editing. Access every FLUX model via Segmind APIs, or chain them in Segmind Workflows for automated image production.
Meta Models
Meta is a leading source of open-weight AI, with the Llama model family, SAM for universal image segmentation, and MusicGen for AI music creation. Llama models deliver frontier language capabilities with full open-weight access. SAM and SAM3 segment any object in images and video with remarkable accuracy. MusicGen composes original tracks from text descriptions. Access every Meta model via Segmind APIs with a single endpoint call.
ByteDance Models
ByteDance offers a full creative AI suite: Seedream for photorealistic image generation with multilingual text rendering, Seedance for cinematic video synthesis, and SeedEdit for precise image editing without full regeneration. Together they cover the entire visual pipeline from concept to polished output. Access every model as a pay-per-use API on Segmind, or chain them in Segmind Workflows to generate, animate, and edit in one automated pipeline with no infrastructure to manage.

Bria Models
Bria delivers enterprise-grade visual AI trained exclusively on licensed data — full IP safety for commercial use. The lineup spans text-to-image, background removal and replacement, object erasing, upscaling, lifestyle product shots, and vector graphics. Ideal for e-commerce and marketing where compliance is non-negotiable. Use Segmind APIs to integrate any Bria model with a single endpoint call, or build Segmind Workflows that chain background removal, scene generation, and upscaling into one automated pipeline.
Ideogram Models
Ideogram leads in AI text rendering — logos, posters, and typography come out crisp every time. Beyond text, it excels at photorealism, illustration, and consistent character generation. Key models include Ideogram 3 for sharp text-to-image, remix and editing, background replacement, reframing, and upscaling. Use Segmind APIs to add accurate text-in-image generation to your product with one endpoint, or build Segmind Workflows that generate, reframe for multiple formats, and upscale — all in a single automated pipeline.

MiniMax Models
MiniMax spans video, live avatars, and music generation in one multimodal suite. Hailuo 2 produces cinematic video with smooth motion, Director offers scene-level camera and lighting control, Live creates real-time speaking avatars, and Music 01 composes original tracks from text. Perfect for content creators and media teams needing multiple creative modalities. Access all models via Segmind APIs, or use Segmind Workflows to generate video, add avatar narration, and compose a soundtrack in one automated pipeline.
ElevenLabs Models
ElevenLabs produces the most lifelike synthetic speech available — natural intonation and emotion across dozens of languages. Beyond TTS, it offers instant voice cloning, speech-to-speech conversion, multi-speaker dialogue with timestamps, audio isolation, and voice design from text descriptions. Ideal for podcasts, games, audiobooks, and accessibility. Integrate any ElevenLabs model via Segmind APIs with a single call, or build Segmind Workflows that clone voices, generate dialogue, and produce timestamped transcripts in one automated pipeline.

OpenAI Models
OpenAI builds some of the world's most widely adopted AI models. GPT-5 and GPT-4o deliver frontier language, reasoning, and multimodal capabilities. The o1 series handles complex step-by-step reasoning, while GPT Image 1 generates and edits photorealistic images with native text rendering. Fast variants like GPT-5 Mini serve high-throughput use cases. Access every model via Segmind APIs, with no separate OpenAI account needed. Use Segmind Workflows to chain GPT for planning with GPT Image for visuals in one automated pipeline.

Google Models
Google offers the broadest multimodal AI portfolio — language, image, and video from one provider. Gemini 2.5 Pro and Flash deliver frontier reasoning with massive context windows. Imagen 4 produces photorealistic images with accurate text rendering. The Veo family (Veo 2, 3, 3.1) generates cinematic video with realistic motion and natural audio. Access Gemini, Imagen, and Veo via Segmind APIs — no Google Cloud credentials needed. Chain them in Segmind Workflows for end-to-end content pipelines from strategy to video, fully automated.
Kling Models
Kling leads in AI video generation with cinematic quality and creative special effects. Kling 2.5 Turbo delivers fast video from text or images; Kling 2.1 offers higher fidelity and longer durations. Image-to-video models animate stills into dynamic scenes. Kling Elements adds viral-ready effects — hugs, bloom, heart gestures — perfect for social media and marketing. Generate videos, apply effects, or create AI avatars via Segmind APIs, or use Segmind Workflows to automate full video pipelines in one sequence.
Luma Models
Luma Labs creates AI with deep physical-world understanding — exceptional realism in lighting, reflections, and motion. Ray 2 generates cinematic video with physically accurate dynamics. Photon produces photographic-quality stills with natural depth of field, ideal for product visualization and editorial content. Photon Flash offers the same quality at faster speeds. Integrate any Luma model via Segmind APIs with a single call, or chain Photon and Ray in Segmind Workflows to generate, animate, and resize — all in one automated pipeline.
Runway Models
Runway is the professional standard for AI video, trusted by Hollywood studios and agencies. Gen-4 Aleph delivers cinematic video with precise camera and scene control. Gen-4 Turbo enables near-real-time generation for iterative workflows. Gen-4 Image maintains style and character consistency across scenes using reference images. All models are tuned for narrative content and cinematic storytelling. Access every Runway model via Segmind APIs — no subscription needed — or combine them in Segmind Workflows for automated production pipelines.
Grok Models
Grok is xAI's flagship model built for truthfulness, real-time knowledge, and unfiltered reasoning. Grok 2 delivers sharp responses with strong coding, analysis, and creative performance. Grok 2 Vision adds multimodal understanding for image analysis and document comprehension. Both models access real-time information, making them ideal for current events and fact-checking. Add Grok to your app via Segmind APIs with a single call, or integrate it into Segmind Workflows alongside image and video models for intelligent multimodal automation.
Runway ML AI Models
Access Runway Gen-4 Aleph, Gen-4 Turbo, Gen-4 Image, and Gen-3 Alpha Turbo on Segmind. Professional AI video via pay-per-use API.