# Segmind API — Model Directory

> Segmind provides serverless GPU inference APIs for 200+ generative AI models including image generation, video generation, audio, LLMs, and more. Pay-per-use pricing with no infrastructure to manage.

- **Base URL**: `https://api.segmind.com/v1/{model_slug}`
- **Authentication**: Bearer token (API key)
- **Docs**: https://docs.segmind.com

For detailed API documentation, parameters, pricing, and code examples for any model, fetch:
`https://www.segmind.com/models/{slug}/llms.txt`

---

## Text Generation (LLM)

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Claude 3 Haiku | claude-3-haiku | Claude 3 Haiku, the fastest and most cost-effective model LLM from Anthropic, delivers instant responses and image analy | Text Generation (LLM) | $0.005565155941967141 | 4.326s | [llms.txt](https://www.segmind.com/models/claude-3-haiku/llms.txt) |
| Claude 3 Opus | claude-3-opus | Claude 3 Opus is an LLM pushing the limits of language understanding. It excels at complex tasks, generates human-qualit | Text Generation (LLM) | $0.08819386837881218 | 25.017s | [llms.txt](https://www.segmind.com/models/claude-3-opus/llms.txt) |
| Claude 3.5 Sonnet | claude-3.5-sonnet | Claude 3.5 Sonnet represents a significant advancement in AI language models, combining speed, accuracy, and visual reas | Text Generation (LLM) | $0.03664767260936965 | 11.20302s | [llms.txt](https://www.segmind.com/models/claude-3.5-sonnet/llms.txt) |
| Claude 4 Sonnet | claude-4-sonnet | Advanced coding and multi-step agentic reasoning model. | Text Generation (LLM) | $0.021433145101663584 | 11.23375s | [llms.txt](https://www.segmind.com/models/claude-4-sonnet/llms.txt) |
| Claude 4.5 Sonnet | claude-4.5-sonnet | Claude Sonnet 4.5 empowers developers with advanced coding and reasoning for complex software solutions. | Text Generation (LLM) | $0.04551593101182655 | 23.52418s | [llms.txt](https://www.segmind.com/models/claude-4.5-sonnet/llms.txt) |
| Claude Opus 4.7 | claude-opus-4.7 | Anthropic's most capable AI model excelling at agentic coding, complex reasoning, and high-resolution vision with a 1M-t | Text Generation (LLM) | $0.05 | - | [llms.txt](https://www.segmind.com/models/claude-opus-4.7/llms.txt) |
| DeepSeek Chat | deepseek-chat | DeepSeek V3 combines cutting-edge AI technology with practical usability. Featuring a 671B parameter architecture, enhan | Text Generation (LLM) | $0.0012820433904013474 | 34.10292s | [llms.txt](https://www.segmind.com/models/deepseek-chat/llms.txt) |
| DeepSeek R1 | deepseek-reasoner | DeepSeek-R1 is a cutting-edge AI reasoning model that combines reinforcement learning with supervised fine-tuning. Excel | Text Generation (LLM) | $0.04687098227507291 | 67.21856s | [llms.txt](https://www.segmind.com/models/deepseek-reasoner/llms.txt) |
| Gemini 2 Flash | gemini-2-flash-image-generation | With Gemini 2 Flash, create consistent visuals, edit images conversationally, and render text accurately. | Text Generation (LLM) | $0.05416053896353166 | 8.60779s | [llms.txt](https://www.segmind.com/models/gemini-2-flash-image-generation/llms.txt) |
| Gemini 2.5 Flash | gemini-2.5-flash | Multimodal AI with transparent reasoning, fast and affordable. | Text Generation (LLM) | $0.003799036607142857 | 11.07565s | [llms.txt](https://www.segmind.com/models/gemini-2.5-flash/llms.txt) |
| Gemini 2.5 PRO | gemini-2.5-pro | Complex multimodal reasoning across diverse inputs and formats. | Text Generation (LLM) | $0.025429340671641796 | 27.37359s | [llms.txt](https://www.segmind.com/models/gemini-2.5-pro/llms.txt) |
| Gemini 3 Pro | gemini-3-pro | Autonomous multimodal AI for complex reasoning and coding. | Text Generation (LLM) | $0.0031178443433199786 | 31.44143s | [llms.txt](https://www.segmind.com/models/gemini-3-pro/llms.txt) |
| GPT 4 | gpt-4 | GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have be | Text Generation (LLM) | $0.028924545905172412 | 13.14806s | [llms.txt](https://www.segmind.com/models/gpt-4/llms.txt) |
| GPT 4 turbo | gpt-4-turbo | GPT-4 outperforms both previous large language models and as of 2023, most state-of-the-art systems (which often have be | Text Generation (LLM) | $0.005317002659762601 | 9.90926s | [llms.txt](https://www.segmind.com/models/gpt-4-turbo/llms.txt) |
| GPT 4o | gpt-4o | GPT-4o (“o” for “omni”) is our most advanced model. It is multimodal (accepting text or image inputs and outputting text | Text Generation (LLM) | $0.007496491503571936 | 6.0069s | [llms.txt](https://www.segmind.com/models/gpt-4o/llms.txt) |
| GPT 5 | gpt-5 | GPT-5 automates complex coding tasks with integrated tools for seamless software development and deployment. | Text Generation (LLM) | $0.04185687687634024 | 61.68263s | [llms.txt](https://www.segmind.com/models/gpt-5/llms.txt) |
| GPT 5 Mini | gpt-5-mini | Rapid high-quality AI across text, images, and files. | Text Generation (LLM) | $0.01020766289791438 | 47.90171s | [llms.txt](https://www.segmind.com/models/gpt-5-mini/llms.txt) |
| GPT 5 Nano | gpt-5-nano | Ultra-fast LLM responses for real-time AI applications. | Text Generation (LLM) | $0.0010103470974352404 | 30.08176s | [llms.txt](https://www.segmind.com/models/gpt-5-nano/llms.txt) |
| GPT 5.1 | gpt-5.1 | Precise code review and developer workflow assistant. | Text Generation (LLM) | $0.0072179697892271666 | 9.00481s | [llms.txt](https://www.segmind.com/models/gpt-5.1/llms.txt) |
| GPT 5.2 | gpt-5.2 | Advanced reasoning with multimodal input for precise tasks. | Text Generation (LLM) | $0.012528342994100295 | 20.53192s | [llms.txt](https://www.segmind.com/models/gpt-5.2/llms.txt) |
| GPT 5.4 | gpt-5.4 | Most powerful GPT for frontier reasoning and multimodal tasks. | Text Generation (LLM) | $0.009438483055555556 | 4.55805s | [llms.txt](https://www.segmind.com/models/gpt-5.4/llms.txt) |
| GPT 5.4 Mini | gpt-5.4-mini | Fastest efficient model for coding and computer-use tasks. | Text Generation (LLM) | $0.0005428316746203904 | 5.14682s | [llms.txt](https://www.segmind.com/models/gpt-5.4-mini/llms.txt) |
| GPT 5.4 Nano | gpt-5.4-nano | Flagship-class AI for classification and extraction tasks. | Text Generation (LLM) | $0.0029356314497417857 | 3.50369s | [llms.txt](https://www.segmind.com/models/gpt-5.4-nano/llms.txt) |
| Grok 2 | grok-2 | Grok-2, xAI's latest language model, boasts superior reasoning, coding, and chat capabilities, outperforming many popula | Text Generation (LLM) | $0.0046698497191011235 | 8.51163s | [llms.txt](https://www.segmind.com/models/grok-2/llms.txt) |
| Grok 2 Vision | grok-2-vision | Grok-2, xAI's latest language model with vision understanding. | Text Generation (LLM) | $0.003865039487565939 | 5.03739s | [llms.txt](https://www.segmind.com/models/grok-2-vision/llms.txt) |
| Kimi K2 Instruct 0905 | kimi-k2-instruct-0905 | Deep contextual understanding and complex code generation. | Text Generation (LLM) | $0.001018350267379679 | 2.16009s | [llms.txt](https://www.segmind.com/models/kimi-k2-instruct-0905/llms.txt) |
| Llama 3 70b | llama-v3-70b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.001026968779153933 | 2.38864s | [llms.txt](https://www.segmind.com/models/llama-v3-70b-instruct/llms.txt) |
| Llama 3 8b | llama-v3-8b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.0006315522434414854 | 1.33946s | [llms.txt](https://www.segmind.com/models/llama-v3-8b-instruct/llms.txt) |
| Llama 3.1 405b | llama-v3p1-405b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.006355001054333309 | 6.60671s | [llms.txt](https://www.segmind.com/models/llama-v3p1-405b-instruct/llms.txt) |
| Llama 3.1 70b | llama-v3p1-70b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.0019101056600713583 | 2.70658s | [llms.txt](https://www.segmind.com/models/llama-v3p1-70b-instruct/llms.txt) |
| Llama 3.1 8b | llama-v3p1-8b-instruct | Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and inst | Text Generation (LLM) | $0.00002702273886072785 | 1.66543s | [llms.txt](https://www.segmind.com/models/llama-v3p1-8b-instruct/llms.txt) |
| Llama 4 Maverick Instruct Basic | llama4-maverick-instruct-basic | Llama 4 Maverick Instruct Basic is a 400B parameter powerhouse with 128 experts for unparalleled text and image understa | Text Generation (LLM) | $0.0010211187845303867 | 2.47979s | [llms.txt](https://www.segmind.com/models/llama4-maverick-instruct-basic/llms.txt) |
| Llama 4 Scout Instruct Basic | llama4-scout-instruct-basic | Unlock powerful multimodal AI with Llama 4 Scout basic, a 17 billion active parameters model offering leading text & ima | Text Generation (LLM) | $0.0007072039722329349 | 2.66331s | [llms.txt](https://www.segmind.com/models/llama4-scout-instruct-basic/llms.txt) |
| Mixtral 8x22b | mixtral-8x22b-instruct | Mistral MoE 8x22B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following. | Text Generation (LLM) | $0.0005056373593842508 | 2.73773s | [llms.txt](https://www.segmind.com/models/mixtral-8x22b-instruct/llms.txt) |
| Mixtral 8x7b | mixtral-8x7b-instruct | Mistral MoE 8x7B Instruct v0.1 model with Sparse Mixture of Experts. Fine tuned for instruction following. | Text Generation (LLM) | $0.0002638041493775933 | 1.8292s | [llms.txt](https://www.segmind.com/models/mixtral-8x7b-instruct/llms.txt) |
| O4 Mini | o4-mini | OpenAI o4-mini enhances decision-making by processing text and images with advanced reasoning capabilities. | Text Generation (LLM) | $0.006826813091158328 | 12.00599s | [llms.txt](https://www.segmind.com/models/o4-mini/llms.txt) |
| OpenAI o1-mini | o1-mini | o1-mini by OpenAI provides high-performance reasoning and coding capabilities. Ideal for developers and businesses seeki | Text Generation (LLM) | $0.08289552217591578 | 30.12678s | [llms.txt](https://www.segmind.com/models/o1-mini/llms.txt) |
| OpenAI o1-preview | o1-preview | o1-preview by OpenAI, is a powerful AI model that can tackle complex problems with exceptional accuracy and efficiency. | Text Generation (LLM) | $0.34304200481254693 | 51.96905s | [llms.txt](https://www.segmind.com/models/o1-preview/llms.txt) |
| OpenAI o3 | o3 | Frontier reasoning model for complex coding, math, and science. | Text Generation (LLM) | $0.0050085 | 10.96443s | [llms.txt](https://www.segmind.com/models/o3/llms.txt) |
| OpenAI o3 Mini | o3-mini | Cost-efficient reasoning model for coding, math, and science. | Text Generation (LLM) | $0.00138958 | 3.26736s | [llms.txt](https://www.segmind.com/models/o3-mini/llms.txt) |
| QVQ Max | qvq-max | Chain-of-thought visual reasoning for math, charts, and diagrams. | Text Generation (LLM) | $0.0075934999999999996 | 27.5905s | [llms.txt](https://www.segmind.com/models/qvq-max/llms.txt) |
| Qwen 3 Coder Flash | qwen3-coder-flash | High-volume code generation with 1M token context window. | Text Generation (LLM) | $0.0007448666666666667 | 4.72234s | [llms.txt](https://www.segmind.com/models/qwen3-coder-flash/llms.txt) |
| Qwen 3 Coder Plus | qwen3-coder-plus | Generates, debugs, and refactors entire codebases efficiently. | Text Generation (LLM) | $0.0011938 | 3.80229s | [llms.txt](https://www.segmind.com/models/qwen3-coder-plus/llms.txt) |
| Qwen 3 Max | qwen3-max | 1T-parameter LLM with hybrid reasoning and 262K context. | Text Generation (LLM) | $0.00105925 | 5.62272s | [llms.txt](https://www.segmind.com/models/qwen3-max/llms.txt) |
| Qwen 3 VL Flash | qwen3-vl-flash | Fast, affordable vision-language model with 262K context OCR. | Text Generation (LLM) | $0.00019149999999999997 | 3.97372s | [llms.txt](https://www.segmind.com/models/qwen3-vl-flash/llms.txt) |
| Qwen 3 VL Plus | qwen3-vl-plus | Powerful visual QA and document analysis from images. | Text Generation (LLM) | $0.0009447333333333334 | 7.08631s | [llms.txt](https://www.segmind.com/models/qwen3-vl-plus/llms.txt) |
| Qwen 3.5 Flash | qwen3.5-flash | Fast multimodal AI processing text, images, and video affordably. | Text Generation (LLM) | $0.0007941999999999999 | 31.56291s | [llms.txt](https://www.segmind.com/models/qwen3.5-flash/llms.txt) |
| Qwen 3.5 Plus | qwen3.5-plus | Multimodal 1M context AI for image, video, and text. | Text Generation (LLM) | $0.003051714285714286 | 17.11801s | [llms.txt](https://www.segmind.com/models/qwen3.5-plus/llms.txt) |
| Qwen Flash | qwen-flash | Fastest low-cost LLM with 1M context for high-volume tasks. | Text Generation (LLM) | $0.00011157241379310346 | 4.61847s | [llms.txt](https://www.segmind.com/models/qwen-flash/llms.txt) |
| Qwen Plus | qwen-plus | Mid-tier 1M context LLM for summarization and content tasks. | Text Generation (LLM) | $0.00014566666666666667 | 2.12671s | [llms.txt](https://www.segmind.com/models/qwen-plus/llms.txt) |
| Qwen2 VL 72B Instruct | qwen2-vl-72b-instruct | Qwen2-VL-72B-Instruct is a state-of-the-art multimodal model excelling in image and video understanding, with advanced c | Text Generation (LLM) | $0.005773370372061172 | 7.14185s | [llms.txt](https://www.segmind.com/models/qwen2-vl-72b-instruct/llms.txt) |
| QWEN2-VL-7B-Instruct | qwen2-vl-7b-instruct | The Qwen2-VL-7B-Instruct is a cutting-edge vision-language model with 7 billion parameters, offering advanced capabiliti | Text Generation (LLM) | $0.0012699288110324266 | 36.34406s | [llms.txt](https://www.segmind.com/models/qwen2-vl-7b-instruct/llms.txt) |
| Qwen2.5-VL 32B Instruct | qwen2p5-vl-32b-instruct | Qwen2.5-VL processes text and images seamlessly for advanced multimodal instruction and reasoning. | Text Generation (LLM) | $0.0026978558645024703 | 7.5514s | [llms.txt](https://www.segmind.com/models/qwen2p5-vl-32b-instruct/llms.txt) |
| QwQ Plus | qwq-plus | Deep chain-of-thought reasoning for math, code, and logic. | Text Generation (LLM) | $0.008212777777777777 | 56.4063s | [llms.txt](https://www.segmind.com/models/qwq-plus/llms.txt) |

## Image-to-Video Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| AI Face Swap (image and video) | ai-face-swap | AI Face Swap: Effortlessly replace faces online. Fine-tune swaps with advanced controls for age, gender, and resolution. | Image-to-Video Generation | $0.10309356862457175 | 29.51768s | [llms.txt](https://www.segmind.com/models/ai-face-swap/llms.txt) |
| Bytedance HuMo: Human-Centric Video Generation | bytedance-humo | HuMo generates high-quality, human-centric videos from text, images, and audio with unparalleled control and precision. | Image-to-Video Generation | $5 | - | [llms.txt](https://www.segmind.com/models/bytedance-humo/llms.txt) |
| Cog videoX Image To Video | cog-video-5b-i2v | CogVideoX image-to-video is a cutting-edge AI model that converts static images into dynamic, high-quality videos. Perfe | Image-to-Video Generation | $0.35561489718190126 | 355.73991s | [llms.txt](https://www.segmind.com/models/cog-video-5b-i2v/llms.txt) |
| Easy Animate | easy-animate | Easy Animate is a state-of-the-art image to animation model to convert static images into dynamic animations with remar | Image-to-Video Generation | $0.7981060098555378 | 208.63524s | [llms.txt](https://www.segmind.com/models/easy-animate/llms.txt) |
| Google Veo 2 Image To Video | veo-2-image2video | Discover Google Veo 2, an AI-powered image-to-video model with 4K resolution, realistic motion, and cinematic effects fo | Image-to-Video Generation | $3.957629085982983 | 40.58279s | [llms.txt](https://www.segmind.com/models/veo-2-image2video/llms.txt) |
| Hailuo 02 Fast | hailuo-02-fast | Transform any static image into a captivating, high-quality video clip effortlessly. | Image-to-Video Generation | $0.1612957434649876 | 79.67199s | [llms.txt](https://www.segmind.com/models/hailuo-02-fast/llms.txt) |
| Hailuo 2.3 | hailuo-2.3 | Hyper-realistic videos from text with fluid character motion. | Image-to-Video Generation | $0.529004739336493 | 140.66875s | [llms.txt](https://www.segmind.com/models/hailuo-2.3/llms.txt) |
| Hailuo 2.3 Fast | hailuo-2.3-fast | Professional-quality videos from text and images at speed. | Image-to-Video Generation | $0.3867293777134588 | 114.22468s | [llms.txt](https://www.segmind.com/models/hailuo-2.3-fast/llms.txt) |
| Hallo | hallo | Hallo lets you create portrait videos from single images. | Image-to-Video Generation | $0.4217882754233411 | 303.61238s | [llms.txt](https://www.segmind.com/models/hallo/llms.txt) |
| Heygen Avatar IV | heygen-avatar-iv | Single photo into a lifelike talking avatar video. | Image-to-Video Generation | $2.1682730471698113 | 215.82418s | [llms.txt](https://www.segmind.com/models/heygen-avatar-iv/llms.txt) |
| Higgsfield Image 2 Video | higgsfield-image2video | Transform static images into dynamic, motion-rich videos with unparalleled control and creative depth. | Image-to-Video Generation | $0.6381414141414141 | 152.31308s | [llms.txt](https://www.segmind.com/models/higgsfield-image2video/llms.txt) |
| Higgsfield Speech 2 Video | higgsfield-speech2video | Transform images and audio into dynamic, lip-synced videos for engaging digital content. | Image-to-Video Generation | $1.9714583333333338 | 290.66342s | [llms.txt](https://www.segmind.com/models/higgsfield-speech2video/llms.txt) |
| HyperSwap: Video Faceswap by FaceFusion Labs | video-faceswap-by-facefusion-labs | Realistic face swapping in videos from a single image. | Image-to-Video Generation | $0.08616183774834436 | 55.44278s | [llms.txt](https://www.segmind.com/models/video-faceswap-by-facefusion-labs/llms.txt) |
| InfiniteTalk | infinite-talk | Full-body animation from images synchronized perfectly to audio. | Image-to-Video Generation | $0.46024858417659986 | 301.51975s | [llms.txt](https://www.segmind.com/models/infinite-talk/llms.txt) |
| Kling 2 | kling-2 | Kling 2.0 is an advanced AI video generator (5 and 10 seconds) that creates cinematic, dynamic videos from text or image | Image-to-Video Generation | $2.2962962962962963 | 305.2509s | [llms.txt](https://www.segmind.com/models/kling-2/llms.txt) |
| Kling 2.1 AI Video Generator | kling-2.1 | Kling 2.1 offers hyper-realistic video generation with improved motion, sharper 1080p visuals, and instant restyling cap | Image-to-Video Generation | $0.9007380707174735 | 135.89172s | [llms.txt](https://www.segmind.com/models/kling-2.1/llms.txt) |
| Kling 2.5 Turbo | kling-2.5-turbo | Kling AI 2.5 Turbo generates fluid, cinematic videos from text and images, enhancing content creation and storytelling. | Image-to-Video Generation | $0.56640350877193 | 134.74241s | [llms.txt](https://www.segmind.com/models/kling-2.5-turbo/llms.txt) |
| Kling 2.6 | kling-2.6 | Still images into immersive cinematic videos with synchronized audio. | Image-to-Video Generation | $1.1078084832904884 | 122.43048s | [llms.txt](https://www.segmind.com/models/kling-2.6/llms.txt) |
| Kling 3.0 Pro Image-to-Video | kling-3-pro-image2video | Animated 1080p videos from images with dynamic motion. | Image-to-Video Generation | $1.9405611510791374 | 291.36004s | [llms.txt](https://www.segmind.com/models/kling-3-pro-image2video/llms.txt) |
| Kling 3.0 Standard Image-to-Video | kling-3-standard-image2video | Controlled cinematic 1080p videos from starting images. | Image-to-Video Generation | $1.2414769230769234 | 150.59963s | [llms.txt](https://www.segmind.com/models/kling-3-standard-image2video/llms.txt) |
| Kling AI 1.6 Image to Video | kling-1.6-image2video | Kling AI 1.6 Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Creat | Image-to-Video Generation | $1.0528155725494146 | 289.3756s | [llms.txt](https://www.segmind.com/models/kling-1.6-image2video/llms.txt) |
| Kling AI Image to Video | kling-image2video | Kling AI Image-to-Video is a powerful AI tool that transforms static images into captivating, animated videos. Create hi | Image-to-Video Generation | $0.8641459524125369 | 317.54633s | [llms.txt](https://www.segmind.com/models/kling-image2video/llms.txt) |
| Kling Avatar V2 Standard | kling-v2-standard-avatar | Lifelike video avatars with precise lip synchronization. | Image-to-Video Generation | $2.779702592592593 | 498.74452s | [llms.txt](https://www.segmind.com/models/kling-v2-standard-avatar/llms.txt) |
| Kling bloombloom | kling-bloombloom | Kling AI transforms text and images into dynamic, high-quality video content with realistic motion and sound. | Image-to-Video Generation | $0.9800000000000001 | 193.06476s | [llms.txt](https://www.segmind.com/models/kling-bloombloom/llms.txt) |
| Kling dizzydizzy | kling-dizzydizzy | Kling DizzyDizzy transforms static content into dynamic, high-resolution videos, enhancing engagement and storytelling f | Image-to-Video Generation | $0.9800000000000002 | 199.27494s | [llms.txt](https://www.segmind.com/models/kling-dizzydizzy/llms.txt) |
| Kling Expansion | kling-expansion | Unleash dynamic visuals with Kling Expansion! Effortlessly inflate and stretch elements for surreal and captivating effe | Image-to-Video Generation | $0.9500000000000001 | 118.23583s | [llms.txt](https://www.segmind.com/models/kling-expansion/llms.txt) |
| Kling fuzzyfuzzy | kling-fuzzyfuzzy | Transform your photos instantly into adorable, plush-toy-like visuals with Kling fuzzyfuzzy effect. | Image-to-Video Generation | $0.9635294117647059 | 132.86003s | [llms.txt](https://www.segmind.com/models/kling-fuzzyfuzzy/llms.txt) |
| Kling Heart Gesture | kling-heart-gesture | Express affection visually with Kling AI's heart gesture effect! Input two portraits and instantly create heartwarming v | Image-to-Video Generation | $0.945 | 208.38736s | [llms.txt](https://www.segmind.com/models/kling-heart-gesture/llms.txt) |
| Kling Hug | kling-hug | Create heartwarming videos instantly with Kling hug effect! Generate tender embracing animations. | Image-to-Video Generation | $0.9800000000000001 | 206.59058s | [llms.txt](https://www.segmind.com/models/kling-hug/llms.txt) |
| Kling Kiss | kling-kiss | Create a heartfelt video in seconds with Kling kiss effect! Input two portraits and instantly generate a kissing animati | Image-to-Video Generation | $1.0006434316353892 | 235.845s | [llms.txt](https://www.segmind.com/models/kling-kiss/llms.txt) |
| Kling O1 Image 2 Video | kling-o1-image-to-video | Physics-driven animations from images for creative storytelling. | Image-to-Video Generation | $0.9085748965517243 | 143.44042s | [llms.txt](https://www.segmind.com/models/kling-o1-image-to-video/llms.txt) |
| Kling O1 Reference Image 2 Video | kling-o1-reference-image-to-video | Identity-preserving videos from static images with character reference. | Image-to-Video Generation | $1.345300374531835 | 197.35832s | [llms.txt](https://www.segmind.com/models/kling-o1-reference-image-to-video/llms.txt) |
| Kling O3 Image To Video | kling-o3-image2video | Images to cinematic videos with precise motion control. | Image-to-Video Generation | $1.8423107569721113 | 202.45185s | [llms.txt](https://www.segmind.com/models/kling-o3-image2video/llms.txt) |
| Kling Squish | kling-squish | Transform your visuals with Kling AI squish effect! Easily compress and distort images/videos for playful, exaggerated e | Image-to-Video Generation | $0.967272727272727 | 120.28359s | [llms.txt](https://www.segmind.com/models/kling-squish/llms.txt) |
| Kling V1 Pro AI Avatar | kling-v1-pro-ai-avatar | Dynamic AI avatars with synchronized speech from image. | Image-to-Video Generation | $4.475115916666667 | 732.24765s | [llms.txt](https://www.segmind.com/models/kling-v1-pro-ai-avatar/llms.txt) |
| Kling V1 Standard AI Avatar | kling-v1-standard-ai-avatar | Lifelike AI avatars with precise lip-sync for presentations. | Image-to-Video Generation | $2.2834377333333333 | 460.07325s | [llms.txt](https://www.segmind.com/models/kling-v1-standard-ai-avatar/llms.txt) |
| Kling V2 Pro Avatar | kling-v2-pro-avatar | Talking avatar videos from image and audio, high quality. | Image-to-Video Generation | $5.285169729729731 | 779.46664s | [llms.txt](https://www.segmind.com/models/kling-v2-pro-avatar/llms.txt) |
| Live Portrait | live-portrait | Live Portrait animates static images using a reference driving video through implicit key point based framework, bringin | Image-to-Video Generation | $0.054998455614000394 | 36.03831s | [llms.txt](https://www.segmind.com/models/live-portrait/llms.txt) |
| Live Portrait video to video | live-portrait-video-to-video | Experience the magic of Live Portrait’s Video-to-Video Model! Transform your static images into dynamic videos seamlessl | Image-to-Video Generation | $0.27947058954138704 | 74.44037s | [llms.txt](https://www.segmind.com/models/live-portrait-video-to-video/llms.txt) |
| LTX 2 Fast | ltx-2-fast | Fast, high-quality text-to-video generation by Lightricks. | Image-to-Video Generation | $0.5186933065217391 | 46.42703s | [llms.txt](https://www.segmind.com/models/ltx-2-fast/llms.txt) |
| LTX 2 Pro | ltx-2-pro | High-quality video generation with advanced motion control. | Image-to-Video Generation | $0.6267278331034484 | 69.73345s | [llms.txt](https://www.segmind.com/models/ltx-2-pro/llms.txt) |
| LTX Video | ltx-video | LTX-Video is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produ | Image-to-Video Generation | $0.057473419067595385 | 54.94509s | [llms.txt](https://www.segmind.com/models/ltx-video/llms.txt) |
| Luma Image-to-Video | luma-img-2-video | With Luma's Dream Machine, transform your static images into dynamic videos. It offers high-fidelity video generation, r | Image-to-Video Generation | $0.9472327964860999 | 60.84224s | [llms.txt](https://www.segmind.com/models/luma-img-2-video/llms.txt) |
| Luma Modify Video | modify-video | Transform videos seamlessly with high-fidelity generative edits while preserving original actor performances. | Image-to-Video Generation | $0.6006692185185185 | 164.37768s | [llms.txt](https://www.segmind.com/models/modify-video/llms.txt) |
| Luma Ray flash 2 (720p) | ray-flash-2-720p | Generate stunning 720p videos from text with the Luma ray-flash-2-720p model. Faster & cheaper than Ray 2, offering real | Image-to-Video Generation | $0.4267106325842696 | 60.61618s | [llms.txt](https://www.segmind.com/models/ray-flash-2-720p/llms.txt) |
| Luma Ray Image to Video | luma-ray-img-2-video | With Luma's Ray2 image-to-video, transform your static images into cinematic dynamic videos. | Image-to-Video Generation | $1.6000000000000005 | 125.16943s | [llms.txt](https://www.segmind.com/models/luma-ray-img-2-video/llms.txt) |
| Minimax (Hailuo) Video-01-live | minimax-ai-live | Create stunning animations with Minimax (Hailuo) video-01-live, an AI image-to-video model perfect for Live2D, anime, an | Image-to-Video Generation | $0.625 | 167.32884s | [llms.txt](https://www.segmind.com/models/minimax-ai-live/llms.txt) |
| MiniMax AI (Hailuo) | minimax-ai | With Video-01 by MiniMax, create high-definition videos at 720p resolution and 25fps, featuring cinematic camera movemen | Image-to-Video Generation | $0.6009749412252146 | 177.54842s | [llms.txt](https://www.segmind.com/models/minimax-ai/llms.txt) |
| Minimax Hailou 2 | minimax-hailuo-2 | Generate breathtaking 1080P cinematic videos from text or images with ultra-realistic motion and physics. | Image-to-Video Generation | $0.3875000000000001 | 174.24651s | [llms.txt](https://www.segmind.com/models/minimax-hailuo-2/llms.txt) |
| Motion Control SVD | motionctrl-svd | Motion Control SVD is an innovative deep learning framework that breathes life into static images. By intelligently mana | Image-to-Video Generation | $0.08928216191843766 | 62.73815s | [llms.txt](https://www.segmind.com/models/motionctrl-svd/llms.txt) |
| Muscle Surge | muscle-surge | Instantly add muscle and strength to your videos with Pixverse Muscle Surge effect! | Image-to-Video Generation | $0.41835106382978726 | 45.50548s | [llms.txt](https://www.segmind.com/models/muscle-surge/llms.txt) |
| OVI Image To Video | ovi-i2v | Synchronized video and audio generation from text and images. | Image-to-Video Generation | $0.24981727088122607 | 41.92037s | [llms.txt](https://www.segmind.com/models/ovi-i2v/llms.txt) |
| Pixverse 4.5 Effects | pixverse-4.5-effects | PixVerse 4.5 transforms photos and text into stunning animated videos for impactful storytelling and marketing. | Image-to-Video Generation | $0.3975694444444444 | 46.91432s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-effects/llms.txt) |
| Pixverse 4.5 Transition | pixverse-4.5-transition | PixVerse 4.5 transforms still images into dynamic, captivating videos with seamless transitions. | Image-to-Video Generation | $0.5813148788927336 | 57.82971s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-transition/llms.txt) |
| Pixverse 4.5 Video | pixverse-4.5-video | Pixverse 4.5 transforms static images and text into dynamic, engaging videos for captivating social media content. | Image-to-Video Generation | $0.5827814569536424 | 45.92084s | [llms.txt](https://www.segmind.com/models/pixverse-4.5-video/llms.txt) |
| Pixverse 5 Extend | pixverse-5-extend | Seamlessly extend and continue AI-generated videos. | Image-to-Video Generation | $0.6569148936170213 | 93.2064s | [llms.txt](https://www.segmind.com/models/pixverse-5-extend/llms.txt) |
| Pixverse 5 Transition | pixverse-5-transition | Seamless AI-generated video transitions between scenes. | Image-to-Video Generation | $0.5 | 72.51908s | [llms.txt](https://www.segmind.com/models/pixverse-5-transition/llms.txt) |
| Pixverse 5 Video | pixverse-5-video | Cinematic videos from text and images with photorealism. | Image-to-Video Generation | $0.631159420289855 | 68.81511s | [llms.txt](https://www.segmind.com/models/pixverse-5-video/llms.txt) |
| Pixverse Image to Video | pixverse-image2video | Animate your photos effortlessly with Pixverse Image to Video AI! Upload, add motion prompts and styles. | Image-to-Video Generation | $0.5479166666666667 | 40.69088s | [llms.txt](https://www.segmind.com/models/pixverse-image2video/llms.txt) |
| Pixverse Transition | pixverse-transition | PixVerse V4 transforms static images and text into dynamic, visually stunning videos for creators across various industr | Image-to-Video Generation | $0.8727272727272727 | 57.65282s | [llms.txt](https://www.segmind.com/models/pixverse-transition/llms.txt) |
| Pixverse V6 | pixverse-v6 | 15-second AI videos with native audio and cinematic controls. | Image-to-Video Generation | $0.42328571428571427 | 63.92565s | [llms.txt](https://www.segmind.com/models/pixverse-v6/llms.txt) |
| Runway Gen 4 Turbo | runway-gen4-turbo | Generate videos faster and cheaper with Runway Gen-4 Turbo! Create high-quality text, image, and combined video generati | Image-to-Video Generation | $0.7741486399657315 | 38.03304s | [llms.txt](https://www.segmind.com/models/runway-gen4-turbo/llms.txt) |
| Runway Gen Alpha Turbo Image to Video | runway-gen3-alphaturbo | Runway Gen-3 AlphaTurbo is a cutting-edge AI tool that transforms static images into dynamic videos with exceptional fid | Image-to-Video Generation | $0.5561580559321183 | 27.38178s | [llms.txt](https://www.segmind.com/models/runway-gen3-alphaturbo/llms.txt) |
| SadTalker | sadtalker | Audio-based Lip Synchronization for Talking Head Video | Image-to-Video Generation | $0.17811345022209485 | 108.78657s | [llms.txt](https://www.segmind.com/models/sadtalker/llms.txt) |
| Seedance 1.0 lite i2v | seedance-v1-lite-image-to-video | Seedance 1.0 transforms text and images into engaging 720p dynamic videos with cinematic storytelling. | Image-to-Video Generation | $0.10684891837037773 | 39.49462s | [llms.txt](https://www.segmind.com/models/seedance-v1-lite-image-to-video/llms.txt) |
| Seedance 1.0 Pro | seedance-pro | Seedance Pro transforms text and images into engaging 720p dynamic videos with cinematic storytelling. | Image-to-Video Generation | $0.3520381860174778 | 62.21196s | [llms.txt](https://www.segmind.com/models/seedance-pro/llms.txt) |
| Seedance 1.0 Pro Fast | seedance-1.0-pro-fast | Cinematic videos from text and images at ultra speed. | Image-to-Video Generation | $0.2462202225806452 | 48.66931s | [llms.txt](https://www.segmind.com/models/seedance-1.0-pro-fast/llms.txt) |
| Seedance 1.5 Pro | seedance-1.5-pro | Synchronized video and audio generation for dynamic storytelling. | Image-to-Video Generation | $0.4061963611901681 | 99.53297s | [llms.txt](https://www.segmind.com/models/seedance-1.5-pro/llms.txt) |
| Seedance 2.0 | seedance-2.0 | Cinematic AI videos with native audio and multi-shot narratives. | Image-to-Video Generation | $1.2124749569620252 | 189.30402s | [llms.txt](https://www.segmind.com/models/seedance-2.0/llms.txt) |
| Seedance 2.0 Fast | seedance-2.0-fast | Professional-grade video creation model with native audio, similar to SeeDance 2.0 but faster and cheaper. | Image-to-Video Generation | $0.7688965999999999 | 116.85878s | [llms.txt](https://www.segmind.com/models/seedance-2.0-fast/llms.txt) |
| Sora 2 | sora-2 | Stunning dynamic videos from detailed text descriptions. | Image-to-Video Generation | $1.015919811320755 | 178.74569s | [llms.txt](https://www.segmind.com/models/sora-2/llms.txt) |
| Sora 2 Pro | sora-2-pro | Cinematic-quality videos from text with temporal consistency. | Image-to-Video Generation | $3.135135135135134 | 400.42705s | [llms.txt](https://www.segmind.com/models/sora-2-pro/llms.txt) |
| Stable Video Diffusion | svd | Takes image as input and returns a video. | Image-to-Video Generation | $0.165895590295991 | 29.6271s | [llms.txt](https://www.segmind.com/models/svd/llms.txt) |
| Tooncrafter | tooncrafter | Create videos from illustrated input images | Image-to-Video Generation | $0.12288504435102474 | 108.41335s | [llms.txt](https://www.segmind.com/models/tooncrafter/llms.txt) |
| V Express | v-express | V-Express lets you create portrait videos from single images. | Image-to-Video Generation | $0.26351073672376873 | 196.28656s | [llms.txt](https://www.segmind.com/models/v-express/llms.txt) |
| Veo 3.1 | veo-3.1 | Static images into high-quality videos with synchronized audio. | Image-to-Video Generation | $2.161127895266869 | 110.49847s | [llms.txt](https://www.segmind.com/models/veo-3.1/llms.txt) |
| Veo 3.1 Fast | veo-3.1-fast | Fast image-to-video at 1080p with native audio. | Image-to-Video Generation | $0.8591154018859455 | 99.31184s | [llms.txt](https://www.segmind.com/models/veo-3.1-fast/llms.txt) |
| Veo 3.1 Lite | veo-3.1-lite | Affordable text-to-video with audio, powered by Google. | Image-to-Video Generation | $0.8009259259259259 | 49.51556s | [llms.txt](https://www.segmind.com/models/veo-3.1-lite/llms.txt) |
| Video Faceswap | videofaceswap | Video Faceswap is a powerful tool for creators, filmmakers, and meme enthusiasts. With this innovative technology, you | Image-to-Video Generation | $0.4118847845862569 | 184.88331s | [llms.txt](https://www.segmind.com/models/videofaceswap/llms.txt) |
| Video Frame Interpolation | video-frame-interpolation | FILM synthesizes smooth, high-quality intermediate frames for fluid motion in videos with significant movement. | Image-to-Video Generation | $5 | - | [llms.txt](https://www.segmind.com/models/video-frame-interpolation/llms.txt) |
| Video Stitch | video-stitch | Revolutionize your video editing with the Video Stitch Model. 
Seamlessly stitch clips, add captivating audio, and create | Image-to-Video Generation | $0.002885776086843744 | 30.38754s | [llms.txt](https://www.segmind.com/models/video-stitch/llms.txt) | | Video Tryon | video-tryon | Video Tryon is Segmind’s next-generation AI video model for instant virtual try-on, allowing users to visualize any outf | Image-to-Video Generation | $1.6059995424855493 | 205.12761s | [llms.txt](https://www.segmind.com/models/video-tryon/llms.txt) | | Video Watermark Remover | video-watermark-remover | Remove watermarks from any video instantly with AI. | Image-to-Video Generation | $0.8275678268292683 | 194.82s | [llms.txt](https://www.segmind.com/models/video-watermark-remover/llms.txt) | | Vidu Q1 Reference to Video | vidu-q1-reference-to-video | Vidu AI reference to video transforms text and images into dynamic, high-quality videos effortlessly. | Image-to-Video Generation | $0.5 | 120.68598s | [llms.txt](https://www.segmind.com/models/vidu-q1-reference-to-video/llms.txt) | | Vidu Template | vidu-template | Transform static images into captivating videos using diverse motion templates effortlessly. | Image-to-Video Generation | $0.0625 | 125.76909s | [llms.txt](https://www.segmind.com/models/vidu-template/llms.txt) | | Wan 2.1 480p image to video | wan2.1-i2v-480p | Create high-quality 480p videos with excellent visual quality and a broad spectrum of motion from static images. | Image-to-Video Generation | $0.5302190763528138 | 53.38121s | [llms.txt](https://www.segmind.com/models/wan2.1-i2v-480p/llms.txt) | | Wan 2.1 720p image to video | wan2.1-i2v-720p | Create high-quality 720p videos with excellent visual quality and a broad spectrum of motion from static images. | Image-to-Video Generation | $1.4477480769691784 | 148.71608s | [llms.txt](https://www.segmind.com/models/wan2.1-i2v-720p/llms.txt) | | Wan 2.2 Image to Video Fast | wan-2.2-i2v-fast | Transforms simple text prompts into breathtaking cinematic-quality videos in minutes. 
| Image-to-Video Generation | $0.08774838350014805 | 52.92964s | [llms.txt](https://www.segmind.com/models/wan-2.2-i2v-fast/llms.txt) | | Wan 2.2 Image to Video Flash | wan-2.2-i2v-flash | Convert a single image into a coherent dynamic video. | Image-to-Video Generation | $0.1722461538461539 | 66.87629s | [llms.txt](https://www.segmind.com/models/wan-2.2-i2v-flash/llms.txt) | | Wan 2.5 Image to Video | wan-2.5-i2v | Wan2.5-Preview creates stunning, high-resolution videos with flawless audio synchronization from multiple inputs. | Image-to-Video Generation | $0.8588397896076353 | 177.58862s | [llms.txt](https://www.segmind.com/models/wan-2.5-i2v/llms.txt) | | Wan 2.6 Image To Video | wan-2.6-i2v | Transform images into high-quality videos with audio sync. | Image-to-Video Generation | $1.1835867724233984 | 137.1875s | [llms.txt](https://www.segmind.com/models/wan-2.6-i2v/llms.txt) | | Wan 2.6 Text To Video | wan-2.6-t2v | Cinematic videos with synchronized audio from text prompts. | Image-to-Video Generation | $1.0112359550561798 | 180.35542s | [llms.txt](https://www.segmind.com/models/wan-2.6-t2v/llms.txt) | | Wan 2.7 Image to Video | wan2.7-i2v | Animate any image into cinematic 1080P video with audio. | Image-to-Video Generation | $0.703125 | 483.81044s | [llms.txt](https://www.segmind.com/models/wan2.7-i2v/llms.txt) | | Wan 2.7 Reference to Video | wan2.7-r2v | Character-consistent multi-subject videos from reference images. | Image-to-Video Generation | $0.703125 | 490.86746s | [llms.txt](https://www.segmind.com/models/wan2.7-r2v/llms.txt) | | Wan Animate | wan-animate | Animate characters and replace video subjects seamlessly. | Image-to-Video Generation | $1.5690977389090908 | 412.48196s | [llms.txt](https://www.segmind.com/models/wan-animate/llms.txt) | | Wan Scail | scail | Professional character animations from reference images. 
| Image-to-Video Generation | $1.9292771716129034 | 429.39176s | [llms.txt](https://www.segmind.com/models/scail/llms.txt) |
| Wan Video Effects | video-effects | Transform your videos with diverse video effects. Start creating captivating videos today. | Image-to-Video Generation | $0.5248923280952382 | 126.12837s | [llms.txt](https://www.segmind.com/models/video-effects/llms.txt) |
| Warmth of Jesus | warmth-of-jesus | Experience the viral "Warmth of Jesus" effect on PixVerse! Transform your images into heartwarming videos of Jesus embra | Image-to-Video Generation | $0.3907894736842105 | 50.30539s | [llms.txt](https://www.segmind.com/models/warmth-of-jesus/llms.txt) |

## videoToVideo

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Bria Increase Video Resolution | bria-increase-video-resolution | Transform your videos with AI-powered upscaling and seamless background removal for professional quality. | videoToVideo | $1.0636303030303031 | 192.3339s | [llms.txt](https://www.segmind.com/models/bria-increase-video-resolution/llms.txt) |
| Bria Remove Video Background | bria-remove-video-background | Bria Video AI enhances videos up to 8K while seamlessly removing backgrounds for professional quality content. | videoToVideo | $2.150271232876712 | 41.1538s | [llms.txt](https://www.segmind.com/models/bria-remove-video-background/llms.txt) |
| Bria Video Eraser | bria-erase-video | Remove unwanted objects from videos while preserving audio. | videoToVideo | $0.3192 | 120.7891s | [llms.txt](https://www.segmind.com/models/bria-erase-video/llms.txt) |
| Esrgan Video Upscaler | esrgan-video-upscaler | ESRGAN Video Upscaler: Experience sharper, clearer 4k videos with ESRGAN.
This AI-powered video upscaler boosts resoluti | videoToVideo | $0.3217835050269301 | 157.28935s | [llms.txt](https://www.segmind.com/models/esrgan-video-upscaler/llms.txt) | | FlashVSR | flashvsr | Real-time video quality enhancement for high-resolution content. | videoToVideo | $1.1762287625 | 163.0629s | [llms.txt](https://www.segmind.com/models/flashvsr/llms.txt) | | Heygen Video Translate | heygen-video-translate | Translate videos to multiple languages with natural lip-sync. | videoToVideo | $0.48275438235294116 | 169.60772s | [llms.txt](https://www.segmind.com/models/heygen-video-translate/llms.txt) | | Kling 2.6 Pro Motion Control | kling-2.6-pro-motion-control | Transfer motion from videos to animate custom characters. | videoToVideo | $1.7247032258064519 | 595.48319s | [llms.txt](https://www.segmind.com/models/kling-2.6-pro-motion-control/llms.txt) | | Kling 2.6 Standard Motion Control | kling-2.6-standard-motion-control | Precise motion transfer from reference videos to characters. | videoToVideo | $0.9415 | 575.55155s | [llms.txt](https://www.segmind.com/models/kling-2.6-standard-motion-control/llms.txt) | | Kling O1 Video 2 Video Edit | kling-o1-video-to-video-edit | Edit any video with precise natural language commands. | videoToVideo | $1.2769852941176472 | 250.29267s | [llms.txt](https://www.segmind.com/models/kling-o1-video-to-video-edit/llms.txt) | | Kling O1 Video 2 Video Reference | kling-o1-video-to-video-reference | Video style transfer using reference character images. | videoToVideo | $1.423617391304348 | 239.7181s | [llms.txt](https://www.segmind.com/models/kling-o1-video-to-video-reference/llms.txt) | | Kling O3 Video To Video Edit | kling-o3-video2video-edit | Text-based video editor — swap backgrounds, characters, restyle scenes. 
| videoToVideo | $2.3126249999999997 | 202.43342s | [llms.txt](https://www.segmind.com/models/kling-o3-video2video-edit/llms.txt) | | Kling O3 Video To Video Reference | kling-o3-video2video-reference | Swap characters and restyle videos using reference images. | videoToVideo | $2.032916666666667 | 224.65349s | [llms.txt](https://www.segmind.com/models/kling-o3-video2video-reference/llms.txt) | | LTX Retake Video | ltx-retake-video | Precise segment-level video edits maintaining full scene continuity. | videoToVideo | $0.7911607142857141 | 38.5012s | [llms.txt](https://www.segmind.com/models/ltx-retake-video/llms.txt) | | Multi Video Merge | multi-video-merge | Merge multiple videos into a single combined output. | videoToVideo | $0.03265760629629629 | 98.33976s | [llms.txt](https://www.segmind.com/models/multi-video-merge/llms.txt) | | Pixverse Lipsync | pixverse-lipsync | PixVerse Lipsync expertly synchronizes lip movements to audio for flawless video content creation. | videoToVideo | $0.31236277056277056 | 112.43345s | [llms.txt](https://www.segmind.com/models/pixverse-lipsync/llms.txt) | | Runway Gen4 Aleph | runway-gen4-aleph | Runway Aleph revolutionizes video editing with intelligent automation for seamless object and environment manipulation. | videoToVideo | $1.125 | 171.53096s | [llms.txt](https://www.segmind.com/models/runway-gen4-aleph/llms.txt) | | Sam V2 Video | sam-v2-video | SAM v2 Video by Meta AI, allows promptable segmentation of objects in videos. | videoToVideo | $0.05689024780269058 | 37.56451s | [llms.txt](https://www.segmind.com/models/sam-v2-video/llms.txt) | | Sam3 Video | sam3-video | Real-time video segmentation and multi-object tracking. | videoToVideo | $0.13449561599999998 | 117.44927s | [llms.txt](https://www.segmind.com/models/sam3-video/llms.txt) | | Sync.so Lipsync 2 Pro | sync.so-lipsync-2-pro | Lipsync-2-Pro seamlessly synchronizes lips in videos for instant, high-quality multilingual content creation. 
| videoToVideo | $1.0556461855670103 | 234.90347s | [llms.txt](https://www.segmind.com/models/sync.so-lipsync-2-pro/llms.txt) | | Sync.so React 1 | sync.so-react-1 | Edit video actors' emotions with realistic re-expression. | videoToVideo | $1.9782141935483872 | 347.64718s | [llms.txt](https://www.segmind.com/models/sync.so-react-1/llms.txt) | | Topaz Labs Video Upscale | topaz-video-upscale | Topaz Video AI upscales, enhances, denoises, stabilizes, and increases frame rates in video footage, transforming low-qu | videoToVideo | $1.6011588831683166 | 212.58892s | [llms.txt](https://www.segmind.com/models/topaz-video-upscale/llms.txt) | | Video Audio Merge | video-audio-merge | Effortlessly merge audio and video with our intuitive Video Audio Merge model. Create stunning multimedia content with p | videoToVideo | $0.0017616686451116241 | 20.1118s | [llms.txt](https://www.segmind.com/models/video-audio-merge/llms.txt) | | Video Captioner | video-captioner | With Video Captioner create accurate, customizable subtitles for your videos effortlessly. | videoToVideo | $0.03830279902865469 | 68.54137s | [llms.txt](https://www.segmind.com/models/video-captioner/llms.txt) | | Video Concatenate | video-concatenate | Merge videos with custom layouts, spacing, and audio. | videoToVideo | $0.0007790461538461538 | 31.6725s | [llms.txt](https://www.segmind.com/models/video-concatenate/llms.txt) | | Video Loop | video-loop | Effortlessly loop videos for engaging social media & storytelling with our Video Loop. | videoToVideo | $0.0009503033488372096 | 8.14397s | [llms.txt](https://www.segmind.com/models/video-loop/llms.txt) | | Wan 2.7 Video Editing | wan2.7-videoedit | Edit existing videos precisely using natural language text instructions. 
| videoToVideo | $0.6433823529411765 | 362.14922s | [llms.txt](https://www.segmind.com/models/wan2.7-videoedit/llms.txt) |

## Text-to-Video Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Cog Video X 5B | cog-video-5b-t2v | CogVideo is a groundbreaking AI model that turns text into high-quality videos. Create realistic scenes, animations, and | Text-to-Video Generation | $0.3555572362200434 | 229.23271s | [llms.txt](https://www.segmind.com/models/cog-video-5b-t2v/llms.txt) |
| Google Veo 2 | veo-2 | Create stunning, realistic videos with Veo 2, Google's state-of-the-art AI video generation model. Experience enhanced q | Text-to-Video Generation | $4.2576142889376225 | 39.87097s | [llms.txt](https://www.segmind.com/models/veo-2/llms.txt) |
| Google Veo 3 | veo-3 | Veo 3 revolutionizes video creation with advanced text-to-video generation and realistic audio synthesis for cinematic c | Text-to-Video Generation | $5.188406933524204 | 144.36327s | [llms.txt](https://www.segmind.com/models/veo-3/llms.txt) |
| Hunyuan Video | hunyuan-video | Hunyuan AI Video is a new, state of the art, AI Video Generator that creates high-quality videos from text descriptions. | Text-to-Video Generation | $1.4642586306485041 | 211.32017s | [llms.txt](https://www.segmind.com/models/hunyuan-video/llms.txt) |
| Kling 3.0 Pro Text-to-Video | kling-3-pro-text2video | Cinematic 1080p videos with realistic audio from text. | Text-to-Video Generation | $3.3089230769230773 | 297.22698s | [llms.txt](https://www.segmind.com/models/kling-3-pro-text2video/llms.txt) |
| Kling 3.0 Standard Text-to-Video | kling-3-standard-text2video | Stunning 1080p cinematic videos from simple text prompts.
| Text-to-Video Generation | $1.7481509433962263 | 145.32821s | [llms.txt](https://www.segmind.com/models/kling-3-standard-text2video/llms.txt) | | Kling AI 1.6 Text to Video | kling-1.6-text2video | Kling AI 1.6 Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create profess | Text-to-Video Generation | $0.6886923076923093 | 304.07888s | [llms.txt](https://www.segmind.com/models/kling-1.6-text2video/llms.txt) | | Kling AI Text to Video | kling-text2video | Kling AI Text-to-Video is a cutting-edge AI tool that transforms text into stunning, lifelike videos. Create professiona | Text-to-Video Generation | $0.43168958742632646 | 320.23594s | [llms.txt](https://www.segmind.com/models/kling-text2video/llms.txt) | | Kling O3 Text-to-Video | kling-o3-text2video | 15-second cinematic AI videos with native audio. | Text-to-Video Generation | $2.342307692307692 | 153.52772s | [llms.txt](https://www.segmind.com/models/kling-o3-text2video/llms.txt) | | LTX-2-19B I2V | ltx-2-19b-i2v | Synchronized 4K audio-video generation from images, fast. | Text-to-Video Generation | $0.4551744523809522 | 113.17079s | [llms.txt](https://www.segmind.com/models/ltx-2-19b-i2v/llms.txt) | | LTX-2-19B T2V | ltx-2-19b-t2v | Synchronized video and audio from text, multiple input types. | Text-to-Video Generation | $0.40987834111111116 | 100.94594s | [llms.txt](https://www.segmind.com/models/ltx-2-19b-t2v/llms.txt) | | Luma Ray Text to Video | luma-ray-txt-2-video | Luma Ray2 text-to-video creates realistic, coherent videos from your text prompts. | Text-to-Video Generation | $1.599999999999998 | 114.21931s | [llms.txt](https://www.segmind.com/models/luma-ray-txt-2-video/llms.txt) | | Luma Text-to-Video | luma-txt-2-video | Luma Video (Text to Video) is an advanced AI model that turns text prompts into captivating videos. 
Designed for creator | Text-to-Video Generation | $0.9470967741935508 | 62.45225s | [llms.txt](https://www.segmind.com/models/luma-txt-2-video/llms.txt) |
| Minimax AI Director | minimax-ai-director | Minimax video-01-director: Create high-quality videos and control camera movements precisely using text prompts. | Text-to-Video Generation | $0.625 | 154.48488s | [llms.txt](https://www.segmind.com/models/minimax-ai-director/llms.txt) |
| Mochi 1 | mochi-1 | Mochi 1 is a cutting-edge, open-source AI model that transforms text prompts into stunning, high-fidelity videos. Create | Text-to-Video Generation | $0.26644555832420574 | 180.10768s | [llms.txt](https://www.segmind.com/models/mochi-1/llms.txt) |
| Pixverse Text to Video | pixverse-text2video | Effortlessly create captivating videos from text with Pixverse text to video AI! Customize style, duration, and more. | Text-to-Video Generation | $0.4281746031746032 | 44.22598s | [llms.txt](https://www.segmind.com/models/pixverse-text2video/llms.txt) |
| Seedance 1.0 lite t2v | seedance-v1-lite-text-to-video | Seedance V1 Lite transforms text into high-quality videos, streamlining content creation for diverse applications. | Text-to-Video Generation | $0.19834662576687118 | 46.71624s | [llms.txt](https://www.segmind.com/models/seedance-v1-lite-text-to-video/llms.txt) |
| Veo 3 Fast | veo-3-fast | Veo 3 Fast rapidly creates high-quality, 8-second videos with synchronized audio for diverse content needs. | Text-to-Video Generation | $1.6233309404163676 | 80.4446s | [llms.txt](https://www.segmind.com/models/veo-3-fast/llms.txt) |
| Wan 2.2 Text to Video Fast | wan-2.2-t2v-fast | Wan2.2 transforms text and images into high-quality video clips with cinematic flair.
| Text-to-Video Generation | $0.09856787724935734 | 96.20325s | [llms.txt](https://www.segmind.com/models/wan-2.2-t2v-fast/llms.txt) |
| Wan 2.5 Text to Video | wan-2.5-t2v | Wan2.5-Preview generates synchronized multimedia content, merging text, image, video, and audio seamlessly. | Text-to-Video Generation | $0.801146643572621 | 213.47998s | [llms.txt](https://www.segmind.com/models/wan-2.5-t2v/llms.txt) |
| Wan 2.7 Text to Video | wan2.7-t2v | 1080P cinematic videos with audio sync and multi-shot control. | Text-to-Video Generation | $0.7589285714285714 | 302.8468s | [llms.txt](https://www.segmind.com/models/wan2.7-t2v/llms.txt) |
| Wan_2.1 Text to Video | wan2.1-t2v | Create visually impressive videos with varied, lifelike motion from text prompts using Wan2.1. | Text-to-Video Generation | $0.8552705145173745 | 104.81362s | [llms.txt](https://www.segmind.com/models/wan2.1-t2v/llms.txt) |

## Text-to-Image Generation

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Background Eraser | background-eraser | Background Eraser helps in flawless background removal with exceptional accuracy. | Text-to-Image Generation | $0.0006151452871512006 | 0.79262s | [llms.txt](https://www.segmind.com/models/background-eraser/llms.txt) |
| Bria 3.2 Text to Image | bria-text-to-image | Bria 3.2 AI transforms natural language into stunning visuals for diverse creative applications — with Base, Fast, and H | Text-to-Image Generation | $0.03890776699029126 | 21.90631s | [llms.txt](https://www.segmind.com/models/bria-text-to-image/llms.txt) |
| Bria Vector Graphics | bria-text-to-vector-graphics | Bria Vision enables high-quality text-to-image and text-to-vector graphic generation for versatile commercial use.
| Text-to-Image Generation | $0.03927536231884058 | 17.91585s | [llms.txt](https://www.segmind.com/models/bria-text-to-vector-graphics/llms.txt) | | Chroma | chroma | Chroma is an open-source, 8.9B parameter text-to-image model (based on FLUX.1-schnell) designed for diverse and uncensor | Text-to-Image Generation | $0.05537459326497977 | 53.18714s | [llms.txt](https://www.segmind.com/models/chroma/llms.txt) | | Colossus Lightning SDXL | sdxl1.0-colossus-lightning | Colossus Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images i | Text-to-Image Generation | $0.007343366528711841 | 3.41171s | [llms.txt](https://www.segmind.com/models/sdxl1.0-colossus-lightning/llms.txt) | | Copax Timeless SDXL | sdxl1.0-timeless | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.010301487864269121 | 5.4494s | [llms.txt](https://www.segmind.com/models/sdxl1.0-timeless/llms.txt) | | Cyber Realistic | sd1.5-cyberrealistic | The most versatile photorealistic model that blends various models to achieve the amazing realistic images. | Text-to-Image Generation | $0.002858647598761479 | 1.49894s | [llms.txt](https://www.segmind.com/models/sd1.5-cyberrealistic/llms.txt) | | DreamShaper Lightning SDXL | sdxl1.0-dreamshaper-lightning | DreamShaper Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px image | Text-to-Image Generation | $0.006108172840899091 | 3.18294s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dreamshaper-lightning/llms.txt) | | Dreamshaper SDXL | sdxl1.0-dreamshaper | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. 
| Text-to-Image Generation | $0.011029928226258858 | 6.46014s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dreamshaper/llms.txt) | | Dynavis Lightning SDXL | sdxl1.0-dyanvis-lightning | Dynavis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.006759038098416354 | 3.81495s | [llms.txt](https://www.segmind.com/models/sdxl1.0-dyanvis-lightning/llms.txt) | | Edge of Realism | sd1.5-edgeofrealism | This model corresponds to the Stable Diffusion Edge of Realism checkpoint for detailed images at the cost of a super det | Text-to-Image Generation | $0.0033575167818571824 | 1.62471s | [llms.txt](https://www.segmind.com/models/sd1.5-edgeofrealism/llms.txt) | | Epic Realism | sd1.5-epicrealism | This model corresponds to the Stable Diffusion Epic Realism checkpoint for detailed images at the cost of a super detail | Text-to-Image Generation | $0.0035588267762241533 | 1.66889s | [llms.txt](https://www.segmind.com/models/sd1.5-epicrealism/llms.txt) | | Fast Flux.1 Schnell | fast-flux-schnell | Fast Flux.1 Schnell by Segmind is an optimized text-to-image model designed for developers needing faster image generati | Text-to-Image Generation | $0.005469704305806988 | 2.45278s | [llms.txt](https://www.segmind.com/models/fast-flux-schnell/llms.txt) | | Flux .1 Pro | flux-pro | Flux Pro is a state-of-the-art image generation with top of the line prompt following, visual quality, image detail and | Text-to-Image Generation | $0.06720196622390362 | 20.40943s | [llms.txt](https://www.segmind.com/models/flux-pro/llms.txt) | | Flux Dev Finetuned | flux-dev-finetuned | Flux is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.03397461959219858 | 22.17098s | [llms.txt](https://www.segmind.com/models/flux-dev-finetuned/llms.txt) | | Flux Realism Lora with Upscale | flux-realism-lora | Flux Realism Lora with 
upscale, developed by XLabs AI is a cutting-edge model designed to generate realistic images fro | Text-to-Image Generation | $0.05401312950388659 | 38.20368s | [llms.txt](https://www.segmind.com/models/flux-realism-lora/llms.txt) | | Flux-1.1 Pro Ultra | flux-1.1-pro-ultra | Create stunning visuals effortlessly with Flux 1.1 Pro Ultra. Experience unparalleled image quality and speed. | Text-to-Image Generation | $0.07491933421374558 | 13.97401s | [llms.txt](https://www.segmind.com/models/flux-1.1-pro-ultra/llms.txt) | | flux-pro-1.1 | flux-1.1-pro | Flux Pro 1.1 is a cutting-edge image generation tool offering exceptional speed, quality, and customization. Ideal for d | Text-to-Image Generation | $0.049941616963406175 | 14.08806s | [llms.txt](https://www.segmind.com/models/flux-1.1-pro/llms.txt) | | Flux.1 Dev | flux-dev | Flux Dev is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.019617503960528127 | 20.42006s | [llms.txt](https://www.segmind.com/models/flux-dev/llms.txt) | | Flux.1 Schnell | flux-schnell | Flux Schnell  is a state-of-the-art text-to-image generation model engineered for speed and efficiency. | Text-to-Image Generation | $0.007854434602951205 | 10.14296s | [llms.txt](https://www.segmind.com/models/flux-schnell/llms.txt) | | GPT Image 1 | gpt-image-1 | Create high-quality AI-generated images from text prompts using OpenAI's GPT Image 1 model. Ideal for product design, co | Text-to-Image Generation | $0.17882895119128733 | 48.8478s | [llms.txt](https://www.segmind.com/models/gpt-image-1/llms.txt) | | GPT Image 1 Mini | gpt-image-1-mini | High-quality image generation from text, fast and affordable. | Text-to-Image Generation | $0.03726354359483614 | 42.19341s | [llms.txt](https://www.segmind.com/models/gpt-image-1-mini/llms.txt) | | GPT Image 1.5 | gpt-image-1.5 | Stunning photorealistic images with exceptional instruction-following. 
| Text-to-Image Generation | $0.16893354443934527 | 38.40591s | [llms.txt](https://www.segmind.com/models/gpt-image-1.5/llms.txt) | | Ideogram 2a Text To Image | ideogram-2a-txt-2-img | Create captivating designs, realistic images & innovative logos with Ideogram 2a text-to-image. | Text-to-Image Generation | $0.04999999999999993 | 12.37334s | [llms.txt](https://www.segmind.com/models/ideogram-2a-txt-2-img/llms.txt) | | Ideogram 3.0 | ideogram-3 | Ideogram 3.0 revolutionizes content creation with photorealistic text-to-image generation and diverse aesthetic styles. | Text-to-Image Generation | $0.06539005269245597 | 10.29323s | [llms.txt](https://www.segmind.com/models/ideogram-3/llms.txt) | | Ideogram Text To Image | ideogram-txt-2-img | Ideogram Text to Image: Turn your ideas into stunning visuals instantly with this powerful AI tool. Create captivating d | Text-to-Image Generation | $0.09999999999999962 | 21.63901s | [llms.txt](https://www.segmind.com/models/ideogram-txt-2-img/llms.txt) | | Ideogram Turbo Text To Image | ideogram-turbo-txt-2-img | Create stunning images in seconds with Ideogram Turbo Text to Image. Fast AI model for quick ideation & text rendering. | Text-to-Image Generation | $0.06299999999999999 | 12.83101s | [llms.txt](https://www.segmind.com/models/ideogram-turbo-txt-2-img/llms.txt) | | Imagen 3 | imagen | Imagen 3 is Google DeepMind's highest quality text-to-image model. 
Generates detailed images with enhanced lighting, div | Text-to-Image Generation | $0.060000000000000074 | 8.15673s | [llms.txt](https://www.segmind.com/models/imagen/llms.txt) | | Imagen 4 | imagen-4 | Imagen 4 is Google’s most advanced AI image generation model, creating detailed, photorealistic or abstract images from | Text-to-Image Generation | $0.059999999999999915 | 11.36461s | [llms.txt](https://www.segmind.com/models/imagen-4/llms.txt) | | Juggernaut Final | sd1.5-juggernaut | The most versatile photorealistic model that blends various models to achieve the amazing realistic images. | Text-to-Image Generation | $0.0030143181960931693 | 1.68559s | [llms.txt](https://www.segmind.com/models/sd1.5-juggernaut/llms.txt) | | Juggernaut Lightning Flux | juggernaut-lightning-flux | Juggernaut Lightning Flux: Blazing fast (<300ms!) & powerful inference with enhanced visuals. | Text-to-Image Generation | $0.009213980160017302 | 6.07418s | [llms.txt](https://www.segmind.com/models/juggernaut-lightning-flux/llms.txt) | | Juggernaut Lightning SDXL | sdxl1.0-juggernaut-lightning | Juggernaut Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images | Text-to-Image Generation | $0.0041748787772685056 | 3.19259s | [llms.txt](https://www.segmind.com/models/sdxl1.0-juggernaut-lightning/llms.txt) | | Juggernaut Pro Flux | juggernaut-pro-flux | Juggernaut Pro FLUX: Create stunningly realistic AI images with unprecedented detail and sharpness. | Text-to-Image Generation | $0.012202491649676834 | 7.66323s | [llms.txt](https://www.segmind.com/models/juggernaut-pro-flux/llms.txt) | | Kling V3 Text to Image | kling-3-text2image | Photorealistic, print-ready images from text prompts. 
| Text-to-Image Generation | $0.035 | 50.47578s | [llms.txt](https://www.segmind.com/models/kling-3-text2image/llms.txt) | | Luma Photon Flash Text to Image | luma-photon-flash-txt-2-img | Luma Photon flash is a powerful and fast text-to-image model offering high-quality visuals with unmatched speed and prec | Text-to-Image Generation | $0.0024999999999999966 | 15.9032s | [llms.txt](https://www.segmind.com/models/luma-photon-flash-txt-2-img/llms.txt) | | Luma Photon Text to Image | luma-photon-txt-2-img | Luma Photon is a powerful AI-driven text-to-image model offering high-quality visuals with unmatched speed and precision | Text-to-Image Generation | $0.018749999999999996 | 18.87383s | [llms.txt](https://www.segmind.com/models/luma-photon-txt-2-img/llms.txt) | | Nano Banana | nano-banana | Gemini Image Editor preserves authentic subject identity while enabling seamless image editing and manipulation. | Text-to-Image Generation | $0.03635954763730072 | 14.27113s | [llms.txt](https://www.segmind.com/models/nano-banana/llms.txt) | | NewReality Lightning SDXL | sdxl1.0-newreality-lightning | NewReality Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images | Text-to-Image Generation | $0.006337203805577281 | 3.04087s | [llms.txt](https://www.segmind.com/models/sdxl1.0-newreality-lightning/llms.txt) | | NightVis Lightning SDXL | sdxl1.0-nightvis-lightning | NightVis Lightning SDXL is a lightning-fast text-to-image generation model. 
It can generate high-quality 1024px images i | Text-to-Image Generation | $0.007701533149756406 | 4.27571s | [llms.txt](https://www.segmind.com/models/sdxl1.0-nightvis-lightning/llms.txt) | | Playground V2.5 | playground-v2.5 | Playground V2.5 is a diffusion-based text-to-image generative model, designed to create highly aesthetic images based on | Text-to-Image Generation | $0.003721384771350553 | 4.01389s | [llms.txt](https://www.segmind.com/models/playground-v2.5/llms.txt) | | ProtoVision Lightning SDXL | sdxl1.0-protovis-lightning | ProtoVision Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px image | Text-to-Image Generation | $0.0064859676707384704 | 3.63305s | [llms.txt](https://www.segmind.com/models/sdxl1.0-protovis-lightning/llms.txt) | | Pruna P Image | p-image | High-quality text-to-image generation optimized for speed. | Text-to-Image Generation | $0.005 | 6.36064s | [llms.txt](https://www.segmind.com/models/p-image/llms.txt) | | Qwen Image | qwen-image | Qwen-Image revolutionizes image generation and editing with seamless multilingual text integration and photorealistic de | Text-to-Image Generation | $0.12119319123750963 | 28.39442s | [llms.txt](https://www.segmind.com/models/qwen-image/llms.txt) | | Qwen Image 2512 | qwen-image-2512 | Photorealistic image generation with precise text description following. | Text-to-Image Generation | $0.013920287425219943 | 19.16328s | [llms.txt](https://www.segmind.com/models/qwen-image-2512/llms.txt) | | Qwen Image Fast | qwen-image-fast | Qwen-Image expertly generates stunning images with complex text integration, especially for Chinese typography. | Text-to-Image Generation | $0.01795099815209666 | 5.53044s | [llms.txt](https://www.segmind.com/models/qwen-image-fast/llms.txt) | | RealDream Lightning | sdxl1.0-realdream-lightning | RealDream is a sophisticated image generation model utilizing SDXL Lightning architecture. 
It creates incredibly realist | Text-to-Image Generation | $0.002366460644945139 | 3.02763s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realdream-lightning/llms.txt) | | Realdream Pony V9 | sdxl1.0-realdream-pony-v9 | Real Dream Pony V9 is an advanced image generation model based on the Stable Diffusion XL (SDXL) architecture, excelling | Text-to-Image Generation | $0.007407028161413931 | 5.24506s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realdream-pony-v9/llms.txt) | | Realism Lightning SDXL | sdxl1.0-realism-lightning | Realism Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.007190902261951501 | 4.69687s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realism-lightning/llms.txt) | | Realistic Vision | sd1.5-realisticvision | This model corresponds to the Stable Diffusion Realistic Vision checkpoint for detailed images at the cost of a super de | Text-to-Image Generation | $0.002516299705881904 | 1.46519s | [llms.txt](https://www.segmind.com/models/sd1.5-realisticvision/llms.txt) | | Realvis Lightning SDXL | sdxl1.0-realvis-lightning | Realvis Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images in | Text-to-Image Generation | $0.005007959419236132 | 3.0495s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realvis-lightning/llms.txt) | | Realvis SDXL | sdxl1.0-realvis | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.01033161443048845 | 4.9386s | [llms.txt](https://www.segmind.com/models/sdxl1.0-realvis/llms.txt) | | Recraft V3 | recraft-v3 | Recraft V3, the latest iteration of Recraft AI, offers a significant advancement in AI-driven image generation. 
This sta | Text-to-Image Generation | $0.05000000000000011 | 15.08788s | [llms.txt](https://www.segmind.com/models/recraft-v3/llms.txt) | | Recraft V3 Svg | recraft-v3-svg | Recraft V3 SVG generates high-quality, customizable vector graphics with precision and ease. Perfect for logos, infograp | Text-to-Image Generation | $0.10000000000000009 | 17.66325s | [llms.txt](https://www.segmind.com/models/recraft-v3-svg/llms.txt) | | Reliberate | sd1.5-reliberate | This model corresponds to the Stable Diffusion Reliberate checkpoint for detailed images at the cost of a super detailed | Text-to-Image Generation | $0.003285052131707925 | 1.84194s | [llms.txt](https://www.segmind.com/models/sd1.5-reliberate/llms.txt) | | Samaritan 3D XL | sdxl1.0-samaritan-3d | Samaritan 3D XL leverages the robust capabilities of the SDXL framework, ensuring high-quality, detailed 3D character re | Text-to-Image Generation | $0.008573312925109997 | 4.13346s | [llms.txt](https://www.segmind.com/models/sdxl1.0-samaritan-3d/llms.txt) | | Samaritan Lightning SDXL | sdxl1.0-samaritan-lightning | Samaritan Lightning SDXL is a lightning-fast text-to-image generation model. It can generate high-quality 1024px images | Text-to-Image Generation | $0.007016612648451131 | 4.37849s | [llms.txt](https://www.segmind.com/models/sdxl1.0-samaritan-lightning/llms.txt) | | Seedream 3.0 t2i | seedream-v3-text-to-image | Seedream V3 generates high-resolution, bilingual images in seconds, enhancing creative workflows and marketing effective | Text-to-Image Generation | $0.0374999999999999 | 5.79861s | [llms.txt](https://www.segmind.com/models/seedream-v3-text-to-image/llms.txt) | | Seedream 5.0 Lite: Text-to-Image | seedream-v5-lite-text-to-image | Fast, affordable instruction-following image generation. 
| Text-to-Image Generation | $0.03499999999999999 | 36.43476s | [llms.txt](https://www.segmind.com/models/seedream-v5-lite-text-to-image/llms.txt) | | Segmind-Vega | segmind-vega | The Segmind-Vega Model is a distilled version of the Stable Diffusion XL (SDXL), offering a remarkable 70% reduction in | Text-to-Image Generation | $0.002249561540861523 | 2.35792s | [llms.txt](https://www.segmind.com/models/segmind-vega/llms.txt) | | Segmind-VegaRT | segmind-vega-rt-v1 | Segmind-VegaRT a distilled consistency adapter for Segmind-Vega that allows to reduce the number of inference steps to o | Text-to-Image Generation | $0.002000385751439731 | 1.68497s | [llms.txt](https://www.segmind.com/models/segmind-vega-rt-v1/llms.txt) | | Simple Vector Flux Lora | Simple_Vector_Flux | Flux is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions | Text-to-Image Generation | $0.0337997876486014 | 36.40649s | [llms.txt](https://www.segmind.com/models/Simple_Vector_Flux/llms.txt) | | SSD-1B | ssd-1b | SSD-1B efficiently generates high-quality, diverse images from text prompts in real-time. | Text-to-Image Generation | $0.004170655368377181 | 2.82059s | [llms.txt](https://www.segmind.com/models/ssd-1b/llms.txt) | | Stable Diffusion 3 Medium Text to Image | stable-diffusion-3-medium-txt2img | Stable Diffusion is a type of latent diffusion model that can generate images from text. 
It was created by a team of res | Text-to-Image Generation | $0.04100084691679047 | 7.37145s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3-medium-txt2img/llms.txt) | | Stable Diffusion 3.5 Large Text to Image | stable-diffusion-3.5-large-txt2img | Stable Diffusion 3.5 Large offers exceptional customizability, efficient performance on consumer hardware, and diverse i | Text-to-Image Generation | $0.013810415239805475 | 17.49587s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3.5-large-txt2img/llms.txt) | | Stable Diffusion 3.5 Turbo Text to Image | stable-diffusion-3.5-turbo-txt2img | Stable Diffusion 3.5 Turbo offers exceptional customizability, efficient performance on consumer hardware, and diverse i | Text-to-Image Generation | $0.003497373127186722 | 4.83746s | [llms.txt](https://www.segmind.com/models/stable-diffusion-3.5-turbo-txt2img/llms.txt) | | Stable Diffusion XL 1.0 | sdxl1.0-txt2img | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software | Text-to-Image Generation | $0.007976094449291276 | 6.26414s | [llms.txt](https://www.segmind.com/models/sdxl1.0-txt2img/llms.txt) | | Wan 2.7 Image Generation | wan2.7-image | 2K image generation with precise multilingual text rendering. | Text-to-Image Generation | $0.0375 | 21.52606s | [llms.txt](https://www.segmind.com/models/wan2.7-image/llms.txt) | | Wan 2.7 Image Generation Pro | wan2.7-image-pro | 4K images with chain-of-thought reasoning and multilingual text. | Text-to-Image Generation | $0.0375 | 39.03948s | [llms.txt](https://www.segmind.com/models/wan2.7-image-pro/llms.txt) | | WildCard Lightning SDXL | sdxl1.0-wildcard-lightning | WildCard Lightning SDXL is a lightning-fast text-to-image generation model. 
It can generate high-quality 1024px images i | Text-to-Image Generation | $0.005027039929891649 | 3.42103s | [llms.txt](https://www.segmind.com/models/sdxl1.0-wildcard-lightning/llms.txt) | | Yamer's Realistic SDXL | sdxl1.0-yamers-realistic | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.008998636289076828 | 5.70452s | [llms.txt](https://www.segmind.com/models/sdxl1.0-yamers-realistic/llms.txt) | | Z Image Turbo | z-image-turbo | Photorealistic images in under one second, bilingual text. | Text-to-Image Generation | $0.030655437032967026 | 6.58321s | [llms.txt](https://www.segmind.com/models/z-image-turbo/llms.txt) | | Zavychroma SDXL | sdxl1.0-zavychroma | The SDXL model is the official upgrade to the v1.5 model. The model is released as open-source software. | Text-to-Image Generation | $0.010168409796996965 | 6.03496s | [llms.txt](https://www.segmind.com/models/sdxl1.0-zavychroma/llms.txt) | ## Image-to-Image Transformation | Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs | | --- | --- | --- | --- | --- | --- | --- | | AI Product Photo Editor | ai-product-photo-editor | AI Product Photo Editor leverages advanced image-based ML techniques to generate high-quality product visuals using text | Image-to-Image Transformation | $0.022241418080724867 | 15.50389s | [llms.txt](https://www.segmind.com/models/ai-product-photo-editor/llms.txt) | | AI Product Photography | ai-product-photography | Elevate your product imagery with our AI-powered photography model. 
Create stunning, professional-quality photos that bo | Image-to-Image Transformation | $0.06473440478384125 | 11.65235s | [llms.txt](https://www.segmind.com/models/ai-product-photography/llms.txt) | | Aura Flow | aura-flow | Largest completely open sourced flow-based generation model that is capable of text-to-image generation | Image-to-Image Transformation | $0.11701278422174842 | 79.15338s | [llms.txt](https://www.segmind.com/models/aura-flow/llms.txt) | | Automatic Mask Generator | automatic-mask-generator | Automatic Mask Generator is a powerful tool that automates the creation of precise masks for inpainting | Image-to-Image Transformation | $0.0015574721135883614 | 1.70042s | [llms.txt](https://www.segmind.com/models/automatic-mask-generator/llms.txt) | | Background Removal | bg-removal | This model removes the background image from any image | Image-to-Image Transformation | $0.002097570303297566 | 1.65848s | [llms.txt](https://www.segmind.com/models/bg-removal/llms.txt) | | Background Removal V2 | bg-removal-v2 | This model removes the background image from any image | Image-to-Image Transformation | $0.0008893992875745101 | 0.69606s | [llms.txt](https://www.segmind.com/models/bg-removal-v2/llms.txt) | | Bria Blur Background | bria-blur-background | Bria AI Image Editing API v2 enables precise and context-aware image manipulation for stunning visual outcomes. | Image-to-Image Transformation | $0.053043478260869574 | 16.51626s | [llms.txt](https://www.segmind.com/models/bria-blur-background/llms.txt) | | Bria Enhance Image | bria-enhance-image | Bria AI creates precise, high-quality image enhancements and manipulations for diverse creative applications. | Image-to-Image Transformation | $0.0404426559356137 | 24.62666s | [llms.txt](https://www.segmind.com/models/bria-enhance-image/llms.txt) | | Bria Erase Foreground | bria-erase-foreground | Seamlessly removes foreground subjects and regenerates backgrounds for flawless image editing. 
| Image-to-Image Transformation | $0.042727272727272725 | 12.00002s | [llms.txt](https://www.segmind.com/models/bria-erase-foreground/llms.txt) | | Bria Eraser | bria-eraser | AI object removal with seamless context-aware inpainting. | Image-to-Image Transformation | $0.04176470588235294 | 14.03854s | [llms.txt](https://www.segmind.com/models/bria-eraser/llms.txt) | | Bria Expand Image | bria-expand-image | Bria Expand enables precise image manipulation and enhancement with generative AI, trained exclusively on licensed data | Image-to-Image Transformation | $0.039004149377593327 | 13.6012s | [llms.txt](https://www.segmind.com/models/bria-expand-image/llms.txt) | | Bria Generate Background | bria-replace-background | Transform images through advanced background editing and generative content creation for diverse applications. | Image-to-Image Transformation | $0.04142857142857145 | 18.72365s | [llms.txt](https://www.segmind.com/models/bria-replace-background/llms.txt) | | Bria Generative Fill | bria-gen-fill | Bria AI enables precise generative image editing for seamless creative enhancements and transformations. | Image-to-Image Transformation | $0.03787878787878789 | 18.44121s | [llms.txt](https://www.segmind.com/models/bria-gen-fill/llms.txt) | | Bria Increase Resolution | bria-increase-resolution | Seamlessly upscale and manipulate images while preserving the highest fidelity and safety standards. | Image-to-Image Transformation | $0.036757762991128026 | 12.51979s | [llms.txt](https://www.segmind.com/models/bria-increase-resolution/llms.txt) | | Bria Lifestyle Product Shot by Text | bria-lifestyle-shot-by-text | Transform isolated product images into dynamic lifestyle scenes with AI-driven contextual realism. 
| Image-to-Image Transformation | $0.03861635220125788 | 25.26646s | [llms.txt](https://www.segmind.com/models/bria-lifestyle-shot-by-text/llms.txt) | | Bria Product Cutout | bria-product-cutout | Automates precise product cutouts and background removal for professional eCommerce imagery at scale. | Image-to-Image Transformation | $0.04 | 10.92597s | [llms.txt](https://www.segmind.com/models/bria-product-cutout/llms.txt) | | Bria Product Packshot | bria-product-packshot | Transform product photos into professional, market-ready images with intelligent enhancements and background removal. | Image-to-Image Transformation | $0.0409375 | 16.5566s | [llms.txt](https://www.segmind.com/models/bria-product-packshot/llms.txt) | | Bria Product Shadow | bria-product-shadow | Bria Product Shadow enhances product images with realistic shadows for professional eCommerce presentations. | Image-to-Image Transformation | $0.03864197530864198 | 8.40804s | [llms.txt](https://www.segmind.com/models/bria-product-shadow/llms.txt) | | Bria Reimagine | bria-reimagine | Bria AI Reimagine transforms reference images into detailed, styled visuals with creative flexibility. 
| Image-to-Image Transformation | $0.04113924050632911 | 13.27925s | [llms.txt](https://www.segmind.com/models/bria-reimagine/llms.txt) | | Bria RMBG 2.0 | bria-remove-background | Effortlessly extract backgrounds with unmatched precision, powered by models trained exclusively on licensed data for sa | Image-to-Image Transformation | $0.017932757782101176 | 10.9207s | [llms.txt](https://www.segmind.com/models/bria-remove-background/llms.txt) | | Caricature Style | caricature-style | Transform everyday photos into lively, whimsical caricature illustrations that highlight individual features with playfu | Image-to-Image Transformation | $0.09814871371458372 | 53.12968s | [llms.txt](https://www.segmind.com/models/caricature-style/llms.txt) | | Clarity Upscaler | clarity-upscaler | High resolution creative image Upscaler and Enhancer. A free Magnific alternative. | Image-to-Image Transformation | $0.019290546262842614 | 18.27698s | [llms.txt](https://www.segmind.com/models/clarity-upscaler/llms.txt) | | ClarityAI Creative Upscaler | clarityai-creative-upscaler | Creative image upscaling with fine detail enhancement. | Image-to-Image Transformation | $0.40403422982885084 | 131.0447s | [llms.txt](https://www.segmind.com/models/clarityai-creative-upscaler/llms.txt) | | ClarityAI Crystal Upscaler | clarityai-crystal-upscaler | Upscale images up to 200x with enhanced detail and vibrancy. | Image-to-Image Transformation | $0.5203234265734266 | 34.75041s | [llms.txt](https://www.segmind.com/models/clarityai-crystal-upscaler/llms.txt) | | ClarityAI Flux Upscaler | clarityai-flux-upscaler | Transform low-resolution images into stunning high-quality visuals. | Image-to-Image Transformation | $1.6839788732394365 | 323.32283s | [llms.txt](https://www.segmind.com/models/clarityai-flux-upscaler/llms.txt) | | Codeformer | codeformer | CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces. 
| Image-to-Image Transformation | $0.01237333921297771 | 5.92044s | [llms.txt](https://www.segmind.com/models/codeformer/llms.txt) | | Consistent Character | consistent-character | Create images of a given character in different poses | Image-to-Image Transformation | $0.0845939208834292 | 61.36668s | [llms.txt](https://www.segmind.com/models/consistent-character/llms.txt) | | Consistent Character With Pose | consistent-character-with-pose | Create images of a given character in different poses | Image-to-Image Transformation | $0.02943020679334449 | 30.9992s | [llms.txt](https://www.segmind.com/models/consistent-character-with-pose/llms.txt) | | ControlNet Canny | sd1.5-controlnet-canny | This model corresponds to the ControlNet conditioned on Canny edges. | Image-to-Image Transformation | $0.003700848409745236 | 4.66892s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-canny/llms.txt) | | ControlNet Depth | sd1.5-controlnet-depth | This model corresponds to the ControlNet conditioned on Depth estimation. | Image-to-Image Transformation | $0.009929038383475731 | 12.63628s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-depth/llms.txt) | | ControlNet Openpose | sd1.5-controlnet-openpose | This model corresponds to the ControlNet conditioned on Human Pose Estimation. | Image-to-Image Transformation | $0.00611304704697142 | 10.00272s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-openpose/llms.txt) | | ControlNet Scribble | sd1.5-controlnet-scribble | This model corresponds to the ControlNet conditioned on Scribble images. | Image-to-Image Transformation | $0.0029691070185857158 | 3.80087s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-scribble/llms.txt) | | ControlNet Soft Edge | sd1.5-controlnet-softedge | This model corresponds to the ControlNet conditioned on Soft Edge. 
| Image-to-Image Transformation | $0.002842180637190888 | 3.42821s | [llms.txt](https://www.segmind.com/models/sd1.5-controlnet-softedge/llms.txt) | | ESRGAN | esrgan | ERGAN is an Image Super-Resolution (upscaler) model that enhances images with stunning, high-quality upscaling while pre | Image-to-Image Transformation | $0.0046643526179601736 | 4.92678s | [llms.txt](https://www.segmind.com/models/esrgan/llms.txt) | | Expression Editor | expression-editor | Expression Editor uses reference images to accurately generate new images with desired expressions. Perfect for digital | Image-to-Image Transformation | $0.0015538020119496036 | 2.31514s | [llms.txt](https://www.segmind.com/models/expression-editor/llms.txt) | | Face Detailer | face-detailer | Restore characters' faces to their original glory with Face Detailer. Enhance facial details, eliminate distortion, and | Image-to-Image Transformation | $0.01488621607186064 | 16.32891s | [llms.txt](https://www.segmind.com/models/face-detailer/llms.txt) | | face-to-many | face-to-many | Turn a face into 3D, emoji, pixel art, video game, claymation or toy | Image-to-Image Transformation | $0.024571864690289144 | 22.46034s | [llms.txt](https://www.segmind.com/models/face-to-many/llms.txt) | | face-to-sticker | face-to-sticker | Turn a face into a sticker | Image-to-Image Transformation | $0.0865603507661326 | 65.40722s | [llms.txt](https://www.segmind.com/models/face-to-sticker/llms.txt) | | Faceswap | sd2.1-faceswapper | Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. N | Image-to-Image Transformation | $0.02248596119238562 | 36.76486s | [llms.txt](https://www.segmind.com/models/sd2.1-faceswapper/llms.txt) | | Faceswap V2 | faceswap-v2 | Take a picture/gif and replace the face in it with a face of your choice. You only need one image of the desired face. 
N | Image-to-Image Transformation | $0.0042687928216279085 | 3.19248s | [llms.txt](https://www.segmind.com/models/faceswap-v2/llms.txt) | | Faceswap V3 | faceswap-v3 | Face Swap V3 is a cutting-edge tool that empowers you to seamlessly swap faces in images. With customizable features and | Image-to-Image Transformation | $0.005808887588289018 | 4.28326s | [llms.txt](https://www.segmind.com/models/faceswap-v3/llms.txt) | | Faceswap V3 Multifaceswap | faceswap-v3-multifaceswap | Faceswap V3 Multifaceswap enables realistic face swapping in images, preserving lighting and expressions for professiona | Image-to-Image Transformation | $0.0076014094998761925 | 6.60218s | [llms.txt](https://www.segmind.com/models/faceswap-v3-multifaceswap/llms.txt) | | Flux 2 Flex | flux-2-flex | Consistent-style photorealistic images using reference inputs. | Image-to-Image Transformation | $0.17055739514348783 | 47.96836s | [llms.txt](https://www.segmind.com/models/flux-2-flex/llms.txt) | | Flux 2 Max | flux-2-max | Photorealistic images with maximum consistency and fine detail. | Image-to-Image Transformation | $0.21973192019950127 | 53.66553s | [llms.txt](https://www.segmind.com/models/flux-2-max/llms.txt) | | Flux 2 Pro | flux-2-pro | High-quality photorealistic images with cross-output consistency. | Image-to-Image Transformation | $0.07247651457055213 | 26.09079s | [llms.txt](https://www.segmind.com/models/flux-2-pro/llms.txt) | | Flux Canny Dev | flux-canny-dev | Open-weight edge-guided image generation. Control structure and composition using Canny edge detection. | Image-to-Image Transformation | $0.03125 | 20.67341s | [llms.txt](https://www.segmind.com/models/flux-canny-dev/llms.txt) | | Flux Canny Pro | flux-canny-pro | Professional edge-guided image generation. 
Control structure and composition using Canny edge detection | Image-to-Image Transformation | $0.06248089457243265 | 24.32486s | [llms.txt](https://www.segmind.com/models/flux-canny-pro/llms.txt) | | Flux Controlnets | flux-controlnet | Flux ControlNets is a collection of models that gives you precise control over image generation. By integrating ControlN | Image-to-Image Transformation | $0.0468823807162098 | 48.66877s | [llms.txt](https://www.segmind.com/models/flux-controlnet/llms.txt) | | Flux Depth Dev | flux-depth-dev | Open-weight depth-aware image generation. Edit images while preserving spatial relationships. | Image-to-Image Transformation | $0.03125 | 16.03276s | [llms.txt](https://www.segmind.com/models/flux-depth-dev/llms.txt) | | Flux Depth Pro | flux-depth-pro | Professional depth-aware image generation. Edit images while preserving spatial relationships. | Image-to-Image Transformation | $0.062476545879212544 | 25.41634s | [llms.txt](https://www.segmind.com/models/flux-depth-pro/llms.txt) | | Flux Fill Dev | flux-fill-dev | Open-weight inpainting model for editing and extending images. Guidance-distilled from FLUX.1 Fill Dev | Image-to-Image Transformation | $0.04999999999999993 | 16.62558s | [llms.txt](https://www.segmind.com/models/flux-fill-dev/llms.txt) | | Flux Fill Pro | flux-fill-pro | Professional inpainting and outpainting model with state-of-the-art performance. Edit or extend images with natural, sea | Image-to-Image Transformation | $0.06233509590461379 | 23.56603s | [llms.txt](https://www.segmind.com/models/flux-fill-pro/llms.txt) | | Flux Inpaint | flux-inpaint | Flux Inpainting is a powerful image editing tool designed to effortlessly edit and enhance your images. 
It's perfect for | Image-to-Image Transformation | $0.02438738535413639 | 28.43018s | [llms.txt](https://www.segmind.com/models/flux-inpaint/llms.txt) | | Flux Ipadapter | flux-ipadapter | Flux IP Adapter is a cutting-edge AI model that lets you to create stunning, customized images. With its advanced style | Image-to-Image Transformation | $0.07432834362068963 | 76.08154s | [llms.txt](https://www.segmind.com/models/flux-ipadapter/llms.txt) | | Flux Kontext Max | flux-kontext-max | FLUX.1 Kontext [max] transforms textual descriptions into stunning, high-fidelity images with seamless typography integr | Image-to-Image Transformation | $0.10000000000000042 | 24.36245s | [llms.txt](https://www.segmind.com/models/flux-kontext-max/llms.txt) | | Flux Kontext Pro | flux-kontext-pro | FLUX.1 Kontext Pro transforms text prompts into high-quality, customized images with remarkable efficiency and precision | Image-to-Image Transformation | $0.04999999999999987 | 21.77541s | [llms.txt](https://www.segmind.com/models/flux-kontext-pro/llms.txt) | | Flux Krea Dev | flux-krea-dev | FLUX.1 Krea generates stunning, photorealistic images with fine-tuned aesthetic control for diverse creative application | Image-to-Image Transformation | $0.031943892432432446 | 24.82919s | [llms.txt](https://www.segmind.com/models/flux-krea-dev/llms.txt) | | Flux Pulid | flux-pulid | Flux PuLID: Customize AI-generated images with your unique identity. Seamlessly integrate faces into text-to-image model | Image-to-Image Transformation | $0.03616867535285016 | 13.083s | [llms.txt](https://www.segmind.com/models/flux-pulid/llms.txt) | | Flux Redux Dev | flux-redux-dev | Open-weight image variation model. Create new versions while preserving key elements of your original. 
| Image-to-Image Transformation | $0.03125 | 15.84067s | [llms.txt](https://www.segmind.com/models/flux-redux-dev/llms.txt) | | Flux Redux Schnell | flux-redux-schnell | Fast, efficient image variation model for rapid iteration and experimentation. | Image-to-Image Transformation | $0.00375 | 7.67963s | [llms.txt](https://www.segmind.com/models/flux-redux-schnell/llms.txt) | | Flux-2 Klein-4b | flux-2-klein-4b | Sub-second photorealistic image generation and editing. | Image-to-Image Transformation | $0.031852410947562096 | 10.65235s | [llms.txt](https://www.segmind.com/models/flux-2-klein-4b/llms.txt) | | Flux-2 Klein-9b | flux-2-klein-9b | Ultra-fast photorealistic image generation on consumer GPUs. | Image-to-Image Transformation | $0.04019102186915888 | 15.39635s | [llms.txt](https://www.segmind.com/models/flux-2-klein-9b/llms.txt) | | Flux.1 Image To Image | flux-img2img | Flux Image-To-Image model by Black Forest Labs is an advanced deep learning tool designed for transforming images based | Image-to-Image Transformation | $0.026470672997882975 | 25.00478s | [llms.txt](https://www.segmind.com/models/flux-img2img/llms.txt) | | FLUX.1 Kontext [dev] | flux-kontext-dev | FLUX.1 Kontext [dev] creates coherent and editable images by integrating text and visual cues for iterative design. | Image-to-Image Transformation | $0.04000159331124784 | 10.78987s | [llms.txt](https://www.segmind.com/models/flux-kontext-dev/llms.txt) | | Font Sheet Generator | font-sheet-generator | Transforms images into unique, custom font sets in minutes, revolutionizing typography design. | Image-to-Image Transformation | $0.09303416666666668 | 32.89264s | [llms.txt](https://www.segmind.com/models/font-sheet-generator/llms.txt) | | Fooocus | fooocus | Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney. 
| Image-to-Image Transformation | $0.061833064751358115 | 20.21113s | [llms.txt](https://www.segmind.com/models/fooocus/llms.txt) | | Fooocus Outpainting | focus-outpaint | Fooocus Outpainting transforms ordinary images into extraordinary works of art by seamlessly expanding their boundaries. | Image-to-Image Transformation | $0.024537408137992028 | 15.64931s | [llms.txt](https://www.segmind.com/models/focus-outpaint/llms.txt) | | GPT Image 1 Edit | gpt-image-1-edit | Edit and compose images using natural language with GPT Image 1 Edit, OpenAI’s powerful inpainting and multi-reference e | Image-to-Image Transformation | $0.14067419260677325 | 57.8164s | [llms.txt](https://www.segmind.com/models/gpt-image-1-edit/llms.txt) | | GPT Image 1 Edit Mini | gpt-image-1-edit-mini | Affordable text-driven image generation and editing. | Image-to-Image Transformation | $0.027387329232333505 | 42.33248s | [llms.txt](https://www.segmind.com/models/gpt-image-1-edit-mini/llms.txt) | | GPT Image 1.5 Edit | gpt-image-1.5-edit | Precise image editing via natural language instructions. | Image-to-Image Transformation | $0.20793580058224165 | 51.12257s | [llms.txt](https://www.segmind.com/models/gpt-image-1.5-edit/llms.txt) | | HiDream-I1 (Fast) | hidream-l1-fast | HiDream-I1 is a next-generation, open-source image generative foundation model designed for text-to-image synthesis, esp | Image-to-Image Transformation | $0.01521270973837209 | 10.42825s | [llms.txt](https://www.segmind.com/models/hidream-l1-fast/llms.txt) | | Higgsfield Text 2 Image Soul | higgsfield-text2image-soul | SOUL AI transforms text into stunning, customizable visuals with unparalleled style control and precision. 
| Image-to-Image Transformation | $0.2175372897492859 | 40.00115s | [llms.txt](https://www.segmind.com/models/higgsfield-text2image-soul/llms.txt) | | HyperSwap Image Faceswap by FaceFusion Labs | hyperswap-image-faceswap-by-facefusion-labs | High-quality face swapping built for real production workflows. | Image-to-Image Transformation | $0.09999999999999999 | 9.1342s | [llms.txt](https://www.segmind.com/models/hyperswap-image-faceswap-by-facefusion-labs/llms.txt) | | Ideogram 2a Image to Image | ideogram-2a-img-2-img | Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced | Image-to-Image Transformation | $0.05000000000000001 | 10.11784s | [llms.txt](https://www.segmind.com/models/ideogram-2a-img-2-img/llms.txt) | | Ideogram 3 Reframe | ideogram-3-reframe | Ideogram 3.0's Reframe effortlessly adapts images to diverse formats, enhancing visual content creation for any platform | Image-to-Image Transformation | $0.04159392672731744 | 11.31905s | [llms.txt](https://www.segmind.com/models/ideogram-3-reframe/llms.txt) | | Ideogram 3 Remix | ideogram-3-remix | Ideogram 3 Remix enables versatile image transformation, enhancing creativity through customizable design iterations. | Image-to-Image Transformation | $0.07462871287128711 | 10.54744s | [llms.txt](https://www.segmind.com/models/ideogram-3-remix/llms.txt) | | Ideogram 3 Replace Background | ideogram-3-replace-background | Effortlessly replace backgrounds in images, enhancing visual storytelling and creativity with precision and speed. | Image-to-Image Transformation | $0.09174859550561797 | 14.27969s | [llms.txt](https://www.segmind.com/models/ideogram-3-replace-background/llms.txt) | | Ideogram Character | ideogram-character | Achieve perfect character consistency across multiple generations from a single reference image. 
| Image-to-Image Transformation | $0.2326470318559557 | 19.98518s | [llms.txt](https://www.segmind.com/models/ideogram-character/llms.txt) | | Ideogram Image To Image | ideogram-img-2-img | Ideogram Image to Image: Transform your images with ease! Enhance, modify, or create entirely new visuals using advanced | Image-to-Image Transformation | $0.10000000000000017 | 12.67675s | [llms.txt](https://www.segmind.com/models/ideogram-img-2-img/llms.txt) | | Ideogram Reframe | ideogram-reframe | Transform your images with Ideogram Reframe! Easily reframe square images to your chosen resolution. | Image-to-Image Transformation | $0.09999999999999998 | 23.34343s | [llms.txt](https://www.segmind.com/models/ideogram-reframe/llms.txt) | | Ideogram Turbo Image To Image | ideogram-turbo-img-2-img | Transform images instantly with Ideogram Turbo Image to Image! Fast AI for quick edits & creative remixes. | Image-to-Image Transformation | $0.06300000000000003 | 11.0242s | [llms.txt](https://www.segmind.com/models/ideogram-turbo-img-2-img/llms.txt) | | IDM VTON | idm-vton | Best-in-class clothing virtual try on in the wild | Image-to-Image Transformation | $0.044285033715220946 | 10.72924s | [llms.txt](https://www.segmind.com/models/idm-vton/llms.txt) | | illusion-diffusion-hq | illusion-diffusion-hq | Monster Labs QrCode ControlNet on top of SD Realistic Vision v5.1 | Image-to-Image Transformation | $0.04254304087134279 | 60.12161s | [llms.txt](https://www.segmind.com/models/illusion-diffusion-hq/llms.txt) | | Image Superimpose | superimpose | Superimpose model lets you to create captivating visuals by seamlessly overlaying one image on top of another. It stream | Image-to-Image Transformation | $0.000516212389380531 | 0.64096s | [llms.txt](https://www.segmind.com/models/superimpose/llms.txt) | | Image Superimpose V2 | superimpose-v2 | Superimpose V2 elevates image editing! 
Seamlessly layer images with background removal, precise positioning, and flexibl | Image-to-Image Transformation | $0.0018924620997221814 | 2.24128s | [llms.txt](https://www.segmind.com/models/superimpose-v2/llms.txt) | | Infinite You | infinite-you | InfiniteYou generates high-fidelity portraits preserving identity while aligning with creative text prompts. | Image-to-Image Transformation | $0.18616816406779663 | 164.55799s | [llms.txt](https://www.segmind.com/models/infinite-you/llms.txt) | | Inpaint Mask Maker | inpaint-mask-maker | Real-Time Open-Vocabulary Object Detection | Image-to-Image Transformation | $0.004548485514777525 | 8.12156s | [llms.txt](https://www.segmind.com/models/inpaint-mask-maker/llms.txt) | | Insta Depth | insta-depth | InstantID aims to generate customized images with various poses or styles from only a single reference ID image while en | Image-to-Image Transformation | $0.052145482264241795 | 15.17449s | [llms.txt](https://www.segmind.com/models/insta-depth/llms.txt) | | InstantID | instantid | InstantID aims to generate customized images with various poses or styles from only a single reference ID image while en | Image-to-Image Transformation | $0.02513327221923347 | 7.73523s | [llms.txt](https://www.segmind.com/models/instantid/llms.txt) | | IP-adapter Canny XL | ip-sdxl-canny | IP Adapter XL Canny is built on the SDXL framework. This model integrates the IP Adapter and Canny edge preprocessor to | Image-to-Image Transformation | $0.011348650120825329 | 14.22441s | [llms.txt](https://www.segmind.com/models/ip-sdxl-canny/llms.txt) | | IP-adapter Depth XL | ip-sdxl-depth | IP Adapter Depth XL is built on the SDXL framework. This model integrates the IP Adapter and Depth preprocessor to offer | Image-to-Image Transformation | $0.015832852252854652 | 21.39794s | [llms.txt](https://www.segmind.com/models/ip-sdxl-depth/llms.txt) | | IP-adapter Openpose XL | ip-sdxl-openpose | IP Adapter XL Openpose is built on the SDXL framework. 
This model integrates the IP Adapter and Openpose preprocessor to | Image-to-Image Transformation | $0.013783720446760989 | 15.4171s | [llms.txt](https://www.segmind.com/models/ip-sdxl-openpose/llms.txt) | | IPAdapter Style Transfer | style-transfer | Style & Composition Transfer with Stable Diffusion IP Adapter | Image-to-Image Transformation | $0.025668324416977615 | 16.64547s | [llms.txt](https://www.segmind.com/models/style-transfer/llms.txt) | | Kling O1 | kling-o1 | Text-to-video creation with precise AI-driven motion control. | Image-to-Image Transformation | $0.035 | 59.52196s | [llms.txt](https://www.segmind.com/models/kling-o1/llms.txt) | | Kling V3 Image 2 Image | kling-3-image2image | Transform images into photorealistic, production-ready visuals. | Image-to-Image Transformation | $0.035 | 68.9175s | [llms.txt](https://www.segmind.com/models/kling-3-image2image/llms.txt) | | Kolors | kolors | Kolors is a cutting-edge text-to-image model that bridges language and visual art. Transform your textual ideas into pho | Image-to-Image Transformation | $0.09548960481352996 | 84.60946s | [llms.txt](https://www.segmind.com/models/kolors/llms.txt) | | Lifestyle Product Shot by Image | bria-lifestyle-shot-by-image | Transforms ordinary product images into stunning, marketing-ready visuals for eCommerce success. 
| Image-to-Image Transformation | $0.031020408163265307 | 20.98771s | [llms.txt](https://www.segmind.com/models/bria-lifestyle-shot-by-image/llms.txt) | | Magic Eraser | magic-eraser | LaMA Object Removal- AI Magic Eraser | Image-to-Image Transformation | $0.000159575151144529 | 0.78151s | [llms.txt](https://www.segmind.com/models/magic-eraser/llms.txt) | | material-transfer | material-transfer | Transfer a material from an image to a subject | Image-to-Image Transformation | $0.251970550235849 | 164.02658s | [llms.txt](https://www.segmind.com/models/material-transfer/llms.txt) | | Minimax-image-01 | image-01 | Generate high-fidelity images from text with precise control & stunning quality with Minimax Image-01. | Image-to-Image Transformation | $0.012520856467121588 | 36.23202s | [llms.txt](https://www.segmind.com/models/image-01/llms.txt) | | Multi Image Kontext Max | multi-image-kontext-max | FLUX.1 Kontext [max] creates stunning, photorealistic images from text prompts and input images seamlessly. | Image-to-Image Transformation | $0.08793129526854218 | 18.3156s | [llms.txt](https://www.segmind.com/models/multi-image-kontext-max/llms.txt) | | Multi Image Kontext Pro | multi-image-kontext-pro | Transform text into stunning, professional-grade images with precise editing capabilities. | Image-to-Image Transformation | $0.049999999999999975 | 22.90112s | [llms.txt](https://www.segmind.com/models/multi-image-kontext-pro/llms.txt) | | Nano Banana 2 | nano-banana-2 | Fast photorealistic images — ideal for marketing and ads. | Image-to-Image Transformation | $0.0923011318546232 | 39.30415s | [llms.txt](https://www.segmind.com/models/nano-banana-2/llms.txt) | | Nano Banana Pro | nano-banana-pro | High-fidelity images with accurate multilingual text rendering. 
| Image-to-Image Transformation | $0.16494746411651323 | 61.14607s | [llms.txt](https://www.segmind.com/models/nano-banana-pro/llms.txt) | | Nomos Image Upscaler 4k | nomos-upscaler | This upscaling model is ideal for enhancing amateur to professional photos, excelling with subjects like cats, hair, and | Image-to-Image Transformation | $0.01154986986276634 | 9.08879s | [llms.txt](https://www.segmind.com/models/nomos-upscaler/llms.txt) | | Omini Control | ominicontrol | OminiControl is an innovative framework that optimizes Diffusion Transformer models for versatile image generation tasks | Image-to-Image Transformation | $0.0041856582476405115 | 4.50113s | [llms.txt](https://www.segmind.com/models/ominicontrol/llms.txt) | | Omni Zero | omni-zero | Omni-Zero: A diffusion pipeline for zero-shot stylized portrait creation. | Image-to-Image Transformation | $0.17303785389324666 | 149.10339s | [llms.txt](https://www.segmind.com/models/omni-zero/llms.txt) | | Profile Photo Style Transfer | become-image | Turn any image of a face into artwork using Stable Diffusion Controlnet and IPAdapter | Image-to-Image Transformation | $0.09456781571354969 | 63.7612s | [llms.txt](https://www.segmind.com/models/become-image/llms.txt) | | Pruna P Image Edit | p-image-edit | Multi-image editing with AI-guided precision and control. | Image-to-Image Transformation | $0.010000000000000002 | 7.5032s | [llms.txt](https://www.segmind.com/models/p-image-edit/llms.txt) | | PuLID | pulid-base | Novel tuning-free ID customization method for text-to-image generation. | Image-to-Image Transformation | $0.20444191764255643 | 70.51177s | [llms.txt](https://www.segmind.com/models/pulid-base/llms.txt) | | Qwen Image Edit | qwen-image-edit | Transform images effortlessly through semantic context and pixel-perfect appearance changes. 
| Image-to-Image Transformation | $0.1993516979032258 | 48.88752s | [llms.txt](https://www.segmind.com/models/qwen-image-edit/llms.txt) | | Qwen Image Edit Fast | qwen-image-edit-fast | Qwen-Image-Edit enables precise bilingual image editing for seamless localization and professional content creation. | Image-to-Image Transformation | $0.0364297414955695 | 8.73196s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-fast/llms.txt) | | Qwen Image Edit Plus | qwen-image-edit-plus | Multi-image editing with precise text-guided transformations. | Image-to-Image Transformation | $0.03525215415395049 | 13.64518s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus/llms.txt) | | Qwen Image Edit Plus Add People Lora | qwen-image-edit-plus-add-people | Generate realistic multi-character scenes with natural interactions. | Image-to-Image Transformation | $0.09862558602150537 | 23.00519s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-add-people/llms.txt) | | Qwen Image Edit Plus Blend It | qwen-image-edit-plus-blend-it | Product placement into backgrounds with precise lighting match. | Image-to-Image Transformation | $0.08165602641509435 | 18.06678s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-blend-it/llms.txt) | | Qwen Image Edit Plus Eigen Banana | qwen-image-edit-plus-eigen-banana | Precise text-guided image transformation and creative editing. | Image-to-Image Transformation | $0.08899248170391061 | 21.66085s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-eigen-banana/llms.txt) | | Qwen Image Edit Plus Eraser | qwen-image-edit-plus-eraser | Remove unwanted objects while preserving realistic backgrounds. | Image-to-Image Transformation | $0.07835796075085324 | 19.94357s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-eraser/llms.txt) | | Qwen Image Edit Plus Face To Portrait | qwen-image-edit-plus-face-to-portrait | Cropped face into full identity-preserving portrait photo. 
| Image-to-Image Transformation | $0.07428876108786611 | 17.85171s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-face-to-portrait/llms.txt) | | Qwen Image Edit Plus Group Photo | qwen-image-edit-plus-group-photo | Merge individual portraits into realistic group photos. | Image-to-Image Transformation | $0.10533954509803922 | 23.72663s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-group-photo/llms.txt) | | Qwen Image Edit Plus Multi Lora | qwen-image-edit-plus-multi-lora | Multi-image editing with superior identity and style control. | Image-to-Image Transformation | $0.0861332302359882 | 20.40219s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-multi-lora/llms.txt) | | Qwen Image Edit Plus Multiple Angles | qwen-image-edit-plus-multiple-angle | Transform image perspective with natural language prompts. | Image-to-Image Transformation | $0.0958050966442953 | 22.53001s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-multiple-angle/llms.txt) | | Qwen Image Edit Plus Next Scene | qwen-image-edit-plus-next-scene | Create cinematic sequences with seamless visual continuity. | Image-to-Image Transformation | $0.09455228241469817 | 22.30941s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-next-scene/llms.txt) | | Qwen Image Edit Plus Product Photography | qwen-image-edit-plus-product-photography | Transform white-background products into immersive lifestyle scenes. | Image-to-Image Transformation | $0.08908184624145789 | 20.64443s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-product-photography/llms.txt) | | Qwen Image Edit Plus Relight | qwen-image-edit-plus-relight | Advanced image relighting using natural language prompts. 
| Image-to-Image Transformation | $0.10313037307692309 | 21.31752s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-relight/llms.txt) | | Qwen Image Edit Plus Remove Lighting | qwen-image-edit-plus-remove-lighting | Remove artificial lighting effects and restore natural tones. | Image-to-Image Transformation | $0.07965391666666666 | 18.8681s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-remove-lighting/llms.txt) | | Qwen Image Edit Plus Texture Apply | qwen-image-edit-plus-texture-apply | Apply precise textures to images using natural language. | Image-to-Image Transformation | $0.0997037090909091 | 23.41138s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-texture-apply/llms.txt) | | Qwen Image Edit Plus Texture Extract | qwen-image-edit-plus-texture-extract | Extract seamless, tileable textures from photographs. | Image-to-Image Transformation | $0.10225743999999999 | 23.38367s | [llms.txt](https://www.segmind.com/models/qwen-image-edit-plus-texture-extract/llms.txt) | | Relighting | ic-light | Prompts to auto-magically relight your images. | Image-to-Image Transformation | $0.036346569891959085 | 30.6633s | [llms.txt](https://www.segmind.com/models/ic-light/llms.txt) | | Runway Gen 4 Image | runway-gen4-image | Runway's Gen-4 Image API enables precise, multimodal image generation for innovative creative and technical applications | Image-to-Image Transformation | $0.1 | 30.6423s | [llms.txt](https://www.segmind.com/models/runway-gen4-image/llms.txt) | | Sam V2 Image | sam-v2-image | SAM v2, the next-gen segmentation model from Meta AI, revolutionizes computer vision. Building on SAM's success, it exce | Image-to-Image Transformation | $0.00174482463091723 | 1.6846s | [llms.txt](https://www.segmind.com/models/sam-v2-image/llms.txt) | | Sam3 Image | sam3-image | Precise object segmentation and tracking in images. 
| Image-to-Image Transformation | $0.006994602857424377 | 4.80757s | [llms.txt](https://www.segmind.com/models/sam3-image/llms.txt) | | SD Outpainting | sd1.5-outpaint | Stable Diffusion Outpainting can extend any image in any direction | Image-to-Image Transformation | $0.01091418242906789 | 4.34742s | [llms.txt](https://www.segmind.com/models/sd1.5-outpaint/llms.txt) | | SD3 Medium Canny Controlnet | sd3-med-canny | Stable Diffusion 3 (SD3) Medium Canny ControlNet uses Canny edge detection to provide fine-grained control over the gene | Image-to-Image Transformation | $0.006618628908091122 | 8.78311s | [llms.txt](https://www.segmind.com/models/sd3-med-canny/llms.txt) | | SD3 Medium Pose Controlnet | sd3-med-pose | Stable Diffusion 3 (SD3) Pose ControlNet is a large generative image model tailored for generating images based on text | Image-to-Image Transformation | $0.01660999798816568 | 18.18351s | [llms.txt](https://www.segmind.com/models/sd3-med-pose/llms.txt) | | SD3 Medium Tile Controlnet | sd3-med-tile | SD3 Medium Tile ControlNet is a large generative image model designed for generating detailed images based on textual pr | Image-to-Image Transformation | $0.0076904571630204656 | 8.94072s | [llms.txt](https://www.segmind.com/models/sd3-med-tile/llms.txt) | | SDXL Controlnet | sdxl-controlnet | SDXL ControlNet gives unprecedented control over text-to-image generation. SDXL ControlNet models Introduces the concept | Image-to-Image Transformation | $0.012376808212101168 | 10.90488s | [llms.txt](https://www.segmind.com/models/sdxl-controlnet/llms.txt) | | SDXL Img2Img | sdxl-img2img | SDXL Img2Img is used for text-guided image-to-image translation. 
This model uses the weights from Stable Diffusion to ge | Image-to-Image Transformation | $0.019466743047955162 | 8.39108s | [llms.txt](https://www.segmind.com/models/sdxl-img2img/llms.txt) | | SDXL-Openpose | sdxl-openpose | This model leverages SDXL to generate the images with ControlNet conditioned on Human Pose Estimation. | Image-to-Image Transformation | $0.008039886633784986 | 8.32961s | [llms.txt](https://www.segmind.com/models/sdxl-openpose/llms.txt) | | SeedEdit 3.0 i2i | seededit-v3 | SeedEdit 3.0 enables seamless, high-quality image edits through advanced AI-driven techniques. | Image-to-Image Transformation | $0.049999999999999975 | 10.85232s | [llms.txt](https://www.segmind.com/models/seededit-v3/llms.txt) | | Seedream 4.0 (4k) | seedream-4 | Seedream 4.0 generates high-resolution, professional-grade visuals with superior text rendering for impactful design. | Image-to-Image Transformation | $0.03506033812070034 | 20.65216s | [llms.txt](https://www.segmind.com/models/seedream-4/llms.txt) | | Seedream 4.5 | seedream-4.5 | Photorealistic image generation with precise text understanding. | Image-to-Image Transformation | $0.040010253956318124 | 31.74427s | [llms.txt](https://www.segmind.com/models/seedream-4.5/llms.txt) | | Seedream 5.0 Lite: Image-to-Image | seedream-v5-lite-image-to-image | Transform images intelligently with detailed text prompts. | Image-to-Image Transformation | $0.03499999999999999 | 47.05796s | [llms.txt](https://www.segmind.com/models/seedream-v5-lite-image-to-image/llms.txt) | | Segment Anything Model | sam-img2img | The Segment Anything Model (SAM) produces high quality object masks from input prompts such as points or boxes, and it c | Image-to-Image Transformation | $0.007594711676599745 | 3.80742s | [llms.txt](https://www.segmind.com/models/sam-img2img/llms.txt) | | Segmind Beyond: Outpaint with Ease | seg-beyond | Effortlessly expand your visuals with AI Image Extend. Intelligently add pixels to any side of your image. 
| Image-to-Image Transformation | $0.036383991866913115 | 25.37701s | [llms.txt](https://www.segmind.com/models/seg-beyond/llms.txt) | | Segmind FaceSwap Comic v1 | faceswap-comic | FaceSwap Comic v1 is an AI-powered face swapping model designed to blend real faces into illustrated or cartoon-style im | Image-to-Image Transformation | $0.07460677699789497 | 21.1779s | [llms.txt](https://www.segmind.com/models/faceswap-comic/llms.txt) | | Segmind Faceswap v4 | faceswap-v4 | Segmind FaceSwap v4 enables fast and precise face or head swapping between images with customizable options for style, o | Image-to-Image Transformation | $0.1235410645315008 | 32.87619s | [llms.txt](https://www.segmind.com/models/faceswap-v4/llms.txt) | | Segmind Faceswap v5 | faceswap-v5 | Ultra-fast face and head swapping in images. | Image-to-Image Transformation | $0.049976330591144646 | 9.65348s | [llms.txt](https://www.segmind.com/models/faceswap-v5/llms.txt) | | Segmind Relighting | segmind-relighting | Prompts to auto-magically relight your images. | Image-to-Image Transformation | $0.059251471825063066 | 10.33768s | [llms.txt](https://www.segmind.com/models/segmind-relighting/llms.txt) | | Segmind Relighting V2 | segmind-relighting-v2 | Transform images with customizable, photorealistic lighting for unparalleled visual creativity and authenticity. | Image-to-Image Transformation | $0.25820446236559136 | 70.57164s | [llms.txt](https://www.segmind.com/models/segmind-relighting-v2/llms.txt) | | Segmind SceneCraft v0.1 | segmind-scenecraft-v01 | SceneCraft transforms plain or existing product images into visually rich, photorealistic scenes. Whether starting from | Image-to-Image Transformation | $0.33926112521739127 | 33.64033s | [llms.txt](https://www.segmind.com/models/segmind-scenecraft-v01/llms.txt) | | Segmind SegFit v1.1 | segfit-v1.1 | Segmind's Fashion and Immersive Try-on model. SegFIT offers effortless AI virtual try-on from just a product image. 
No m | Image-to-Image Transformation | $0.4521986935392882 | 68.96038s | [llms.txt](https://www.segmind.com/models/segfit-v1.1/llms.txt) | | Segmind SegFit v1.2 | segfit-v1.2 | SegFit v1.2 creates hyper-realistic virtual try-on images, transforming fashion retail engagement and conversion rates. | Image-to-Image Transformation | $0.09198869015011549 | 51.81403s | [llms.txt](https://www.segmind.com/models/segfit-v1.2/llms.txt) | | Segmind SegFit v1.3 | segfit-v1.3 | SegFit v1.3 enables hyper-realistic virtual try-ons, enhancing online fashion retail experiences without physical photos | Image-to-Image Transformation | $0.21953233037952966 | 37.14528s | [llms.txt](https://www.segmind.com/models/segfit-v1.3/llms.txt) | | Segmind SegSwap v0.1 | seg-swap | Swap Objects Instantly. The Segmind SegSwap v0.1 model enables dynamic and precise image editing by allowing users to re | Image-to-Image Transformation | $0.2872055589597435 | 26.39516s | [llms.txt](https://www.segmind.com/models/seg-swap/llms.txt) | | Skin Contrast Upscaler | skin-contrast-upscaler | Enhances skin detail in images while preserving background quality for professional photography and art. 
| Image-to-Image Transformation | $0.013142800174367916 | 3.67532s | [llms.txt](https://www.segmind.com/models/skin-contrast-upscaler/llms.txt) | | SSD Img2Img | ssd-img2img | This model uses SSD-1B to generate images by passing a text prompt and an initial image to condition the generation | Image-to-Image Transformation | $0.0033121782447356482 | 3.99695s | [llms.txt](https://www.segmind.com/models/ssd-img2img/llms.txt) | | SSD-Canny | ssd-canny | This model leverages SSD-1B to generate the images with ControlNet conditioned on Canny Images | Image-to-Image Transformation | $0.006194010348729142 | 5.95972s | [llms.txt](https://www.segmind.com/models/ssd-canny/llms.txt) | | SSD-Depth | ssd-depth | This model leverages SSD-1B to generate the images with ControlNet conditioned on Depth Estimation | Image-to-Image Transformation | $0.009080595711003319 | 10.7305s | [llms.txt](https://www.segmind.com/models/ssd-depth/llms.txt) | | Stable Diffusion img2img | sd1.5-img2img | This model uses diffusion-denoising mechanism as first proposed by SDEdit, Stable Diffusion is used for text-guided imag | Image-to-Image Transformation | $0.0037053834433711354 | 7.69591s | [llms.txt](https://www.segmind.com/models/sd1.5-img2img/llms.txt) | | Story Diffusion | storydiffusion | Story Diffusion turns your written narratives into stunning image sequences. | Image-to-Image Transformation | $0.16542968855203616 | 118.53637s | [llms.txt](https://www.segmind.com/models/storydiffusion/llms.txt) | | Supir Photo-Realistic Image Restoration | supir | SUPIR restores and enhances images to stunning, photo-realistic quality with advanced AI techniques. | Image-to-Image Transformation | $5 | - | [llms.txt](https://www.segmind.com/models/supir/llms.txt) | | Text Overlay | text-overlay | Elevate your visuals with the Text Overlay Model. 
Easily add customized text to any image, perfect for social media, marketin | Image-to-Image Transformation | $0.0011389110764587526 | 2.11004s | [llms.txt](https://www.segmind.com/models/text-overlay/llms.txt) | | Topaz Labs Image Upscale | topaz-image-upscale | Topaz Labs image upscale is an industry-leading AI photo upscaler designed to increase the resolution of photos while pr | Image-to-Image Transformation | $0.3745939754385965 | 23.43411s | [llms.txt](https://www.segmind.com/models/topaz-image-upscale/llms.txt) | | Transparent Background Maker | transparent-background-maker | Transform your images with Transparent Background Maker. Quickly remove backgrounds using AI technology, supporting PNG | Image-to-Image Transformation | $0.002850137474774654 | 1.36711s | [llms.txt](https://www.segmind.com/models/transparent-background-maker/llms.txt) | | Word2img | w2imgsd1.5-img2img | Create beautifully designed words using Segmind’s word to image for your marketing purposes | Image-to-Image Transformation | $0.0047461707694165 | 10.00899s | [llms.txt](https://www.segmind.com/models/w2imgsd1.5-img2img/llms.txt) | ## Text-to-Audio Generation | Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs | | --- | --- | --- | --- | --- | --- | --- | | 3B Orpheus TTS (0.1) | orpheus-3b-0.1 | Orpheus TTS is an open-source text-to-speech (TTS) system powered by the Llama 3B language model, designed for high-qual | Text-to-Audio Generation | $0.12419483106004443 | 117.62101s | [llms.txt](https://www.segmind.com/models/orpheus-3b-0.1/llms.txt) | | Ace Step Music | ace-step-music | ACE-Step generates high-quality music rapidly, enhancing the creative process for developers and artists worldwide. 
| Text-to-Audio Generation | $0.035132896583850944 | 11.79223s | [llms.txt](https://www.segmind.com/models/ace-step-music/llms.txt) | | Chatterbox TTS | chatterbox-tts | Chatterbox transforms text into rich, natural speech with adjustable emotional expressiveness for diverse applications. | Text-to-Audio Generation | $0.0199414375 | 18.03554s | [llms.txt](https://www.segmind.com/models/chatterbox-tts/llms.txt) | | Chatterbox Turbo TTS | chatterbox-turbo-tts | Ultra-fast, human-quality TTS with emotional expression. | Text-to-Audio Generation | $0.0208593132664437 | 13.39185s | [llms.txt](https://www.segmind.com/models/chatterbox-turbo-tts/llms.txt) | | Dia (Text to Speech) | dia | Dia by Nari Labs is an advanced open-weights TTS model that brings scripts to life with natural speech, emotions, and no | Text-to-Audio Generation | $0.06975758289779323 | 89.54892s | [llms.txt](https://www.segmind.com/models/dia/llms.txt) | | Elevenlabs Dialogue | elevenlabs-dialogue | Immersive, emotionally expressive multi-speaker audio dialogue. | Text-to-Audio Generation | $0.018704464285714283 | 6.76155s | [llms.txt](https://www.segmind.com/models/elevenlabs-dialogue/llms.txt) | | ElevenLabs Dubbing | dubbing | Instantly dubs audio and video into 29 languages while preserving each speaker's original voice. | Text-to-Audio Generation | $0.24967988200000005 | 92.70439s | [llms.txt](https://www.segmind.com/models/dubbing/llms.txt) | | Elevenlabs Sound Generation | sound-generation | Eleven Labs' Sound Generation API provides a robust development tool for programmatically generating audio content using | Text-to-Audio Generation | $0.026501807981803144 | 7.82464s | [llms.txt](https://www.segmind.com/models/sound-generation/llms.txt) | | Elevenlabs Text To Speech | tts-eleven-labs | ElevenLabs TTS transforms text into captivating, human-like speech for diverse applications. 
| Text-to-Audio Generation | $0.09539751199014937 | 12.2895s | [llms.txt](https://www.segmind.com/models/tts-eleven-labs/llms.txt) | | Gemini TTS 2.5 Flash | gemini-2.5-flash-tts | Fast, lifelike text-to-speech with expressive emotional tones. | Text-to-Audio Generation | $0.004979537479131886 | 17.60247s | [llms.txt](https://www.segmind.com/models/gemini-2.5-flash-tts/llms.txt) | | Gemini TTS 2.5 Pro | gemini-2.5-pro-tts | Human-like speech synthesis with rich expressive emotional depth. | Text-to-Audio Generation | $0.020483156066945608 | 32.569s | [llms.txt](https://www.segmind.com/models/gemini-2.5-pro-tts/llms.txt) | | Lyria 2 | lyria-2 | Lyria 2 by Google DeepMind is an advanced model that generates high-fidelity 48kHz stereo instrumental music from text p | Text-to-Audio Generation | $0.08999999999999997 | 27.23475s | [llms.txt](https://www.segmind.com/models/lyria-2/llms.txt) | | Meta MusicGen Medium | meta-musicgen-medium | MusicGen: Transform text into music with AI. Create unique, high-quality audio from simple descriptions. Experience the | Text-to-Audio Generation | $0.04053212388675097 | 22.29395s | [llms.txt](https://www.segmind.com/models/meta-musicgen-medium/llms.txt) | | Minimax Music-01 | minimax-music-01 | Generate up to 60 seconds of music with both accompaniment and vocals in a single pass, with vocals from lyrics and a re | Text-to-Audio Generation | $0.07049529162790698 | 44.29378s | [llms.txt](https://www.segmind.com/models/minimax-music-01/llms.txt) | | MyShell Text To Speech | myshell-tts | MyShell's Voice Cloning and Text to Speech - Transform your audio content with realistic, personalized voices. 
Experienc | Text-to-Audio Generation | $0.006335910745629597 | 7.0019s | [llms.txt](https://www.segmind.com/models/myshell-tts/llms.txt) | | Openvoice | openvoice | OpenVoice is a versatile voice cloning model that supports multiple languages and offers precise tone replication, flexi | Text-to-Audio Generation | $0.008993625410928472 | 10.20667s | [llms.txt](https://www.segmind.com/models/openvoice/llms.txt) | | Sam Audio Large | sam-audio-large | Isolate any described sound from mixed audio tracks. | Text-to-Audio Generation | $0.062476201587301584 | 12.91627s | [llms.txt](https://www.segmind.com/models/sam-audio-large/llms.txt) | | Veena TTS | veena-tts | Veena transforms text into high-fidelity, expressive speech in Hindi and English for real-time applications. | Text-to-Audio Generation | $0.055781026515151516 | 45.2031s | [llms.txt](https://www.segmind.com/models/veena-tts/llms.txt) | | VeenaMax TTS | veena-max-tts | VeenaMAX transforms text into expressive, real-time speech across multiple Indian languages for seamless communication. | Text-to-Audio Generation | $0.017146847682119208 | 12.95526s | [llms.txt](https://www.segmind.com/models/veena-max-tts/llms.txt) | ## voice | Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs | | --- | --- | --- | --- | --- | --- | --- | | Kling Create Voice | kling-create-voice | Clone any voice from a single audio sample. | voice | $0.007 | 26.02065s | [llms.txt](https://www.segmind.com/models/kling-create-voice/llms.txt) | ## imageTo3d | Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs | | --- | --- | --- | --- | --- | --- | --- | | Hunyuan-3d 2mv | hunyuan3d-2mv | Hunyuan3D-2mv is finetuned from Hunyuan3D-2 to support multiview controlled shape generation. 
| imageTo3d | $0.3088497100628931 | 100.38197s | [llms.txt](https://www.segmind.com/models/hunyuan3d-2mv/llms.txt) | | Hunyuan3D-2 | hunyuan-3d-2 | Hunyuan3D 2.0 enables the creation of high-quality 3D models with intricate details. Produce assets that are visually ap | imageTo3d | $0.3450042239694657 | 36.90529s | [llms.txt](https://www.segmind.com/models/hunyuan-3d-2/llms.txt) | | Hunyuan3d-2.1 | hunyuan3d-2.1 | Transform 2D images into photorealistic, high-fidelity 3D assets effortlessly. | imageTo3d | $0.15685504241842613 | 149.83768s | [llms.txt](https://www.segmind.com/models/hunyuan3d-2.1/llms.txt) | | Sam 3D Body | sam-3d-body | Reconstruct 3D human body meshes from a single photo. | imageTo3d | $0.019317541176470585 | 10.12074s | [llms.txt](https://www.segmind.com/models/sam-3d-body/llms.txt) | | Sam 3D Object | sam-3d-objects | Single 2D image into detailed 3D object models. | imageTo3d | $0.06384196985583222 | 33.30635s | [llms.txt](https://www.segmind.com/models/sam-3d-objects/llms.txt) | ## Image-to-Text (Vision) | Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs | | --- | --- | --- | --- | --- | --- | --- | | Bria Fibo | bria-fibo-generate | Photorealistic images from structured prompts with brand control. | Image-to-Text (Vision) | $0.039999999999999994 | 21.60291s | [llms.txt](https://www.segmind.com/models/bria-fibo-generate/llms.txt) | | Bria Fibo Structured Prompt | bria-fibo-generate-structured-prompt | Convert complex inputs into structured JSON prompts for generation. | Image-to-Text (Vision) | $0.01 | 12.50573s | [llms.txt](https://www.segmind.com/models/bria-fibo-generate-structured-prompt/llms.txt) | | Bria Mask Generator | bria-mask-generator | Bria AI Get Masks automatically generates accurate object masks for advanced image editing and enhancement. 
| Image-to-Text (Vision) | $0.001217757009345794 | 6.85549s | [llms.txt](https://www.segmind.com/models/bria-mask-generator/llms.txt) |
| Bria Prompt Enhancer | bria-prompt-enhancer | Bria AI generates high-quality, commercially safe images tailored to diverse creative needs. | Image-to-Text (Vision) | $0.018395061728395064 | 3.63199s | [llms.txt](https://www.segmind.com/models/bria-prompt-enhancer/llms.txt) |
| Google Translate | google-translate | Translate effortlessly with the powerful Google Translation AI model. | Image-to-Text (Vision) | $0.005777320675105487 | 0.65903s | [llms.txt](https://www.segmind.com/models/google-translate/llms.txt) |
| Ideogram Describe | ideogram-describe | Ideogram describe can effortlessly generate detailed prompts from images. Perfect for refining creations or replicating | Image-to-Text (Vision) | $0.015000000000000003 | 3.93242s | [llms.txt](https://www.segmind.com/models/ideogram-describe/llms.txt) |
| Image Converter | image-converter | Convert images between formats instantly. | Image-to-Text (Vision) | $0.068 | 5.33057s | [llms.txt](https://www.segmind.com/models/image-converter/llms.txt) |
| Image resizer | image-resizer | Resize images to any dimension quickly and precisely. | Image-to-Text (Vision) | $0.03333333333333333 | 3.76637s | [llms.txt](https://www.segmind.com/models/image-resizer/llms.txt) |
| Json Extractor | json-extractor | Json Extractor | Image-to-Text (Vision) | $0.0001 | 0.00203s | [llms.txt](https://www.segmind.com/models/json-extractor/llms.txt) |
| LLAVA 1.6 7B | llava-v1.6 | LLaVa translates images into text descriptions & captions. | Image-to-Text (Vision) | $0.005302737698586939 | 3.58934s | [llms.txt](https://www.segmind.com/models/llava-v1.6/llms.txt) |
| Sam V2.1 Hiera Large | sam-v21-hiera-large | Meta's next-gen segmentation model for images and video. | Image-to-Text (Vision) | $0.036773737623762376 | 25.01516s | [llms.txt](https://www.segmind.com/models/sam-v21-hiera-large/llms.txt) |
| Video Speed Change | video-speed-change | Speed up or slow down any video precisely. | Image-to-Text (Vision) | $0.042207800000000004 | 30.29047s | [llms.txt](https://www.segmind.com/models/video-speed-change/llms.txt) |

## Audio-to-Text (Transcription)

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Elevenlabs Dialogue With Timing | elevenlabs-dialogue-with-timestamps | Multi-speaker dialogue with expressive timestamps included. | Audio-to-Text (Transcription) | $0.01445625 | 2.49052s | [llms.txt](https://www.segmind.com/models/elevenlabs-dialogue-with-timestamps/llms.txt) |
| Elevenlabs Forced Alignment | elevenlabs-forced-alignment | Precise audio-text synchronization with word-level timestamps. | Audio-to-Text (Transcription) | $0.09999999999999999 | 0.70002s | [llms.txt](https://www.segmind.com/models/elevenlabs-forced-alignment/llms.txt) |
| Elevenlabs Transcript | eleven-labs-transcript | Transcribe audio to accurate text in 99 languages with speaker diarization and word-level timestamps. | Audio-to-Text (Transcription) | $0.0034717734729493893 | 7.72543s | [llms.txt](https://www.segmind.com/models/eleven-labs-transcript/llms.txt) |
| Elevenlabs Voice Cloning | elevenlabs-voice-clone | Hyper-realistic voice cloning from short audio samples. | Audio-to-Text (Transcription) | $0.010000000000000002 | 4.67932s | [llms.txt](https://www.segmind.com/models/elevenlabs-voice-clone/llms.txt) |
| Elevenlabs Voice Design | elevenlabs-voice-design | Generate unique synthetic voices without audio samples. | Audio-to-Text (Transcription) | $0.01 | 22.92416s | [llms.txt](https://www.segmind.com/models/elevenlabs-voice-design/llms.txt) |
| TTS Elevenlabs With Timing | tts-elevenlabs-with-timestamps | Emotionally expressive TTS with word-level timestamp output. | Audio-to-Text (Transcription) | $0.05918507462686567 | 5.37361s | [llms.txt](https://www.segmind.com/models/tts-elevenlabs-with-timestamps/llms.txt) |

## audioToAudio

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Elevenlabs Audio Isolation | elevenlabs-audio-isolation | Extract clear speech from noisy audio and video. | audioToAudio | $0.13456178571428573 | 5.28191s | [llms.txt](https://www.segmind.com/models/elevenlabs-audio-isolation/llms.txt) |
| Elevenlabs Speech To Speech | sts-eleven-labs | Eleven Labs Speech-to-Speech offers AI-powered voice conversion for content creators, media professionals, and anyone se | audioToAudio | $0.018750861111111114 | 6.45038s | [llms.txt](https://www.segmind.com/models/sts-eleven-labs/llms.txt) |

## videoToImage

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Frame extractor | frame-extractor | Extract individual frames from any video as images. | videoToImage | $0.0054555857142857146 | 26.50857s | [llms.txt](https://www.segmind.com/models/frame-extractor/llms.txt) |
| Start & End Frame Extractor | start-end-frame-extractor | Extract first and last frames from any video. | videoToImage | $0.004668401041666667 | 4.92903s | [llms.txt](https://www.segmind.com/models/start-end-frame-extractor/llms.txt) |

## textToEmbed

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Text Embedding 3 Large | text-embedding-3-large | Text-embedding-3-large is a robust language model by OpenAI designed for generating high-dimensional text embeddings for | textToEmbed | $0.00001782294162415086 | 1.47092s | [llms.txt](https://www.segmind.com/models/text-embedding-3-large/llms.txt) |
| Text Embedding 3 Small | text-embedding-3-small | Text-embedding-3-small is a compact and efficient model developed for generating high-quality text embeddings. These emb | textToEmbed | $0.000023989964637293316 | 1.25323s | [llms.txt](https://www.segmind.com/models/text-embedding-3-small/llms.txt) |

## imageTOImage

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Stable Diffusion 3 Medium Image to Image | sd3-med-img2img | Stable Diffusion 3 Medium image-to-image is a cutting-edge AI tool that uses advanced image-to-image technology to trans | imageTOImage | $0.0075827782666080075 | 7.43717s | [llms.txt](https://www.segmind.com/models/sd3-med-img2img/llms.txt) |

## Image Inpainting

| Model | Slug | Description | Modality | Avg Cost | Avg Latency | API Docs |
| --- | --- | --- | --- | --- | --- | --- |
| Fooocus Inpainting | focus-inpaint | Fooocus Inpainting is a powerful image generation model that allows you to selectively edit and enhance images. | Image Inpainting | $0.024867879544198775 | 17.92299s | [llms.txt](https://www.segmind.com/models/focus-inpaint/llms.txt) |
| SDXL Inpaint | sdxl-inpaint | This model is capable of generating photo-realistic images given any text input, with the extra capability of inpainting | Image Inpainting | $0.006894747349528049 | 8.54611s | [llms.txt](https://www.segmind.com/models/sdxl-inpaint/llms.txt) |
| Stable Diffusion Inpainting | sd1.5-inpainting | Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given | Image Inpainting | $0.001819761720020576 | 2.87848s | [llms.txt](https://www.segmind.com/models/sd1.5-inpainting/llms.txt) |
| Try-On Diffusion | try-on-diffusion | Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Image Inpainting | $0.011084928515494282 | 7.6227s | [llms.txt](https://www.segmind.com/models/try-on-diffusion/llms.txt) |
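
Each Slug value above maps to two URLs using the patterns stated at the top of this directory: the inference endpoint `https://api.segmind.com/v1/{model_slug}` and the per-model reference `https://www.segmind.com/models/{slug}/llms.txt`. The following is a minimal sketch of that mapping. The helper names `endpoint_url` and `docs_url` are hypothetical and not part of any Segmind SDK, and no request is sent because request parameters differ per model and are described in each model's llms.txt.

```python
from urllib.parse import quote

# URL patterns as stated in this directory's header.
API_BASE = "https://api.segmind.com/v1"
DOCS_BASE = "https://www.segmind.com/models"


def endpoint_url(slug: str) -> str:
    """Return the inference endpoint for a slug (hypothetical helper)."""
    return f"{API_BASE}/{quote(slug, safe='')}"


def docs_url(slug: str) -> str:
    """Return the llms.txt reference URL for a slug (hypothetical helper)."""
    return f"{DOCS_BASE}/{quote(slug, safe='')}/llms.txt"


if __name__ == "__main__":
    # Slugs copied from the Slug column of the tables above.
    for slug in ("eleven-labs-transcript", "sdxl-inpaint", "sd1.5-inpainting"):
        print(endpoint_url(slug))
        print(docs_url(slug))
```

Fetching the `docs_url` result first is the safer order, since it lists the parameters the endpoint expects for that model.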