Stable Diffusion 3 Medium Text to Image

Stable Diffusion is a latent diffusion model that generates images from text, created by researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture: it pairs a downsampling-factor-8 autoencoder with an 865M-parameter UNet and conditions generation on the penultimate text embeddings of an OpenCLIP ViT-H/14 text encoder. The SD 2-v variant of this model produces 768x768 px images.

Inference time: ~7.19 s
Cost per generation: ~$0.039

Simple, Transparent Pricing

Pay only for what you use. No hidden fees, no commitments.

Serverless

Pay-as-you-go pricing with credits that work across all Segmind models

$0.006 per GPU second
No upfront costs - Only pay for what you use
Auto-scaling - Handles traffic spikes automatically
Universal credits - Use anywhere on Segmind
Instant deployment - Start using immediately

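At the serverless rate above, the charge for a run scales linearly with billed GPU time. A minimal sketch of the arithmetic (billed GPU seconds are assumed to approximate, but not necessarily equal, wall-clock inference time):

```python
SERVERLESS_RATE = 0.006  # $ per GPU second, from the serverless tier above

def serverless_cost(gpu_seconds: float, rate: float = SERVERLESS_RATE) -> float:
    """Estimated charge for a single run billed by the GPU second."""
    return round(gpu_seconds * rate, 6)
```

For example, a run billed at 6.5 GPU seconds would cost `serverless_cost(6.5)` = $0.039, in line with the per-generation estimate above.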

Dedicated Cloud

Enterprise-grade dedicated endpoints with guaranteed capacity

$0.0007-$0.0031 per GPU second

Dedicated GPU - Reserved capacity for your workloads
Lower latency - Faster response times
Custom endpoints - Personalized API endpoints
Enterprise support - Priority assistance