Stable Diffusion 3 Medium Text to Image

Stable Diffusion is a latent diffusion model that generates images from text, created by researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of the model architecture: it pairs a downsampling-factor-8 autoencoder with an 865M-parameter UNet and conditions generation on the penultimate text embeddings of an OpenCLIP ViT-H/14 text encoder. The SD 2-v variant of this model produces 768x768 px images.

Inference time: ~7.19 s
Cost per generation: ~$0.039

Simple, Transparent Pricing

Pay only for what you use. No hidden fees, no commitments.

Serverless

Pay-as-you-go pricing with credits that work across all Segmind models

$0.006 per GPU second
No upfront costs - Only pay for what you use
Auto-scaling - Handles traffic spikes automatically
Universal credits - Use anywhere on Segmind
Instant deployment - Start using immediately

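At the serverless rate above, the charge for a run scales linearly with billed GPU time. A minimal sketch of the arithmetic (billed GPU seconds are assumed to approximate, but not necessarily equal, wall-clock inference time):

```python
SERVERLESS_RATE = 0.006  # $ per GPU second, from the serverless tier above

def serverless_cost(gpu_seconds: float, rate: float = SERVERLESS_RATE) -> float:
    """Estimated charge for a single run billed by the GPU second."""
    return round(gpu_seconds * rate, 6)
```

For example, a run billed at 6.5 GPU seconds would cost `serverless_cost(6.5)` = $0.039, in line with the per-generation estimate above.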

Dedicated Cloud

Enterprise-grade dedicated endpoints with guaranteed capacity

$0.0007-$0.0031 per GPU second

Dedicated GPU - Reserved capacity for your workloads
Lower latency - Faster response times
Custom endpoints - Personalized API endpoints
Enterprise support - Priority assistance