Stable Diffusion 3 Medium Text to Image
Stable Diffusion is a latent diffusion model that generates images from text. It was created by researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2 is a specific version of this architecture: it pairs a downsampling-factor-8 autoencoder with an 865M-parameter UNet and an OpenCLIP ViT-H/14 text encoder for the diffusion model. The SD 2-v variant produces 768x768 px images and conditions generation on the penultimate text embeddings from the OpenCLIP ViT-H/14 text encoder.
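As a minimal sketch of the text-to-image flow described above, the snippet below runs a Stable Diffusion 2.x checkpoint with the Hugging Face diffusers library; the stabilityai/stable-diffusion-2-1 checkpoint name, the CUDA device, and the sampling settings are illustrative assumptions, not details from this page.

```python
# Minimal text-to-image sketch using Hugging Face diffusers (assumed dependency).
import torch
from diffusers import StableDiffusionPipeline

# Load a 768x768 v-prediction Stable Diffusion 2.x checkpoint (assumed name).
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # moves the UNet, autoencoder, and text encoder to the GPU

# The prompt is encoded by the OpenCLIP ViT-H/14 text encoder and used to condition
# the latent diffusion process; the autoencoder then decodes the latents to pixels.
prompt = "a photograph of an astronaut riding a horse"
image = pipe(prompt, height=768, width=768, num_inference_steps=30).images[0]
image.save("astronaut.png")
```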
Simple, Transparent Pricing
Pay only for what you use. No hidden fees, no commitments.
Serverless
Pay-as-you-go pricing with credits that work across all Segmind models
Dedicated Cloud
Enterprise-grade dedicated endpoints with guaranteed capacity, billed per GPU second