Stable Diffusion 3 Medium Text to Image

Stable Diffusion is a family of latent diffusion models that generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. Stable Diffusion v2, an earlier version of the architecture, pairs an autoencoder with a downsampling factor of 8 with an 865M-parameter UNet and an OpenCLIP ViT-H/14 text encoder. Generation is conditioned on the penultimate-layer text embeddings from that encoder, and the SD 2-v variant produces 768x768 px images.
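To make the downsampling factor concrete, the sketch below computes the size of the latent tensor the diffusion model works on for a given image size. The factor of 8 comes from the description above; the 4 latent channels are an assumption based on the Stable Diffusion v1/v2 autoencoder, and the names `LatentShape` and `latent_shape` are hypothetical helpers for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatentShape:
    channels: int
    height: int
    width: int


def latent_shape(image_height, image_width, downsampling_factor=8, latent_channels=4):
    """Return the latent tensor shape for an image of the given pixel size.

    downsampling_factor=8 is taken from the text; latent_channels=4 is an
    assumption (the value used by the Stable Diffusion v1/v2 autoencoder).
    """
    if image_height % downsampling_factor or image_width % downsampling_factor:
        raise ValueError("image dimensions must be divisible by the downsampling factor")
    return LatentShape(
        channels=latent_channels,
        height=image_height // downsampling_factor,
        width=image_width // downsampling_factor,
    )


shape = latent_shape(768, 768)
print(shape)  # LatentShape(channels=4, height=96, width=96)
```

Under these assumptions, a 768x768 RGB image (1,769,472 values) is represented by 36,864 latent values, a 48x reduction. Running the denoising steps on this smaller tensor rather than on pixels is what lets a latent diffusion model generate large images at a manageable compute cost.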

Stable Diffusion 3 Medium Text-to-Image

Stable Diffusion 3 Medium Text-to-Image (SD3 Medium) is the latest and most advanced addition to the Stable Diffusion family of text-to-image models. It is designed to be resource-efficient, which makes it a better choice for users with limited computational resources. Because of its smaller size, SD3 Medium can run on consumer-grade hardware such as PCs and laptops, as well as on enterprise-tier GPUs.
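As an illustration of running the model on modest hardware, here is a minimal sketch that assumes the Hugging Face `diffusers` library and its `StableDiffusion3Pipeline`, which this page does not mention. It is not the API offered on this page. The function name `generate_image`, the default step count and guidance scale, and the `low_vram` switch are illustrative choices. The model repository used here may require accepting a license on Hugging Face before the weights can be downloaded, and CPU offloading needs the `accelerate` package.

```python
def generate_image(prompt, steps=28, guidance_scale=7.0, low_vram=True):
    """Generate one image with SD3 Medium through diffusers (a sketch, not this page's API)."""
    # Imported lazily so this module loads even where torch/diffusers are absent.
    import torch
    from diffusers import StableDiffusion3Pipeline

    # Half-precision weights roughly halve memory use compared with float32.
    pipe = StableDiffusion3Pipeline.from_pretrained(
        "stabilityai/stable-diffusion-3-medium-diffusers",
        torch_dtype=torch.float16,
    )

    if low_vram:
        # Keep submodels on the CPU and move each to the GPU only while it runs.
        pipe.enable_model_cpu_offload()
    else:
        pipe = pipe.to("cuda")

    result = pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]  # a PIL image
```

Calling `generate_image('A storefront sign that reads "Open Late"')` returns a PIL image that can be written to disk with its `save` method. With `low_vram=True` the sketch trades some speed for a smaller GPU memory footprint, which fits the consumer-hardware use case described above.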

Stable Diffusion 3 Medium Text-to-Image Capabilities

SD3 Medium generates highly realistic, photorealistic images. It handles intricate prompts with multiple subjects, even when the prompt contains a typo or two. It can also render typography within images with high precision, so text requested in the prompt appears legibly in the output.
