Stable Diffusion 3 Medium Text to Image

Stable Diffusion is a family of latent diffusion models that generate images from text. It was created by a team of researchers and engineers from CompVis, Stability AI, and LAION. An earlier version, Stable Diffusion v2, pairs a downsampling-factor 8 autoencoder with an 865M-parameter UNet and an OpenCLIP ViT-H/14 text encoder, and conditions generation on the penultimate text embeddings from that encoder. The SD 2-v model produces 768x768 px images.

Examples

Check out what others have created with Stable Diffusion 3 Medium Text to Image

A whimsical and high-resolution highly realistic image of a panda in a vintage cosmonaut suit. The panda is holding a sign that reads 'I love flying to the moon!' in playful lettering. The panda's helmet has a small propeller on top and a Indian flag patch, adding to the cosmic vibe. The background features a retro-styled spaceship with rockets and stars, giving the impression of a thrilling journey through space

seed: 468685, guidance_scale: 5

API

The API can be called from any programming language that can send an HTTP request. The Python example below sends a POST request to the text-to-image endpoint.

POST
import requests
import base64


# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')


# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')


api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/stable-diffusion-3-medium-txt2img"

# Request payload
data = {
    "prompt": "A whimsical and high-resolution highly realistic image of a panda in a vintage cosmonaut suit. The panda is holding a sign that reads 'I love flying to the moon!' in playful lettering. The panda's helmet has a small propeller on top and a Indian flag patch, adding to the cosmic vibe. The background features a retro-styled spaceship with rockets and stars, giving the impression of a thrilling journey through space",
    "negative_prompt": "bad quality, poor quality, doll, disfigured, jpg, toy, bad anatomy, missing limbs, missing fingers, 3d, cgi",
    "samples": 1,
    "scheduler": "DPM++ 2M",
    "num_inference_steps": 25,
    "guidance_scale": 5,
    "denoise": 1,
    "seed": 468685,
    "img_width": 1024,
    "img_height": 1024,
    "modelsamplingsd3_shift": 3,
    "conditioningsettimesteprange_start": 0.1,
    "conditioningsettimesteprange_stop": 1,
    "base64": False
}

headers = {'x-api-key': api_key}

response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated image
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Image generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
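
The response codes listed above can be checked on the client before the body is treated as an image. The following is a minimal sketch, assuming the endpoint behaves as that list describes; the helper names describe_status and save_image_if_ok are hypothetical and are not part of any Segmind SDK.

```python
from pathlib import Path

# Messages copied from the response-code list above.
STATUS_MESSAGES = {
    200: "Image generated",
    401: "User authentication failed",
    404: "The requested URL does not exist",
    405: "The requested HTTP method is not allowed",
    406: "Not enough credits",
    500: "Server had some issue with processing",
}


def describe_status(status_code: int) -> str:
    """Return a readable message for a status code (hypothetical helper)."""
    return STATUS_MESSAGES.get(status_code, f"Unexpected status code {status_code}")


def save_image_if_ok(status_code: int, body: bytes, out_path: str) -> bool:
    """Write the image bytes to out_path on 200; otherwise report why and return False."""
    if status_code != 200:
        print(f"Request failed: {describe_status(status_code)}")
        return False
    Path(out_path).write_bytes(body)
    return True
```

With the response object from the request example, this could be called as save_image_if_ok(response.status_code, response.content, "panda.jpg").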

Attributes


prompt: str (required)

Prompt to render


negative_prompt: str (default: None)

Prompts to exclude, e.g. 'bad anatomy, bad hands, missing fingers'


samples: int (default: 1). Affects pricing.

Number of samples to generate.

min: 1, max: 4


scheduler: enum:str (default: DPM++ 2M)

Type of scheduler.

Allowed values:


num_inference_steps: int (default: 25). Affects pricing.

Number of denoising steps.

min: 10, max: 100


guidance_scale: float (default: 5)

Scale for classifier-free guidance.

min: 1, max: 25


denoise: float (default: 1)

How much to transform the reference image.

min: 0.1, max: 1


seed: int (default: -1)

Seed for image generation.

min: -1, max: 999999999999999


img_width: int (default: 1024). Affects pricing.

Image width can be between 512 and 2048 in multiples of 8


img_height: int (default: 1024). Affects pricing.

Image height can be between 512 and 2048 in multiples of 8


modelsamplingsd3_shift: int (default: 3). Affects pricing.

Model Sampling SD3 Shift

min: 1, max: 10


conditioningsettimesteprange_start: float (default: 0.1). Affects pricing.

Conditioning set timestep range start

min: 0.1, max: 1


conditioningsettimesteprange_stop: float (default: 1). Affects pricing.

Conditioning set timestep range stop

min: 0.1, max: 1


base64: boolean (default: 1)

Base64 encoding of the output image.
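
The ranges in the attribute list can be checked locally before a request is sent, which avoids spending a call on a payload that is out of range. The following is a rough sketch rather than an official client: validate_payload is a hypothetical helper, and it assumes the server enforces exactly the limits documented above.

```python
def validate_payload(payload: dict) -> list:
    """Return a list of problems found in a request payload (hypothetical helper)."""
    problems = []

    # prompt is the only attribute marked as required.
    if not payload.get("prompt"):
        problems.append("prompt is required")

    # (name, minimum, maximum) taken from the attribute list above.
    ranges = [
        ("samples", 1, 4),
        ("num_inference_steps", 10, 100),
        ("guidance_scale", 1, 25),
        ("denoise", 0.1, 1),
        ("seed", -1, 999999999999999),
        ("modelsamplingsd3_shift", 1, 10),
        ("conditioningsettimesteprange_start", 0.1, 1),
        ("conditioningsettimesteprange_stop", 0.1, 1),
    ]
    for name, low, high in ranges:
        value = payload.get(name)
        if value is not None and not (low <= value <= high):
            problems.append(f"{name} must be between {low} and {high}")

    # Width and height share one rule: 512 to 2048, in multiples of 8.
    for name in ("img_width", "img_height"):
        value = payload.get(name)
        if value is not None and (value < 512 or value > 2048 or value % 8 != 0):
            problems.append(f"{name} must be between 512 and 2048 in multiples of 8")

    return problems
```

An empty list means the payload passed every check; attributes that are omitted are skipped, on the assumption that the server applies its documented defaults.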

To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
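
Reading the credit balance from the response headers might look like the sketch below. It assumes the x-remaining-credits value is numeric, which the page does not state explicitly; remaining_credits is a hypothetical helper name.

```python
from typing import Mapping, Optional


def remaining_credits(headers: Mapping[str, str]) -> Optional[float]:
    """Read x-remaining-credits from response headers; None if absent or not numeric."""
    # Plain dicts are case-sensitive, so compare lower-cased header names.
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None
    return None
```

With the requests library, the headers of a completed call are available as response.headers, so remaining_credits(response.headers) could be logged after each request.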


Pricing

Serverless Pricing

Buy credits that can be used anywhere on Segmind

$0.0015 per second

Dedicated Cloud Pricing

For enterprise costs and dedicated endpoints

$0.0007 - $0.0031 per second
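
A per-second rate translates into a cost estimate by simple multiplication. The sketch below assumes billing is linear in the number of seconds billed for a request, which the page implies but does not spell out; estimate_cost_usd is a hypothetical helper.

```python
SERVERLESS_RATE_USD_PER_SECOND = 0.0015  # serverless rate quoted above


def estimate_cost_usd(seconds: float, rate: float = SERVERLESS_RATE_USD_PER_SECOND) -> float:
    """Estimate cost as seconds * rate (assumes simple linear per-second billing)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds * rate
```

Under that assumption, a request billed for 10 seconds would cost roughly $0.015 at the serverless rate, and between about $0.007 and $0.031 across the dedicated range.
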
FEATURES

PixelFlow allows you to use all these features

Unlock the full potential of generative AI with Segmind. Create stunning visuals and innovative designs with total creative control. Take advantage of powerful development tools to automate processes and models, elevating your creative workflow.

Segmented Creation Workflow

Gain greater control by dividing the creative process into distinct steps, refining each phase.

Customized Output

Customize at various stages, from initial generation to final adjustments, ensuring tailored creative outputs.

Layering Different Models

Integrate and utilize multiple models simultaneously, producing complex and polished creative results.

Workflow APIs

Deploy Pixelflows as APIs quickly, without server setup, ensuring scalability and efficiency.

Stable Diffusion 3 Medium Text-to-Image

Stable Diffusion 3 Medium Text-to-Image (SD3 Medium) is the latest and most advanced addition to the Stable Diffusion family of text-to-image models. It is designed to be resource-efficient, making it a better choice for users with limited computational resources. Due to its smaller size, SD3 Medium can run efficiently on consumer-grade hardware, including consumer PCs and laptops, as well as on enterprise-tier GPUs.

Stable Diffusion 3 Medium Text-to-Image Capabilities

SD3 Medium generates highly realistic, photorealistic images. It also handles intricate prompts with multiple subjects, even when the prompt contains a typo or two, and it renders typography within images with high precision.

