const axios = require('axios');
const fs = require('fs');
const path = require('path');

// Read a local image file and return its contents as a Base64 string.
async function toB64(imgPath) {
    const data = fs.readFileSync(path.resolve(imgPath));
    return Buffer.from(data).toString('base64');
}

const api_key = "YOUR API-KEY";
const url = ""; // set this to the API endpoint URL

(async function() {
    try {
        const data = {
            "image": await toB64(''), // pass the path of the input image
            "prompt": "top-view, A mouth-watering pizza topped with gooey cheese and fresh ingredients,Food Photography",
            "negative_prompt": "square, poorly drawn, low quality",
            "scheduler": "ddim",
            "num_inference_steps": 25,
            "guidance_scale": 7.5,
            "control_scale": 1.9,
            "control_start": 0.19,
            "control_end": 1,
            "samples": 1,
            "seed": 6130473235,
            "size": 768,
            "base64": false
        };

        const response = await axios.post(url, data, { headers: { 'x-api-key': api_key } });
        console.log(response.data);
    } catch (error) {
        // error.response is only set when the server replied with an error status
        console.error('Error:', error.response ? error.response.data : error.message);
    }
})();
HTTP Response Codes
200 - OK: Image generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
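
The response codes above can be turned into readable messages on the client. The following is a minimal sketch, not part of the official client: the names `STATUS_MEANINGS`, `describeStatus`, and `describeAxiosError` are hypothetical, and the only Axios behaviour it relies on is that a failed request rejects with an error whose `response.status` holds the HTTP status code.

```javascript
// Meanings of the documented response codes, copied from the table above.
const STATUS_MEANINGS = {
  200: 'Image generated',
  401: 'User authentication failed',
  404: 'The requested URL does not exist',
  405: 'The requested HTTP method is not allowed',
  406: 'Not enough credits',
  500: 'Server had some issue with processing',
};

// Hypothetical helper: map a status code to its documented meaning.
function describeStatus(status) {
  return STATUS_MEANINGS[status] ?? `Unexpected status ${status}`;
}

// Hypothetical helper: describe an error thrown by an Axios request.
// error.response is absent when no reply was received (e.g. a network failure).
function describeAxiosError(error) {
  if (error.response) {
    return describeStatus(error.response.status);
  }
  return `No response received: ${error.message}`;
}
```

Inside a `catch (error)` block, `console.error(describeAxiosError(error))` would then print "Not enough credits" for a 406 reply.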


image: image (required)

Input Image

prompt: str (required)

Prompt to render

negative_prompt: str (default: None)

Prompts to exclude, e.g. 'bad anatomy, bad hands, missing fingers'

scheduler: enum:str (default: ddim)

Type of scheduler.

Allowed values:

num_inference_steps: int (default: 25), affects pricing

Number of denoising steps.

min: 25, max: 100

guidance_scale: float (default: 7.5)

Scale for classifier-free guidance.

min: 0.1, max: 25

control_scale: float (default: 1.8)

ControlNet conditioning scale.

min: 0, max: 5

control_start: float (default: 0.19)

Point in the denoising process at which ControlNet guidance starts.

min: 0.01, max: 1

control_end: float (default: 1)

Point in the denoising process at which ControlNet guidance ends.

min: 0.01, max: 1

samples: int (default: 1), affects pricing

Number of samples to generate.

min: 1, max: 4

seed: int (default: -1)

Seed for image generation.

size: enum:int (default: 1), affects pricing

Image resolution.

Allowed values:

base64: boolean (default: 1)

Whether the output image is returned Base64-encoded.
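
The documented ranges can be checked before a request is sent, which avoids spending a round trip on a payload the server would reject. This is a minimal sketch under stated assumptions: `NUMERIC_RANGES` and `validatePayload` are hypothetical names, the limits are copied from the parameter list above, and the allowed values for `scheduler` and `size` are not checked because they are not listed here. How the server itself validates requests is not documented in this section.

```javascript
// Numeric limits copied from the parameter list above.
const NUMERIC_RANGES = {
  num_inference_steps: { min: 25, max: 100, integer: true },
  guidance_scale: { min: 0.1, max: 25 },
  control_scale: { min: 0, max: 5 },
  control_start: { min: 0.01, max: 1 },
  control_end: { min: 0.01, max: 1 },
  samples: { min: 1, max: 4, integer: true },
};

// Hypothetical helper: return a list of problems found in a request payload.
// An empty list means every checked field is within its documented range.
function validatePayload(payload) {
  const errors = [];

  // image and prompt are the two required fields.
  for (const field of ['image', 'prompt']) {
    if (payload[field] === undefined || payload[field] === '') {
      errors.push(`${field} is required`);
    }
  }

  for (const [field, rule] of Object.entries(NUMERIC_RANGES)) {
    const value = payload[field];
    if (value === undefined) continue; // omitted, so the documented default applies
    if (typeof value !== 'number' || Number.isNaN(value)) {
      errors.push(`${field} must be a number`);
      continue;
    }
    if (rule.integer && !Number.isInteger(value)) {
      errors.push(`${field} must be an integer`);
    }
    if (value < rule.min || value > rule.max) {
      errors.push(`${field} must be between ${rule.min} and ${rule.max}`);
    }
  }

  return errors;
}
```

Calling `validatePayload(data)` before the request and aborting when the returned list is non-empty keeps malformed payloads from reaching the API.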

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
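
Reading the credit header might look like the sketch below. The helper names `remainingCredits` and `isRunningLow` and the threshold of 10 are assumptions made for illustration. The header's value format is not documented here, so the code treats it as a number and returns `null` when it cannot be parsed. Axios exposes response header names in lower case.

```javascript
// Hypothetical helper: extract the remaining-credit count from response headers.
// Returns null when the header is missing or is not a number.
function remainingCredits(headers) {
  const raw = headers['x-remaining-credits'];
  if (raw === undefined || raw === null || raw === '') {
    return null;
  }
  const credits = Number(raw);
  return Number.isNaN(credits) ? null : credits;
}

// Hypothetical helper: flag an account that is close to running out.
// The default threshold is an arbitrary value chosen for this example.
function isRunningLow(headers, threshold = 10) {
  const credits = remainingCredits(headers);
  return credits !== null && credits < threshold;
}

// Usage with a mocked header object standing in for `response.headers`.
const mockHeaders = { 'x-remaining-credits': '42.5' };
console.log(remainingCredits(mockHeaders));
console.log(isRunningLow(mockHeaders));
```

In the request example, the same helpers would be called with `response.headers` right after the `axios.post` call resolves.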

Segmind Stable Diffusion Word to Image

Meet Segmind Stable Diffusion Word to Image, an AI-powered artwork generator that turns words into visual art. Given a keyword or phrase and a brief description, the model renders the text as a distinct, visually striking art piece that reflects the idea behind it. Typical outputs include custom apparel designs, personalized home decor, book covers, and advertisement visuals, which makes the model useful across a range of creative fields.

The technical backbone of this model is Stable Diffusion 1.5 with ControlNet. It is trained on pairs of images that combine words with various forms of art, and it adds an architectural extension to the UNet module. ControlNet is a neural network structure that strengthens diffusion models by introducing extra conditions. It copies the weights of neural network blocks into a "locked" copy and a "trainable" copy: the trainable copy learns the condition, while the locked copy preserves the original model. In this way, large diffusion models such as Stable Diffusion can be extended with ControlNets to accept conditional inputs such as edge maps, segmentation maps, and keypoints.
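
The locked and trainable copies can be illustrated with a toy example that uses single numbers in place of neural network blocks. This is a conceptual sketch only, not Segmind's implementation: every name in it is hypothetical, and the zero-initialised connecting layers follow the design described in the original ControlNet paper, which this page does not mention. Because those connecting layers start at zero, the combined block initially produces exactly the locked block's output.

```javascript
// Toy scalar "layer": output = weight * input.
const makeLayer = (weight) => ({
  weight,
  apply(x) {
    return this.weight * x;
  },
});

// Locked copy: the pretrained block, frozen so it cannot be changed.
const locked = Object.freeze(makeLayer(2.0));
// Trainable copy: starts from the same weight and is free to change.
const trainable = makeLayer(locked.weight);
// Zero-initialised layers that connect the trainable copy to the locked model.
const zeroIn = makeLayer(0.0);
const zeroOut = makeLayer(0.0);

// Combined block: locked output plus the control branch's contribution.
function controlledBlock(x, condition) {
  const base = locked.apply(x);
  const control = zeroOut.apply(trainable.apply(x + zeroIn.apply(condition)));
  return base + control;
}

// Before any training, the condition has no effect on the output.
console.log(controlledBlock(3, 10) === locked.apply(3));
```

Training would update `trainable`, `zeroIn`, and `zeroOut` while `locked` stays fixed, so the condition gradually gains influence without altering the original model.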

The Segmind Stable Diffusion Word to Image model offers more flexibility than traditional design techniques. It generates visually striking, artistically satisfying outputs from text inputs, letting individuals and businesses visualize their ideas or missions in a distinctive way. Its potential applications span domains such as fashion, interior design, publishing, and advertising.

Segmind Stable Diffusion Word to Image use cases

  1. Custom Apparel: A company could offer a service where customers input a word or phrase and a brief description, then receive a unique, custom-designed piece of clothing. For instance, a customer could input "love" and "beautiful floral design," resulting in a unique print that could be used on a T-shirt, hoodie, or hat.

  2. Personalized Home Decor: This could be a great tool for creating custom art pieces for the home. A customer could input their family's last name and a description of their home's color scheme or style, and the AI would generate a piece of art to match.

  3. Greeting Cards: Customers could create their own custom greeting cards. They could input a word like "Birthday" and a brief description such as "colorful balloons," and the AI would generate a unique card design.

  4. Event Planning: For events such as weddings or birthdays, the model could be used to create personalized decorations. The names of the couple or the birthday person could be incorporated into beautiful designs fitting the event's theme.

  5. Book Cover Design: An author could input the title of their book and a brief description of the book's theme to generate a unique cover design.

  6. Restaurant Menus: A restaurant could use the model to create a unique menu. They could input the name of a dish and a description of its flavors to generate a corresponding visual.

  7. Advertising: Companies could generate unique, eye-catching visuals for their ad campaigns. They could input their product's name and a brief description of its benefits or features to create an appealing design.

  8. Website Design: Web developers could use this model to generate unique visuals for a website. They could input the name of the company and a brief description of the company's values or mission to create a corresponding visual.

  9. Education: Teachers could use this tool to create educational materials. They could input a keyword from the lesson and a brief description of the concept to create a visual aid.

  10. Tattoo Designs: A tattoo artist could use this model to generate unique designs based on a customer's input. For example, a customer might input a word that has significant meaning to them and a description of the style they want.


The Segmind Stable Diffusion Word to Image model comes with the CreativeML Open RAIL-M license. The license promotes the widespread adoption of multimodal generative models while also addressing potential ethical considerations and misuse. Drawing inspiration from open-source permissive licenses, this license allows for the open and responsible use of the model. It imposes certain use-based restrictions to prevent misuse, encouraging responsible use in the field of AI. While derivatives of the model may be released under different licensing terms, they must always include the same use-based restrictions as those in the original license. This balance between openness and responsibility aims to foster responsible open-science in the AI field. The license governs the model's use (and its derivatives), guided by the model card associated with the model.