import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/kandinsky2.2-txt2img"

# Request payload
data = {
    "prompt": "masterpiece, best quality, portrait of an old man, 50mm, solo, natural skin texture, realistic eye and face details, dark, deep shadow, darkness, moonlight, award winning photo, extremely detailed, fine detail, highly detailed, extremely detailed eyes and face, piercing red eyes, detailed clothes, skinny, gothic, native american clothing, analog film, stock photograph,",
    "negative_prompt": "lowres, text, error, cropped, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands",
    "samples": 1,
    "num_inference_steps": 25,
    "img_width": 512,
    "img_height": 768,
    "prior_steps": 25,
    "seed": 9863172,
    "base64": False
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # The response is the generated image
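With base64 set to False, the response body is the raw image bytes. A minimal sketch of persisting them to disk (the helper name and file path are illustrative, not part of the API):

```python
def save_image_bytes(content, path):
    """Write raw image bytes (e.g. response.content) to disk."""
    with open(path, "wb") as f:
        f.write(content)
    return path
```

For example, after a successful call: save_image_bytes(response.content, "portrait.jpg"). Checking response.status_code (or calling response.raise_for_status()) first avoids saving an error body as an image.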
prompt: Prompt to render.
negative_prompt: Prompts to exclude, e.g. 'bad anatomy, bad hands, missing fingers'.
samples: Number of samples to generate (min: 1, max: 4).
num_inference_steps: Number of denoising steps (min: 20, max: 100).
img_width: Image resolution. Allowed values:
img_height: Image resolution. Allowed values:
prior_steps: Number of denoising steps (min: 1, max: 100).
seed: Seed for image generation.
base64: Base64 encoding of the output image.
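When base64 is set to True, the image arrives as a base64-encoded string rather than raw bytes. A small helper (a sketch; the function name is illustrative) to decode and save such a string:

```python
import base64

def save_base64_image(b64_string, path):
    """Decode a base64-encoded image string and write it to disk."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(b64_string))
    return path
```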
To keep track of your credit usage, you can inspect the response headers of each API call. The x-remaining-credits property will indicate the number of remaining credits in your account. Ensure you monitor this value to avoid any disruptions in your API usage.
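A small helper for reading that header, assuming the credit count is an integer (the function name is illustrative):

```python
def remaining_credits(headers):
    """Read x-remaining-credits from response headers; None if absent."""
    value = headers.get("x-remaining-credits")
    return int(value) if value is not None else None
```

Since requests exposes response.headers as a case-insensitive mapping, remaining_credits(response.headers) works directly after any API call.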
Kandinsky 2.2 is a groundbreaking advancement over its predecessor, Kandinsky 2.1. With the integration of the powerful CLIP-ViT-G image encoder and innovative ControlNet support, Kandinsky 2.2 is set to redefine the boundaries of aesthetic image creation and text comprehension.
At the heart of Kandinsky 2.2 lies the state-of-the-art CLIP-ViT-G image encoder, a transformative addition that amplifies the model's ability to craft visually stunning images while enhancing its text understanding capabilities. Complementing this is the ControlNet mechanism, a strategic inclusion designed to offer users unparalleled control over the image generation process.
Enhanced Image Aesthetics: The CLIP-ViT-G encoder ensures the generation of visually richer and more captivating images.
Superior Text Understanding: With the new encoder, the model boasts an improved comprehension of text, bridging the gap between textual prompts and visual outputs.
Precision Control: The ControlNet support empowers users to guide the image generation process, ensuring outputs that align with their vision.
Optimized Performance: The combined power of CLIP-ViT-G and ControlNet results in a significant boost in the model's overall performance.
Digital Art Creation: Artists can harness Kandinsky 2.2 to craft digital artworks that resonate with depth and detail.
Content Generation: Ideal for content creators seeking to generate visuals based on textual prompts or narratives.
Interactive Design: Designers can iteratively shape their designs, making real-time adjustments guided by text.
Educational Tools: Can be integrated into learning platforms, allowing students to explore the interplay between text and visuals.
Gaming and AR: Enhance user immersion in games or AR experiences by generating visuals based on in-game narratives or user prompts.
Kandinsky 2.2's permissive license ensures that users, be they individual creators, businesses, or developers, can utilize the model for a myriad of commercial purposes without the constraints typically associated with restrictive licenses.