Insta Depth

InstantID aims to generate customized images in various poses or styles from only a single reference ID image while ensuring high fidelity.



Examples

Check out what others have created with Insta Depth

Photo of a woman wearing a Superman costume

seed: 354849415, guidance_scale: 3

API

The model is also available through an API that you can call from the programming language of your choice. The example below uses Python.

POST
    import requests
    import base64

    # Use this function to convert an image file from the filesystem to base64
    def image_file_to_base64(image_path):
        with open(image_path, 'rb') as f:
            image_data = f.read()
        return base64.b64encode(image_data).decode('utf-8')

    # Use this function to fetch an image from a URL and convert it to base64
    def image_url_to_base64(image_url):
        response = requests.get(image_url)
        image_data = response.content
        return base64.b64encode(image_data).decode('utf-8')

    api_key = "YOUR_API_KEY"
    url = "https://api.segmind.com/v1/insta-depth"

    # Request payload
    data = {
        "prompt": "Photo of a woman wearing a Superman costume",
        "negative_prompt": "lowquality, badquality, sketches",
        "face_image": image_url_to_base64("https://segmind-sd-models.s3.amazonaws.com/display_images/insta-depth-ip.png"),  # Or use image_file_to_base64("IMAGE_PATH")
        "pose_image": image_url_to_base64("https://segmind-sd-models.s3.amazonaws.com/display_images/insta-dept-pose.png"),  # Or use image_file_to_base64("IMAGE_PATH")
        "num_inference_steps": 10,
        "guidance_scale": 3,
        "seed": 354849415,
        "base64": False
    }

    headers = {'x-api-key': api_key}

    response = requests.post(url, json=data, headers=headers)
    print(response.content)  # The response is the generated image
RESPONSE
image/jpeg
HTTP Response Codes
200 - OK: Image generated
401 - Unauthorized: User authentication failed
404 - Not Found: The requested URL does not exist
405 - Method Not Allowed: The requested HTTP method is not allowed
406 - Not Acceptable: Not enough credits
500 - Server Error: Server had some issue with processing
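
The status codes above can be turned into a small response handler. The following is a minimal sketch, not part of the official client: the names `STATUS_MESSAGES`, `describe_status`, and `save_image_or_raise` are hypothetical, and it assumes the request was sent with `"base64": False` so that a 200 response body is the raw image bytes.

```python
from pathlib import Path

# Status codes and meanings copied from the table above.
STATUS_MESSAGES = {
    200: "Image generated",
    401: "User authentication failed",
    404: "The requested URL does not exist",
    405: "The requested HTTP method is not allowed",
    406: "Not enough credits",
    500: "Server had some issue with processing",
}


def describe_status(status_code: int) -> str:
    """Return the documented meaning of a status code, or a generic fallback."""
    return STATUS_MESSAGES.get(status_code, f"Unexpected status code {status_code}")


def save_image_or_raise(status_code: int, body: bytes, out_path: str) -> Path:
    """Write the response body to out_path on 200; otherwise raise RuntimeError.

    Assumption: with "base64": False the body is the raw image (image/jpeg).
    """
    if status_code != 200:
        raise RuntimeError(f"{status_code}: {describe_status(status_code)}")
    path = Path(out_path)
    path.write_bytes(body)
    return path
```

With the request example above, this could be called as `save_image_or_raise(response.status_code, response.content, "output.jpg")`.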

Attributes


prompt: str *

Prompt to render


negative_prompt: str (default: None)

Prompts to exclude, e.g. 'bad anatomy, bad hands, missing fingers'.


face_image: image *

Face Image.


pose_image: image *

Pose Image.


num_inference_steps: int (default: 10). Affects pricing.

Number of denoising steps.

min: 4, max: 100


guidance_scale: float (default: 3)

Scale for classifier-free guidance.

min: 1, max: 15


seed: int (default: -1)

Seed for image generation.

min: -1, max: 999999999999999


base64: boolean (default: 1)

Base64 encoding of the output image.
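
The ranges listed above can be checked on the client before sending a request. This is only a sketch: `validate_payload` is a hypothetical helper, not part of the API, and it assumes the asterisk marks `prompt`, `face_image`, and `pose_image` as required.

```python
def validate_payload(payload: dict) -> list:
    """Return a list of problems found in a request payload (empty if none)."""
    errors = []

    # Assumption: attributes marked with * in the list above are required.
    for field in ("prompt", "face_image", "pose_image"):
        if not payload.get(field):
            errors.append(f"{field} is required")

    # Inclusive min/max bounds as documented in the attribute list.
    ranges = {
        "num_inference_steps": (4, 100),
        "guidance_scale": (1, 15),
        "seed": (-1, 999999999999999),
    }
    for field, (low, high) in ranges.items():
        if field in payload and not (low <= payload[field] <= high):
            errors.append(f"{field} must be between {low} and {high}")

    return errors
```

An empty list means the payload passes these checks; the server may still reject a request for other reasons.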

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
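
Reading that value might look like the sketch below. The helper name `remaining_credits` is hypothetical, and the code assumes the header value is a plain number, which the page does not state; any other format yields `None`.

```python
from typing import Mapping, Optional


def remaining_credits(headers: Mapping[str, str]) -> Optional[float]:
    """Return the x-remaining-credits value as a float, or None if unavailable."""
    # HTTP header names are case-insensitive, so compare lowercased names.
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                # Assumption: the value is numeric text such as "12.5".
                return float(value)
            except ValueError:
                return None
    return None
```

With the `requests` library, the headers of a response are available as `response.headers`, so the call would be `remaining_credits(response.headers)`.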


Pricing

Serverless Pricing

Buy credits that can be used anywhere on Segmind

$0.0038 per second
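
As a rough illustration of the per-second rate, the sketch below multiplies a run time by the serverless price. `estimate_cost` is a hypothetical helper; how run time is measured and rounded for billing is not stated here, so treat the result as an estimate only.

```python
# Serverless rate quoted above, in US dollars per second.
SERVERLESS_RATE_USD_PER_SECOND = 0.0038


def estimate_cost(seconds: float, rate: float = SERVERLESS_RATE_USD_PER_SECOND) -> float:
    """Estimate the cost in dollars of a run lasting `seconds` at `rate` per second."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return seconds * rate
```

For example, a run that takes 10 seconds would cost about 10 × 0.0038 = $0.038 at the serverless rate.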

Dedicated Cloud Pricing

For enterprise costs and dedicated endpoints

$0.0007 - $0.0031 per second
FEATURES

PixelFlow allows you to use all these features

Unlock the full potential of generative AI with Segmind. Create stunning visuals and innovative designs with total creative control. Take advantage of powerful development tools to automate processes and models, elevating your creative workflow.

Segmented Creation Workflow

Gain greater control by dividing the creative process into distinct steps, refining each phase.

Customized Output

Customize at various stages, from initial generation to final adjustments, ensuring tailored creative outputs.

Layering Different Models

Integrate and utilize multiple models simultaneously, producing complex and polished creative results.

Workflow APIs

Deploy Pixelflows as APIs quickly, without server setup, ensuring scalability and efficiency.

Insta Depth

Insta Depth generates new images that closely resemble a specific person while allowing for different poses and angles. It does this using a single input image of the person, a text description of the desired variations, and a reference pose image. The key aspect of Insta Depth is transferring the composition of the person’s image into different poses based on the person’s face image. This ensures that the generated image maintains the unique identity of the person while allowing for variations in pose. This model is an add-on to, and an improvement on, the InstantID model.

Key Components of Insta Depth

Insta Depth combines the InstantID and ControlNet Depth models.

  1. ID Embedding: This part analyzes the input image to capture the person's unique facial features, like eye color, nose shape, etc. It focuses on these defining characteristics (semantic information) rather than the exact location of each feature on the face (spatial information).

  2. Lightweight Adapted Module: This module acts like an adapter, allowing the system to use the reference image itself as a visual prompt for the image generation process. The reference image can be any pose image.

  3. IdentityNet: This is where the actual image generation happens. It takes the information from the ID embedding (facial characteristics) and combines it with the text prompt to create a new image.

  4. ControlNet Depth: This component enables composition transfer by understanding the depth of the input face image. It accurately preserves the person’s face in the new pose (reference image) in the output image.

How to use Insta Depth

  1. Input image: Provide a clear image of the person you want to generate variations for. This image is used to capture unique facial features and characteristics of the person.

  2. Pose Image: Upload a reference image that represents the pose you want the person in the input image to take. This could be any pose like standing, jumping, sitting, etc.

  3. Prompt: Provide a text prompt that describes the final output you envision. For example, if you want the person in the image to appear as if they’re wearing a Wonder Woman costume, your prompt could be “Photo of a woman wearing a Wonder Woman costume”.
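
The three steps above map onto the `face_image`, `pose_image`, and `prompt` attributes of the API. The sketch below shows one way to assemble the request body from raw image bytes; `encode_image` and `build_payload` are hypothetical helper names, and any extra keyword arguments are passed through unchanged as optional attributes.

```python
import base64


def encode_image(image_bytes: bytes) -> str:
    """Base64-encode raw image bytes as a UTF-8 string."""
    return base64.b64encode(image_bytes).decode("utf-8")


def build_payload(face_bytes: bytes, pose_bytes: bytes, prompt: str, **options) -> dict:
    """Build a request payload from the input image, pose image, and prompt."""
    payload = {
        "prompt": prompt,                        # step 3: text prompt
        "face_image": encode_image(face_bytes),  # step 1: input (face) image
        "pose_image": encode_image(pose_bytes),  # step 2: pose reference image
    }
    # Optional attributes such as seed or guidance_scale are copied as given.
    payload.update(options)
    return payload
```

The image bytes could come from `open(path, "rb").read()`, and the resulting dictionary can be sent as the JSON body of the POST request.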
