ESRGAN

ESRGAN is an image super-resolution (upscaler) model that upscales images to a higher resolution while preserving the exact composition of the original source. It improves detail without altering the image content.

API

The API can be called from the programming language of your choice. The example below uses Python.

POST

import requests
import base64

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()  # Fail early if the image could not be downloaded
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/esrgan"

# Request payload
data = {
  "image": image_url_to_base64("https://www.segmind.com/butterfly.png"),  # Or use image_file_to_base64("IMAGE_PATH")
  "scale": 2
}

headers = {'x-api-key': api_key}

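response = requests.post(url, json=data, headers=headers, timeout=120)
response.raise_for_status()  # Raise an exception on 4xx/5xx responses

# The response body is the generated image (image/jpeg); save it to disk
with open("output.jpg", "wb") as f:
    f.write(response.content)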

RESPONSE

image/jpeg

HTTP Response Codes

200 - OK: Image generated

401 - Unauthorized: User authentication failed

404 - Not Found: The requested URL does not exist

405 - Method Not Allowed: The requested HTTP method is not allowed

406 - Not Acceptable: Not enough credits

500 - Server Error: Server had some issue with processing
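
The status codes listed above can be turned into readable client-side errors. The following is a minimal sketch rather than part of any official client: `STATUS_MESSAGES`, `describe_status`, and `check_response` are hypothetical names, and the messages are copied from the list above.

```python
# Hypothetical lookup table built from the documented status codes.
STATUS_MESSAGES = {
    200: "OK: image generated",
    401: "Unauthorized: user authentication failed",
    404: "Not Found: the requested URL does not exist",
    405: "Method Not Allowed: the requested HTTP method is not allowed",
    406: "Not Acceptable: not enough credits",
    500: "Server Error: server had some issue with processing",
}


def describe_status(status_code):
    """Return a readable description for a documented status code."""
    return STATUS_MESSAGES.get(
        status_code, f"Unexpected status code: {status_code}"
    )


def check_response(status_code):
    """Raise RuntimeError for any status other than 200."""
    if status_code != 200:
        raise RuntimeError(describe_status(status_code))
```

With the request from the example above, `check_response(response.status_code)` could be called before reading `response.content`.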

Attributes

image (type: image, required). Affects pricing.

Input image.

scale (type: str, default: 2). Affects pricing.

Scale of the output image
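
As a rough illustration of the scale attribute, the sketch below estimates the output dimensions for a given input size. It assumes that scale multiplies both width and height, which this page does not state explicitly. `upscaled_size` is a hypothetical helper, not part of the API.

```python
def upscaled_size(width, height, scale="2"):
    """Estimate output dimensions, assuming scale applies to both axes."""
    factor = int(scale)  # scale is documented as a string, e.g. "2"
    if factor < 1:
        raise ValueError("scale must be a positive integer")
    return width * factor, height * factor
```

Under that assumption, `upscaled_size(1280, 720, "2")` returns `(2560, 1440)`.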

To keep track of your credit usage, inspect the response headers of each API call. The x-remaining-credits header indicates the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
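
One way to read the credit balance described above is sketched here. The header name comes from this page, but the format of its value is an assumption (a plain number), and `remaining_credits` is a hypothetical helper name.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits value as a float, or None if absent.

    Assumes the header value is numeric. Header names are compared in
    lowercase because HTTP header names are case-insensitive.
    """
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                return None  # Present, but not in the assumed numeric format
    return None
```

With the request from the example above, `remaining_credits(response.headers)` could be checked after each call and compared against a threshold of your choosing.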

ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks

ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) is a model designed to reconstruct high-resolution (HR) images or sequences from lower-resolution (LR) observations, for example upscaling a 720p image to 1080p. It uses deep convolutional neural networks trained adversarially: a generator network learns to create realistic HR images, while a discriminator network learns to differentiate between real and generated images. Feedback from the discriminator steadily improves the generator's ability to create high-quality images.

The ESRGAN architecture is based on SRResNet, with residual-in-residual dense blocks as its basic building unit. It is trained with a mixture of content, perceptual, and adversarial losses. The content and perceptual losses keep the upscaled image faithful to the source, while the adversarial loss pushes the network's output toward the natural image manifold. The adversarial loss comes from a discriminator network that is trained to differentiate between super-resolved images and original photo-realistic images.
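
As a rough illustration of how the three loss terms described above combine, the sketch below computes a weighted sum. The function name is hypothetical, the inputs are plain numbers standing in for losses computed elsewhere, and the default weights are an assumption taken from the values reported in the ESRGAN paper rather than anything stated on this page.

```python
def generator_loss(perceptual, adversarial, pixel_l1, lam=5e-3, eta=1e-2):
    """Weighted sum of the generator's loss terms.

    perceptual  : feature-space distance between output and ground truth
    adversarial : loss derived from the discriminator's judgement
    pixel_l1    : pixel-wise L1 distance between output and ground truth
    lam, eta    : weights balancing the terms (assumed defaults)
    """
    return perceptual + lam * adversarial + eta * pixel_l1
```

The small weights on the adversarial and pixel-level terms mean the perceptual term dominates, with the other two acting as corrections.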

One of the key advantages of ESRGAN is its ability to generate realistic, high-quality images from lower-resolution inputs. This makes it a good fit for applications where image quality is paramount.

ESRGAN use cases

Security Camera Image Enhancement: ESRGAN can be used to enhance low-quality images from security cameras, providing clearer images for identification or analysis.
Medical Imaging: The model can improve the resolution of medical images, aiding in more accurate diagnoses and treatments.
Model Output Upscaling: ESRGAN can upscale the outputs of Stable Diffusion or other models to a higher resolution and fidelity.
Printing: The model can be used to create high-resolution images or documents before printing, ensuring the highest possible print quality.
Digital Restoration: ESRGAN can be used in the digital restoration of old or damaged photos, enhancing the image quality and bringing new life to old memories.

ESRGAN license

The ESRGAN model is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License (CC BY-NC-ND 4.0). This license allows the material to be copied and redistributed for non-commercial purposes only, provided that appropriate credit is given. It does not permit the sharing of adapted material.

The license also does not permit the use of the name of the license holder or the names of its contributors to endorse or promote products derived from this software without specific prior written permission. Furthermore, the license is irrevocable, meaning once granted, it cannot be taken back.
