Sync.so Lipsync 2 Pro
Lipsync-2-Pro seamlessly synchronizes lips in videos for instant, high-quality multilingual content creation.
API
The API can be called from the programming language of your choice. The example below uses Python.
import base64

import requests

# Use this function to convert an image file from the filesystem to base64
def image_file_to_base64(image_path):
    with open(image_path, 'rb') as f:
        image_data = f.read()
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to fetch an image from a URL and convert it to base64
def image_url_to_base64(image_url):
    response = requests.get(image_url)
    image_data = response.content
    return base64.b64encode(image_data).decode('utf-8')

# Use this function to convert a list of image URLs to base64
def image_urls_to_base64(image_urls):
    return [image_url_to_base64(url) for url in image_urls]

api_key = "YOUR_API_KEY"
url = "https://api.segmind.com/v1/sync.so-lipsync-2-pro"

# Request payload (this model takes URLs, so the base64 helpers above are not needed here)
data = {
    "video_url": "https://segmind-resources.s3.amazonaws.com/output/a741b039-226c-43c2-9bd0-c301f058d314-UntitledVideo-ezgif.com-crop-video.mp4",
    "audio_url": "https://segmind-resources.s3.amazonaws.com/output/80e96316-7e75-4733-b80c-049a0a6787cb-c9f17960-96b5-4119-8b7e-4ae0c9f21e2f-audio-AudioTrimmer.com-AudioTrimmer.com.mp3",
    "sync_mode": "loop",
    "temperature": 0.5,
    "auto_active_speaker_detection": True,
    "occlusion_detection_enabled": False
}

headers = {'x-api-key': api_key}
response = requests.post(url, json=data, headers=headers)
print(response.content)  # Raw response body: the lip-synced output, not an image
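Printing the raw body is rarely useful for video output, so a small helper that writes it to disk may be more practical. The sketch below is not part of the official example: `save_output` and the default file name `output.mp4` are hypothetical. It also assumes the endpoint returns either the media bytes directly or a JSON document, which the page does not confirm. It takes the `response` object returned by `requests.post` in the example above.

```python
from pathlib import Path


def save_output(response, output_path="output.mp4"):
    """Write a binary response body to disk, or return parsed JSON.

    Assumption: the API answers with either raw media bytes or a JSON
    document (for example an error or status message).
    """
    # Raise an exception for 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()

    content_type = response.headers.get("Content-Type", "").lower()
    if content_type.startswith("application/json"):
        # Hand JSON back to the caller rather than writing it as a video file.
        return response.json()

    path = Path(output_path)
    path.write_bytes(response.content)
    return path
```

Calling `save_output(response)` after the request returns the path of the written file, or the parsed JSON if the server replied with JSON instead of media.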
Attributes
video_url: Provides the video URL for synchronization. Use high-quality links for best results.
audio_url: Provides the audio URL for synchronization. Use clear audio files for precision.
sync_mode: Manages video-audio mismatch. Use 'loop' for repetitive audio, 'cut_off' for trimming.
Allowed values: loop, bounce, cut_off, silence, remap
temperature: Controls expressiveness of the lip sync. Use 0.3 for calm, 0.8 for dynamic expressions.
min: 0, max: 1
auto_active_speaker_detection: Detects and syncs the active speaker automatically. Enable for multi-speaker scenarios.
occlusion_detection_enabled: Detects objects covering the mouth, which slows generation. Disable for faster processing.
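Checking a payload locally before sending it can save a wasted API call. The following is a minimal sketch, not an official client: `validate_payload` is a hypothetical helper, and the set of sync modes is taken from the Sync Mode list in the usage guide further down this page, so the API itself may accept a different set.

```python
# Assumption: allowed modes as listed in the usage guide on this page.
ALLOWED_SYNC_MODES = {"loop", "bounce", "cut_off", "silence", "remap"}
BOOLEAN_FIELDS = ("auto_active_speaker_detection", "occlusion_detection_enabled")


def validate_payload(payload):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []

    # Both media inputs are passed as URLs.
    for key in ("video_url", "audio_url"):
        value = payload.get(key)
        if not isinstance(value, str) or not value.startswith(("http://", "https://")):
            errors.append(f"{key} must be an http(s) URL")

    if "sync_mode" in payload and payload["sync_mode"] not in ALLOWED_SYNC_MODES:
        errors.append(f"sync_mode must be one of {sorted(ALLOWED_SYNC_MODES)}")

    if "temperature" in payload:
        t = payload["temperature"]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(t, (int, float)) and not isinstance(t, bool)
        if not is_number or not 0 <= t <= 1:
            errors.append("temperature must be a number between 0 and 1")

    for key in BOOLEAN_FIELDS:
        if key in payload and not isinstance(payload[key], bool):
            errors.append(f"{key} must be true or false")

    return errors
```

A caller could run `validate_payload(data)` on the request payload and only post it when the returned list is empty.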
To track your credit usage, inspect the response headers of each API call. The x-remaining-credits header reports the number of credits remaining in your account. Monitor this value to avoid disruptions in your API usage.
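Reading that header might look like the sketch below. The helper names `remaining_credits` and `credits_low` and the threshold of 10 are hypothetical, and the code assumes the header value is a plain number, which the page does not state.

```python
def remaining_credits(headers):
    """Return the x-remaining-credits value as a float, or None if absent.

    The lookup is case-insensitive so it works with a plain dict as well as
    with the headers mapping of a requests response.
    """
    for name, value in headers.items():
        if name.lower() == "x-remaining-credits":
            try:
                return float(value)
            except (TypeError, ValueError):
                # Assumption violated: the value was not numeric.
                return None
    return None


def credits_low(headers, threshold=10.0):
    """True when the remaining credits are known and below the threshold."""
    credits = remaining_credits(headers)
    return credits is not None and credits < threshold
```

With the request example above, `remaining_credits(response.headers)` would give the balance after that call.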
Resources to get you started
Everything you need to know to get the most out of Sync.so Lipsync 2 Pro
Lipsync-2-Pro: Effective Usage Guide
Lipsync-2-Pro is a cutting-edge AI video lip synchronization model by Sync Labs. It delivers professional, diffusion-based 4K output with no per-speaker training. Whether you’re localizing dialogue, updating corporate training videos, or animating virtual characters, this guide helps you choose the right parameters and workflows for consistent, natural results.
1. Getting Started
• Prepare high-quality inputs:
  – video_url: Use a clear, high-resolution link (e.g., MP4 or MOV up to 4K).
  – audio_url: Provide an isolated, noise-free dialogue track (e.g., MP3/WAV).
• Set the base parameters:
  – sync_mode: "loop" (default)
  – temperature: 0.5
  – auto_active_speaker_detection: true
  – occlusion_detection_enabled: false
2. Parameter Recommendations
• Sync Mode
– loop: Repetitive audio segments (e.g., theme recaps)
– bounce: Continuous speech with slight overlap
– cut_off: Precise cuts for ADR/dubbing
– silence: Remove lip motion during pauses
– remap: Complex re-timing across scenes
• Temperature (0–1)
– 0.3: Subtle, professional delivery (corporate, e-learning)
– 0.5: Balanced, natural sync (film/TV dubbing)
– 0.8: Expressive, dynamic dialogue (gaming, animation)
• Auto Active Speaker Detection
– true: Scenes with multiple speakers or panel discussions
– false: Single actor or narrator footage
• Occlusion Detection
– true: When hands/objects frequently cover the mouth (longer processing)
– false: Clean, unobstructed facial footage (faster output)
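The temperature, speaker-detection and occlusion recommendations above can be folded into one small function. This is only an illustration of the guide's rules of thumb: `recommend_parameters` and the style labels "subtle", "balanced" and "expressive" are hypothetical names, not part of the API.

```python
# Temperature values as recommended in the guide for each delivery style.
STYLE_TEMPERATURES = {
    "subtle": 0.3,      # corporate, e-learning
    "balanced": 0.5,    # film/TV dubbing
    "expressive": 0.8,  # gaming, animation
}


def recommend_parameters(style="balanced", multiple_speakers=False,
                         mouth_often_covered=False):
    """Map scene traits to the parameter values suggested by the guide."""
    if style not in STYLE_TEMPERATURES:
        raise ValueError(f"style must be one of {sorted(STYLE_TEMPERATURES)}")
    return {
        "temperature": STYLE_TEMPERATURES[style],
        # Enable for multi-speaker scenes or panel discussions.
        "auto_active_speaker_detection": multiple_speakers,
        # Enable when hands or objects cover the mouth (longer processing).
        "occlusion_detection_enabled": mouth_often_covered,
    }
```

The sync mode is left out on purpose, since choosing it depends on how the audio and video durations relate rather than on the scene content.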
3. Use Case Examples
• Film & TV Dubbing
  – sync_mode: cut_off
  – temperature: 0.5
  – auto_active_speaker_detection: true
• Podcast Video Localization
  – sync_mode: loop
  – temperature: 0.4
  – occlusion_detection_enabled: false
• Gaming Cutscenes
  – sync_mode: bounce
  – temperature: 0.7
• Corporate Training Updates
  – sync_mode: cut_off
  – temperature: 0.3
• Virtual Character Animation
  – sync_mode: remap
  – temperature: 0.8
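These use cases lend themselves to a preset table layered over the base parameters from Getting Started. The sketch below assumes nothing about the API beyond the parameter names already shown on this page. The preset keys such as `film_tv_dubbing` and the helper `build_payload` are hypothetical.

```python
# Base parameters from the Getting Started section.
DEFAULTS = {
    "sync_mode": "loop",
    "temperature": 0.5,
    "auto_active_speaker_detection": True,
    "occlusion_detection_enabled": False,
}

# Values copied from the use case examples; key names are illustrative.
PRESETS = {
    "film_tv_dubbing": {"sync_mode": "cut_off", "temperature": 0.5,
                        "auto_active_speaker_detection": True},
    "podcast_localization": {"sync_mode": "loop", "temperature": 0.4,
                             "occlusion_detection_enabled": False},
    "gaming_cutscenes": {"sync_mode": "bounce", "temperature": 0.7},
    "corporate_training": {"sync_mode": "cut_off", "temperature": 0.3},
    "virtual_character": {"sync_mode": "remap", "temperature": 0.8},
}


def build_payload(video_url, audio_url, use_case, **overrides):
    """Merge defaults, a use-case preset and explicit overrides, in that order."""
    if use_case not in PRESETS:
        raise KeyError(f"unknown use case: {use_case!r}")
    payload = {**DEFAULTS, **PRESETS[use_case], **overrides}
    payload["video_url"] = video_url
    payload["audio_url"] = audio_url
    return payload
```

For example, `build_payload(video, audio, "gaming_cutscenes", temperature=0.75)` keeps the "bounce" mode from the preset while overriding only the temperature.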
4. Tips for Optimal Output
• Always start with clean, synced audio and high-resolution video.
• Preview short clips to fine-tune temperature and sync_mode.
• Enable active speaker detection for multi-speaker or panel footage.
• Disable occlusion detection to accelerate turnaround when precision isn’t critical.
• Experiment with “bounce” for natural transitions in dynamic scenes.
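The tip about previewing short clips can be turned into a small parameter sweep: one payload per candidate temperature, each submitted as its own request against a short test clip. The helper name `temperature_sweep` and the example URLs are hypothetical. Each payload sent presumably consumes credits, so the list is best kept short.

```python
def temperature_sweep(base_payload, temperatures=(0.3, 0.5, 0.8)):
    """Return one payload copy per temperature, leaving the base untouched."""
    payloads = []
    for t in temperatures:
        if not 0 <= t <= 1:
            raise ValueError(f"temperature {t} is outside the 0-1 range")
        # Dict unpacking makes a shallow copy, so base_payload is not modified.
        payloads.append({**base_payload, "temperature": t})
    return payloads


# Placeholder URLs for a short preview clip; replace with real links.
preview = {
    "video_url": "https://example.com/short-clip.mp4",
    "audio_url": "https://example.com/short-clip.mp3",
    "sync_mode": "cut_off",
}
candidates = temperature_sweep(preview)
```

Each entry in `candidates` can then be posted in turn and the outputs compared side by side before committing to a full-length render.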
Following these guidelines and adjusting parameters to your project’s style and complexity should give consistent, high-fidelity lip synchronization with Lipsync-2-Pro.
Other Popular Models
Discover other models you might be interested in.
fooocus
Fooocus enables high-quality image generation effortlessly, combining the best of Stable Diffusion and Midjourney.

sdxl-inpaint
This model can generate photo-realistic images from any text input, and can also inpaint existing pictures using a mask.

codeformer
CodeFormer is a robust face restoration algorithm for old photos or AI-generated faces.

esrgan
ESRGAN is an image super-resolution (upscaler) model that increases image resolution while preserving the exact composition of the original source. It improves detail without altering the image content.
