Chatterbox TTS Serverless API

Chatterbox transforms text into rich, natural speech with adjustable emotional expressiveness for diverse applications.

~18.08s
~$0.02
import requests

url = "https://api.segmind.com/v1/chatterbox-tts"
headers = {
    "x-api-key": "YOUR_API_KEY",  # replace with your own key
    "Content-Type": "application/json"
}

# Only "text" is required; the other values shown are the documented defaults.
data = {
    "text": "Welcome to Chatterbox TTS, where your text turns into captivating audio effortlessly.",
    "reference_audio": "https://segmind-resources.s3.amazonaws.com/input/ef2a2b5c-3e3a-4051-a437-20a72bf175de-sample_audio.mp3",
    "exaggeration": 0.5,
    "temperature": 0.8,
    "seed": 42,
    "cfg_weight": 0.5,
    "min_p": 0.05,
    "top_p": 1,
    "repetition_penalty": 1.2
}

response = requests.post(url, headers=headers, json=data)

if response.status_code == 200:
    # The endpoint returns audio, so save the raw bytes rather than parsing JSON.
    # The filename and extension are assumptions; check the Content-Type header.
    with open("output.wav", "wb") as f:
        f.write(response.content)
    print(f"Saved audio ({response.headers.get('Content-Type')})")
else:
    print(f"Error: {response.status_code}")
    print(response.text)

API Endpoint

POST https://api.segmind.com/v1/chatterbox-tts

Parameters

text (required)
string

The input text is synthesized into speech. Use longer text for detailed narration, shorter for concise messages.

Default: "Welcome to Chatterbox TTS, where your text turns into captivating audio effortlessly."

cfg_weight (optional)
number

Balances creativity and adherence to text. Use lower for strict interpretation, higher for flexibility.

Default: 0.5
Range: 0 - 2

exaggeration (optional)
number

Adjusts speech expressiveness. Use lower values for neutrality, higher for dramatic effect.

Default: 0.5
Range: 0 - 2

min_p (optional)
number

Sets a minimum probability threshold for sampling. Candidates below the threshold are discarded, which filters out unlikely output.

Default: 0.05
Range: 0 - 1

reference_audio (optional)
string (uri)

URL of a sample audio clip whose voice style the output should match.

Default: "https://segmind-resources.s3.amazonaws.com/input/ef2a2b5c-3e3a-4051-a437-20a72bf175de-sample_audio.mp3"

repetition_penalty (optional)
number

Penalizes repeated words in speech. Higher values reduce redundancy.

Default: 1.2
Range: 1 - 2

seed (optional)
integer

Seeds the random generator so that the same input produces the same output. Change it to get a different generation.

Default: 42

temperature (optional)
number

Controls speech variation. Use lower for consistent tone, higher for diverse expressions.

Default: 0.8
Range: 0 - 2

top_p (optional)
number

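Nucleus sampling threshold: sampling is restricted to the smallest set of candidates whose cumulative probability reaches top_p. Use lower values for focused output, higher for more diversity.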

Default: 1
Range: 0 - 1
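
Because every numeric parameter has a documented range, a request can be checked locally before it is sent, which avoids spending a round trip on a 400 response. The sketch below is one possible client-side check, not part of the API: `PARAM_RANGES` and `validate_payload` are hypothetical names, the ranges are copied from the list above, and treating the bounds as inclusive and rejecting blank text are assumptions.

```python
# Ranges copied from the parameter list; bounds are assumed to be inclusive.
PARAM_RANGES = {
    "cfg_weight": (0.0, 2.0),
    "exaggeration": (0.0, 2.0),
    "min_p": (0.0, 1.0),
    "repetition_penalty": (1.0, 2.0),
    "temperature": (0.0, 2.0),
    "top_p": (0.0, 1.0),
}


def validate_payload(payload):
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("text is required and must be a non-empty string")
    for name, (low, high) in PARAM_RANGES.items():
        if name not in payload:
            continue  # optional parameter; the API applies its default
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name} must be a number")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} - {high}")
    if "seed" in payload:
        seed = payload["seed"]
        if isinstance(seed, bool) or not isinstance(seed, int):
            problems.append("seed must be an integer")
    return problems
```

Calling `validate_payload(data)` before `requests.post` and aborting when the list is non-empty keeps malformed requests from reaching the endpoint.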

Response Type

Returns: Audio
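
Since the response body is audio rather than JSON, a client would typically write the raw bytes to a file. The documentation does not say which audio format is returned, so the following sketch picks a file extension from the Content-Type header. `AUDIO_EXTENSIONS`, `output_filename`, and `save_audio` are hypothetical helpers, and the mapping of media types is an assumption.

```python
# Hypothetical mapping from Content-Type to file extension; the actual
# format returned by the endpoint is not stated in the documentation.
AUDIO_EXTENSIONS = {
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
    "audio/mpeg": ".mp3",
    "audio/ogg": ".ogg",
    "audio/flac": ".flac",
}


def output_filename(content_type, stem="speech"):
    """Build a filename from a Content-Type value, falling back to .bin."""
    # Drop parameters such as "; charset=binary" and normalize case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    return stem + AUDIO_EXTENSIONS.get(media_type, ".bin")


def save_audio(response, stem="speech"):
    """Write the body of a successful requests.Response to disk; return the path."""
    path = output_filename(response.headers.get("Content-Type"), stem)
    with open(path, "wb") as f:
        f.write(response.content)
    return path
```

With a 200 response in hand, `save_audio(response)` writes the file and returns its name. An unrecognized media type produces a `.bin` file instead of a guessed extension.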

Common Error Codes

The API returns standard HTTP status codes. Detailed error messages are provided in the response body.

Code  Status                Description
400   Bad Request           Invalid parameters or request format
401   Unauthorized          Missing or invalid API key
403   Forbidden             Insufficient permissions
404   Not Found             Model or endpoint not found
406   Insufficient Credits  Not enough credits to process request
429   Rate Limited          Too many requests
500   Server Error          Internal server error
502   Bad Gateway           Service temporarily unavailable
504   Timeout               Request timed out
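
Some of these codes (429, 500, 502, 504) usually indicate a transient condition, while the others (400, 401, 403, 404, 406) will not succeed on retry without changing the request or the account. The sketch below shows one way a client might retry only the transient ones with exponential backoff. Treating those four codes as retryable, the delay schedule, and the names `RETRYABLE_STATUSES`, `backoff_delays`, and `call_with_retry` are all assumptions, not documented API behavior.

```python
import time

# Assumption: these statuses are transient and safe to retry.
RETRYABLE_STATUSES = {429, 500, 502, 504}


def backoff_delays(max_retries=3, base_delay=1.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ..."""
    return [base_delay * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retry(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    `send` is any zero-argument callable returning an object with a
    `status_code` attribute, for example:
        lambda: requests.post(url, headers=headers, json=data)
    """
    response = send()
    for delay in backoff_delays(max_retries, base_delay):
        if response.status_code not in RETRYABLE_STATUSES:
            break  # success or a permanent error; retrying would not help
        sleep(delay)
        response = send()
    return response
```

Passing the request as a callable keeps the retry logic independent of `requests`, and the injectable `sleep` makes the helper easy to exercise without waiting.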