Chatterbox TTS Serverless API

Chatterbox transforms text into rich, natural speech with adjustable emotional expressiveness for diverse applications.

~18.08s
POST /v2/chatterbox-tts · submit + poll
# pip install "segmind>=1.1.0"
# export SEGMIND_API_KEY="YOUR_API_KEY"
import segmind

# Async (v2): submit to the queue and block until COMPLETED.
# run() returns the final result dict (600s deadline, 1.0s poll by default).
result = segmind.run(
    "chatterbox-tts",
    text="Welcome to Chatterbox TTS, where your text turns into captivating audio effortlessly.",
    reference_audio="https://segmind-resources.s3.amazonaws.com/input/ef2a2b5c-3e3a-4051-a437-20a72bf175de-sample_audio.mp3",
    exaggeration=0.5,
    temperature=0.8,
    seed=42,
    cfg_weight=0.5,
    min_p=0.05,
    top_p=1,
    repetition_penalty=1.2,
)
print(result["status"])                      # COMPLETED
print(result.get("output"))                  # model output (e.g. media URL)
print(result["metrics"]["inference_time"])   # server compute seconds

# --- Or submit + poll manually (track request_id, control the cadence) ---
from segmind import SegmindClient, InferenceFailed, InferenceTimeout

client = SegmindClient()                      # reads SEGMIND_API_KEY
payload = {
    "text": "Welcome to Chatterbox TTS, where your text turns into captivating audio effortlessly.",
    "reference_audio": "https://segmind-resources.s3.amazonaws.com/input/ef2a2b5c-3e3a-4051-a437-20a72bf175de-sample_audio.mp3",
    "exaggeration": 0.5,
    "temperature": 0.8,
    "seed": 42,
    "cfg_weight": 0.5,
    "min_p": 0.05,
    "top_p": 1,
    "repetition_penalty": 1.2,
}
job = client.submit_async("chatterbox-tts", **payload)
print(job.request_id)                         # available immediately
try:
    result = job.wait(timeout=600, interval=1.0)
except InferenceTimeout as e:
    print("still running:", e.request_id)
except InferenceFailed as e:
    print("failed:", e.detail)

API Endpoint

POST https://api.segmind.com/v1/chatterbox-tts
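A direct call to this endpoint, without the SDK, might look like the sketch below. It uses only the Python standard library. Several things here are assumptions rather than facts stated on this page: that the API key is sent in an `x-api-key` header, that parameters are sent as a JSON body, and that the response body is the raw audio. `build_request` and `synthesize` are hypothetical helper names.

```python
import json
import os
import urllib.request

# Endpoint as listed above.
ENDPOINT = "https://api.segmind.com/v1/chatterbox-tts"


def build_request(api_key: str, text: str, **params) -> urllib.request.Request:
    """Build (but do not send) a POST request for the sync endpoint.

    Assumption: the key travels in an `x-api-key` header and the
    parameters are sent as a JSON object.
    """
    body = json.dumps({"text": text, **params}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, out_path: str = "speech.out", **params) -> str:
    """Send the request and write the returned bytes to out_path.

    Assumption: a successful response body is the audio itself.
    """
    req = build_request(os.environ["SEGMIND_API_KEY"], text, **params)
    with urllib.request.urlopen(req, timeout=120) as resp:
        audio = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(audio)
    return out_path
```

Calling `synthesize("Hello there", exaggeration=0.7)` would then write the returned bytes to `speech.out`. Keeping request construction separate from sending makes the payload easy to inspect before spending credits.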

Parameters

text (required)
string

The input text is synthesized into speech. Use longer text for detailed narration, shorter for concise messages.

Default: "Welcome to Chatterbox TTS, where your text turns into captivating audio effortlessly."

cfg_weight (optional)
number

Balances creativity and adherence to text. Use lower for strict interpretation, higher for flexibility.

Default: 0.5
Range: 0 - 2

exaggeration (optional)
number

Adjusts speech expressiveness. Use lower values for neutrality, higher for dramatic effect.

Default: 0.5
Range: 0 - 2

min_p (optional)
number

Ensures minimum probability for content inclusion. Useful for removing unlikely phrases.

Default: 0.05
Range: 0 - 1

reference_audio (optional)
string (uri)

Provides a sample audio for voice style matching.

Default: "https://segmind-resources.s3.amazonaws.com/input/ef2a2b5c-3e3a-4051-a437-20a72bf175de-sample_audio.mp3"

repetition_penalty (optional)
number

Penalizes repeated words in speech. Higher values reduce redundancy.

Default: 1.2
Range: 1 - 2

seed (optional)
integer

Ensures consistent output with the same input. Adjust for diverse generations.

Default: 42

temperature (optional)
number

Controls speech variation. Use lower for consistent tone, higher for diverse expressions.

Default: 0.8
Range: 0 - 2

top_p (optional)
number

Determines output randomness. Lower for focused content, higher for creative diversity.

Default: 1
Range: 0 - 1
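The defaults and ranges above can be checked on the client before a request is sent. The following is a minimal sketch: `build_payload` is a hypothetical helper, and it assumes the documented ranges are inclusive at both ends. It leaves `reference_audio` out unless the caller supplies it, on the assumption that the server applies its own default.

```python
from typing import Any, Dict

# Defaults and ranges copied from the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "cfg_weight": 0.5,
    "exaggeration": 0.5,
    "min_p": 0.05,
    "repetition_penalty": 1.2,
    "seed": 42,
    "temperature": 0.8,
    "top_p": 1,
}
RANGES = {
    "cfg_weight": (0, 2),
    "exaggeration": (0, 2),
    "min_p": (0, 1),
    "repetition_penalty": (1, 2),
    "temperature": (0, 2),
    "top_p": (0, 1),
}
# Optional parameters that have no documented numeric range.
OPTIONAL_NO_RANGE = {"seed", "reference_audio"}


def build_payload(text: str, **overrides: Any) -> Dict[str, Any]:
    """Merge overrides onto the documented defaults and range-check them.

    Assumption: the documented ranges are inclusive at both ends.
    """
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text is required and must be a non-empty string")
    unknown = set(overrides) - set(RANGES) - OPTIONAL_NO_RANGE
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"text": text, **DEFAULTS, **overrides}
    for name, (low, high) in RANGES.items():
        if not low <= payload[name] <= high:
            raise ValueError(f"{name}={payload[name]} is outside {low} - {high}")
    seed = payload["seed"]
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    return payload
```

For example, `build_payload("Hello", exaggeration=1.2)` returns a dict ready to submit, while `build_payload("Hello", temperature=2.5)` raises `ValueError` locally rather than producing an HTTP 400 from the server.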

Response Type

Returns: Audio

Asynchronous requests (v2)

Use Async for video, long-running (>~60s), or high-concurrency workloads; Sync is simplest for fast image & LLM calls. Async submits a request and you poll it to completion.

  1. Submit: POST /v2/chatterbox-tts
     Returns request_id, status_url, response_url.

  2. Poll: GET /v2/requests/{id}/status
     Repeat until COMPLETED or FAILED.

  3. Result: GET /v2/requests/{id}
     Returns the final response body.
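The poll and result steps above can be sketched as a small loop that is independent of any HTTP library. The caller passes in a `get_json` function that performs an authenticated GET and returns the parsed JSON body. The host name and the presence of a `status` field in the status response are assumptions, and `poll_until_done` is a hypothetical helper.

```python
import time
from typing import Callable, Dict

# Assumption: the v2 routes live on the same host as the sync endpoint.
BASE = "https://api.segmind.com"


def status_url(request_id: str) -> str:
    return f"{BASE}/v2/requests/{request_id}/status"


def result_url(request_id: str) -> str:
    return f"{BASE}/v2/requests/{request_id}"


def poll_until_done(
    request_id: str,
    get_json: Callable[[str], Dict],
    interval: float = 1.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict:
    """Poll the status route until a terminal state, then fetch the result.

    Assumption: the status body is JSON with a top-level "status" field.
    """
    deadline = clock() + timeout
    while True:
        status = get_json(status_url(request_id)).get("status")
        if status in ("COMPLETED", "FAILED"):
            # Step 3: the result route carries the final body (or error detail).
            return get_json(result_url(request_id))
        if clock() >= deadline:
            raise TimeoutError(f"request {request_id} still {status} after {timeout}s")
        sleep(interval)
```

Injecting `get_json`, `sleep`, and `clock` keeps the loop testable offline. In real use `get_json` would wrap whatever HTTP client the application already uses.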

Status states

QUEUED: Accepted, waiting for a worker
PROCESSING: Running on a worker
COMPLETED: Done, result body is ready
FAILED: Errored (incl. content/RAI blocks)
  • A FAILED request is served as HTTP 422; the body still carries the error detail.
  • An unknown or expired request_id returns HTTP 404.
  • Results are retained for 1 hour, then expire.
  • Content / RAI blocks surface as FAILED, not a separate state.
  • Track completion by polling the status endpoint.
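The states and HTTP codes listed above can be folded into one classification step on the client. This is a sketch, not the SDK's behavior: `Outcome` and `classify` are hypothetical names, and the `status` field in the response body is an assumption.

```python
from enum import Enum
from typing import Dict


class Outcome(Enum):
    PENDING = "pending"   # QUEUED or PROCESSING: keep polling
    DONE = "done"         # COMPLETED: result body is ready
    FAILED = "failed"     # FAILED, including content/RAI blocks (HTTP 422)
    GONE = "gone"         # unknown or expired request_id (HTTP 404)


def classify(http_status: int, body: Dict) -> Outcome:
    """Map an HTTP status code and parsed body onto a client-side outcome."""
    if http_status == 404:
        return Outcome.GONE
    status = body.get("status")  # assumption: state is in a "status" field
    if http_status == 422 or status == "FAILED":
        return Outcome.FAILED
    if status == "COMPLETED":
        return Outcome.DONE
    if status in ("QUEUED", "PROCESSING"):
        return Outcome.PENDING
    raise ValueError(f"unexpected response: HTTP {http_status}, status={status!r}")
```

Because results expire after one hour, a caller that sees `Outcome.GONE` for a request it submitted earlier should treat the result as lost and resubmit rather than keep polling.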

Common Error Codes

The API returns standard HTTP status codes. Detailed error messages are provided in the response body.

400 Bad Request: Invalid parameters or request format
401 Unauthorized: Missing or invalid API key
403 Forbidden: Insufficient permissions
404 Not Found: Model or endpoint not found
406 Insufficient Credits: Not enough credits to process request
429 Rate Limited: Too many requests
500 Server Error: Internal server error
502 Bad Gateway: Service temporarily unavailable
504 Timeout: Request timed out
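A client can use the codes above to decide whether a failed call is worth retrying. The sketch below copies the table into a lookup. The choice of retryable codes (429, 500, 502, 504) and the exponential backoff schedule are assumptions about sensible client behavior, not documented API guidance, and all helper names are hypothetical.

```python
from typing import List

# Status codes and meanings copied from the list above.
ERROR_CODES = {
    400: ("Bad Request", "Invalid parameters or request format"),
    401: ("Unauthorized", "Missing or invalid API key"),
    403: ("Forbidden", "Insufficient permissions"),
    404: ("Not Found", "Model or endpoint not found"),
    406: ("Insufficient Credits", "Not enough credits to process request"),
    429: ("Rate Limited", "Too many requests"),
    500: ("Server Error", "Internal server error"),
    502: ("Bad Gateway", "Service temporarily unavailable"),
    504: ("Timeout", "Request timed out"),
}

# Assumption: only rate limits and server-side failures are transient.
RETRYABLE = frozenset({429, 500, 502, 504})


def describe(code: int) -> str:
    """Return a one-line description for a status code."""
    name, detail = ERROR_CODES.get(code, ("Unknown", "Status code not listed"))
    return f"{code} {name}: {detail}"


def should_retry(code: int) -> bool:
    """True when resending the same request could plausibly succeed."""
    return code in RETRYABLE


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> List[float]:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]
```

Codes such as 400, 401, and 406 point to a problem with the request or the account, so resending the same request will not help. Those are best surfaced to the caller immediately.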