Chatterbox API - AI Text to Speech APIs
by Resemble-ai
With the Chatterbox API, developers can convert text into lifelike audio with customizable voice characteristics. The API supports multiple languages and speaking styles, which suits it to voiceovers, audiobooks, virtual assistants, and accessibility applications that need human-quality speech output.

Chatterbox v1 Text to Speech API Documentation
Base URL: https://gateway.pixazo.ai/chatterbox-text-to-speech/v1
Authentication
All requests require an API key passed via header.
| Header | Type | Required | Description |
|---|---|---|---|
| Ocp-Apim-Subscription-Key | string | Yes | Your API subscription key |
Chatterbox Text to Speech generate request - Chatterbox Text to Speech API
Request Code
POST https://gateway.pixazo.ai/chatterbox-text-to-speech/v1/chatterbox-text-to-speech-request
Content-Type: application/json
Cache-Control: no-cache
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY
{
"text": "Hello world, this is a test of the Chatterbox text to speech model.",
"audio_url": "https://storage.googleapis.com/chatterbox-demo-samples/prompts/male_rickmorty.mp3",
"exaggeration": 0.25,
"temperature": 0.7,
"cfg": 0.5
}
Python Example
import requests
url = "https://gateway.pixazo.ai/chatterbox-text-to-speech/v1/chatterbox-text-to-speech-request"
headers = {
"Content-Type": "application/json",
"Cache-Control": "no-cache",
"Ocp-Apim-Subscription-Key": "YOUR_SUBSCRIPTION_KEY"
}
data = {
"text": "Hello world, this is a test of the Chatterbox text to speech model.",
"audio_url": "https://storage.googleapis.com/chatterbox-demo-samples/prompts/male_rickmorty.mp3",
"exaggeration": 0.25,
"temperature": 0.7,
"cfg": 0.5
}
response = requests.post(url, json=data, headers=headers)
print(response.json())
JavaScript Example
const url = 'https://gateway.pixazo.ai/chatterbox-text-to-speech/v1/chatterbox-text-to-speech-request';
const data = {
text: 'Hello world, this is a test of the Chatterbox text to speech model.',
audio_url: 'https://storage.googleapis.com/chatterbox-demo-samples/prompts/male_rickmorty.mp3',
exaggeration: 0.25,
temperature: 0.7,
cfg: 0.5
};
fetch(url, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Cache-Control': 'no-cache',
'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY'
},
body: JSON.stringify(data)
})
.then(response => response.json())
.then(data => console.log(data))
.catch(error => console.error('Error:', error));
cURL Example
curl -X POST "https://gateway.pixazo.ai/chatterbox-text-to-speech/v1/chatterbox-text-to-speech-request" \
-H "Content-Type: application/json" \
-H "Cache-Control: no-cache" \
-H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
--data-raw '{
"text": "Hello world, this is a test of the Chatterbox text to speech model.",
"audio_url": "https://storage.googleapis.com/chatterbox-demo-samples/prompts/male_rickmorty.mp3",
"exaggeration": 0.25,
"temperature": 0.7,
"cfg": 0.5
}'
Output
{
"request_id": "chatterbox-text-to-speech_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"status": "QUEUED",
"polling_url": "https://gateway.pixazo.ai/v2/requests/status/chatterbox-text-to-speech_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
}
Webhook (Optional)
Add the X-Webhook-URL header to your generate request to receive a POST callback instead of polling.
X-Webhook-URL: https://your-server.com/webhook/callback
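A minimal sketch of a receiver for that callback, using only the Python standard library. The source does not specify the webhook payload, so this assumes the body is JSON with the same fields as the status response (`request_id`, `status`, `output`, `error`). The `summarize_callback` helper, the `serve` function, and port 8000 are hypothetical names and values chosen for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_callback(body: bytes) -> dict:
    """Parse a callback body and keep the fields a client usually needs.

    Assumption: the payload mirrors the status response shape.
    """
    payload = json.loads(body)
    output = payload.get("output") or {}  # "output" is null on errors
    return {
        "request_id": payload.get("request_id"),
        "status": payload.get("status"),
        "media_url": output.get("media_url", []),
        "error": payload.get("error"),
    }


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            summary = summarize_callback(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not a JSON object; reject it.
            self.send_response(400)
            self.end_headers()
            return
        print(summary)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the receiver until interrupted (hypothetical entry point)."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener; the public URL that routes to it is what would go in the X-Webhook-URL header.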
Request Parameters - Chatterbox Text to Speech generate request
| Parameter | Required | Type | Description |
|---|---|---|---|
| text | Yes | string | The text to convert into speech. Must be a string of readable text. |
| audio_url | No | string | A URL pointing to an audio file (e.g., MP3) to serve as a voice reference. Used to clone or adapt the speaking style. |
| exaggeration | No | number | Controls the degree of expressive emphasis in the generated speech. Higher values increase modulation (e.g., intonation, stress). Range: 0.0 to 1.0. |
| temperature | No | number | Controls randomness in voice generation. Higher values increase variability in pitch and timing; lower values produce more consistent, predictable speech. Range: 0.1 to 1.0. |
| cfg | No | number | Classifier-Free Guidance strength. Influences how closely the output adheres to the input prompt and reference audio. Higher values increase fidelity. Range: 0.0 to 2.0. |
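The ranges in the table can be checked on the client before a request is sent. The sketch below is one way to do that; `build_payload` is a hypothetical helper, and treating the range bounds as inclusive and rejecting empty text are assumptions, not documented API behavior.

```python
from typing import Optional

# Ranges taken from the parameter table; inclusive bounds are an assumption.
RANGES = {
    "exaggeration": (0.0, 1.0),
    "temperature": (0.1, 1.0),
    "cfg": (0.0, 2.0),
}


def build_payload(text: str, audio_url: Optional[str] = None, **tuning: float) -> dict:
    """Build a request body, raising ValueError for out-of-range values."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    payload = {"text": text}
    if audio_url is not None:
        payload["audio_url"] = audio_url
    for name, value in tuning.items():
        if name not in RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
        payload[name] = value
    return payload
```

For example, `build_payload("Hello world", exaggeration=0.25, temperature=0.7, cfg=0.5)` returns a dictionary that can be passed as the JSON body of the generate request.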
Request Headers
| Header | Value |
|---|---|
| Content-Type | application/json |
| Cache-Control | no-cache |
| Ocp-Apim-Subscription-Key | YOUR_SUBSCRIPTION_KEY |
Response Handling
Common status codes.
| Code | Meaning |
|---|---|
| 202 | Accepted — Request queued |
| 400 | Bad Request |
| 401 | Unauthorized |
| 402 | Insufficient Balance |
| 403 | Forbidden |
| 429 | Too Many Requests |
| 500 | Internal Server Error |
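A client typically branches on these codes. The sketch below groups them into outcomes; which codes are worth retrying (429 and 500 here) is an assumption rather than something the API documents, and `classify_status` and `backoff_delay` are hypothetical helpers.

```python
# Grouping of the documented status codes. The retry policy is an assumption.
RETRYABLE = {429, 500}            # rate limit or server error: try again later
FATAL = {400, 401, 402, 403}      # fix the request, key, or balance first


def classify_status(code: int) -> str:
    """Map an HTTP status code to 'accepted', 'retry', 'fatal', or 'unknown'."""
    if code == 202:
        return "accepted"
    if code in RETRYABLE:
        return "retry"
    if code in FATAL:
        return "fatal"
    return "unknown"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff in seconds for retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))
```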
Error Responses
Queue system errors and model validation errors.
Queue System Errors
// 402 — Insufficient balance
{
"error": "Insufficient Balance",
"message": "Your wallet does not have enough balance. Required: $0.01"
}
// 400 — Model not found
{
"error": "Model not found",
"message": "Model 'chatterbox-text-to-speech' not found or is disabled"
}
Error via Status/Webhook
{
"request_id": "chatterbox-text-to-speech_019dxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
"status": "ERROR",
"model_id": "chatterbox-text-to-speech",
"error": "Description of the error",
"output": null
}
Retrieving Results
Poll the universal status endpoint to check progress and retrieve results.
Endpoint
GET https://gateway.pixazo.ai/v2/requests/status/{request_id}
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY
cURL Example
curl -H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
"https://gateway.pixazo.ai/v2/requests/status/chatterbox-text-to-speech_019d42ce-bc92-7f98-8181-b42db433b9f2e"
Response (Completed)
{
"request_id": "chatterbox-text-to-speech_019d42ce-bc92-7f98-8181-b42db433b9f2e",
"status": "COMPLETED",
"model_id": "chatterbox-text-to-speech",
"error": null,
"output": {
"media_url": [
"https://pub-582b7213209642b9b995c96c95a30381.r2.dev/v1/chatterbox-text-to-speech_019d42ce-bc92-7f98-8181-b42db433b9f2e/output.wav"
],
"media_type": "audio/wav"
},
"created_at": "2026-03-31T07:32:03.749Z",
"updated_at": "2026-03-31T07:32:20.000Z",
"completed_at": "2026-03-31T07:32:20.000Z"
}
Response Fields
| Field | Type | Description |
|---|---|---|
| request_id | string | Unique request identifier |
| status | string | QUEUED, PROCESSING, COMPLETED, FAILED, or ERROR |
| model_id | string | Model that processed the request |
| error | string or null | Error message if failed |
| output.media_url | array | URLs to generated media (R2 CDN) |
| output.media_type | string | MIME type (audio/wav) |
| created_at | string | When request was created |
| updated_at | string | When request was last updated |
| completed_at | string or null | When request completed |
| polling_url | string | Status URL (initial response only) |
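Reading the generated audio out of a status response might look like the sketch below. It relies only on the fields listed in the table; `media_urls` and `download` are hypothetical helpers, and the 60-second timeout is an arbitrary choice.

```python
import urllib.request
from pathlib import Path


def media_urls(status: dict) -> list:
    """Return output.media_url from a COMPLETED status response."""
    if status.get("status") != "COMPLETED":
        raise RuntimeError(
            f"request not completed: {status.get('status')} ({status.get('error')})"
        )
    output = status.get("output") or {}
    return list(output.get("media_url", []))


def download(url: str, dest: Path) -> Path:
    """Fetch one media URL and write it to dest."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        dest.write_bytes(resp.read())
    return dest
```

A caller would pass each URL from `media_urls(...)` to `download`, for instance with `Path("output.wav")` as the destination.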
Status Values
| Status | Description |
|---|---|
| QUEUED | Request accepted, waiting to be processed |
| PROCESSING | Being processed by the model |
| COMPLETED | Done — output contains the result |
| FAILED | Failed — check error field |
| ERROR | System error — not charged |
Status Flow
QUEUED → PROCESSING → COMPLETED (or FAILED, or ERROR)
Typical Workflow
- Send a generate request to the API endpoint
- Save the request_id from the response
- Poll every 5-10 seconds: GET /v2/requests/status/{request_id}
- When status is "COMPLETED", download from output.media_url
Tip: Use the X-Webhook-URL header to get a callback instead of polling.
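The polling steps above can be sketched as follows, using only the standard library. The status URL and header name come from this page; the 300-second overall timeout, the `fetch_status` and `wait_for_result` names, and the injectable `fetch` and `sleep` arguments (there to make the loop easy to test) are assumptions.

```python
import json
import time
import urllib.request

STATUS_URL = "https://gateway.pixazo.ai/v2/requests/status/{request_id}"
TERMINAL = {"COMPLETED", "FAILED", "ERROR"}  # statuses that end polling


def fetch_status(request_id: str, api_key: str) -> dict:
    """GET the status endpoint once and decode the JSON body."""
    req = urllib.request.Request(
        STATUS_URL.format(request_id=request_id),
        headers={"Ocp-Apim-Subscription-Key": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_result(request_id, api_key, fetch=fetch_status,
                    interval=5.0, timeout=300.0, sleep=time.sleep):
    """Poll until the request reaches a terminal status or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch(request_id, api_key)
        if status.get("status") in TERMINAL:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{request_id} still {status.get('status')}")
        sleep(interval)
```

The returned dictionary is the final status response, so the caller still needs to check whether `status` is `COMPLETED` before reading `output.media_url`.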
Chatterbox v1 Text to Speech API Pricing
| Resolution | Price (USD) |
|---|---|
| All resolutions | $0.03 |