Advanced Parameters Guide
This guide describes the advanced parameters available in the Uberduck API text-to-speech endpoint. These parameters let you fine-tune the speech output to meet your requirements.
Parameter Categories
The Uberduck API organizes parameters into three categories:
- Core Parameters - Essential parameters that apply to every text-to-speech request
- Extended Parameters - Common parameters supported by many different models
- Model-Specific Parameters - Parameters that are unique to specific models or providers
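The examples in this guide place core parameters at the top level of the request body, extended parameters under an `extended` key, and model-specific parameters under a `model_specific` key. The following is a minimal sketch of a helper that assembles a body in that shape. The name `buildTtsBody` and its camelCase option names are hypothetical and not part of the API.

```javascript
// Hypothetical helper (not part of the Uberduck API): assembles a request
// body using the nesting shown in this guide's examples.
function buildTtsBody({ text, voice, model, extended, modelSpecific, outputFormat }) {
  if (typeof text !== 'string' || text.length === 0) {
    throw new Error('text is required');
  }
  if (typeof voice !== 'string' || voice.length === 0) {
    throw new Error('voice is required');
  }

  // Core parameters sit at the top level of the body.
  const body = { text, voice };
  if (model !== undefined) body.model = model; // optional core parameter

  // Extended and model-specific parameters are nested under their own keys.
  if (extended && Object.keys(extended).length > 0) body.extended = extended;
  if (modelSpecific && Object.keys(modelSpecific).length > 0) {
    body.model_specific = modelSpecific;
  }
  if (outputFormat !== undefined) body.output_format = outputFormat;
  return body;
}

const body = buildTtsBody({
  text: 'Hello from the guide.',
  voice: 'polly_joanna',
  extended: { speed: 1.2 },
});
console.log(JSON.stringify(body, null, 2));
```

The returned object can be passed to `JSON.stringify` as the `body` of a `fetch` call like the ones shown throughout this guide.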
Core Parameters
These parameters apply to all text-to-speech requests. The text and voice parameters are required; model is optional:
Parameter | Type | Required | Description |
---|---|---|---|
text | string | Yes | The text to convert to speech |
voice | string | Yes | The voice ID to use |
model | string | No | The model ID to use (if not specified, a default compatible model will be selected) |
Extended Parameters
Extended parameters are supported by many models and are passed in the extended object of the request body:
Speed Control
// Example: Adjust speech speed
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken at a faster rate.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.5 // 50% faster than normal
    }
  })
});
Parameter | Type | Range | Default | Description |
---|---|---|---|---|
speed | float | 0.5 - 2.0 | 1.0 | Speech rate multiplier. Values > 1 increase speed, values < 1 decrease speed. |
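Because invalid parameter values produce a validation error (see Parameter Validation), it can help to normalize user-supplied speeds before sending a request. This is a minimal sketch that clamps a value to the range in the table above and falls back to the default for non-numeric input. The helper name `normalizeSpeed` is hypothetical, and clamping rather than rejecting is a design choice of this sketch, not behavior of the API.

```javascript
// Range and default taken from the speed table above.
const SPEED_MIN = 0.5;
const SPEED_MAX = 2.0;
const SPEED_DEFAULT = 1.0;

// Hypothetical helper: returns a speed that is safe to send.
function normalizeSpeed(value) {
  // Fall back to the default for missing or non-numeric input.
  if (typeof value !== 'number' || !Number.isFinite(value)) return SPEED_DEFAULT;
  // Clamp into the documented range.
  return Math.min(SPEED_MAX, Math.max(SPEED_MIN, value));
}

console.log(normalizeSpeed(1.5));    // within range, returned unchanged
console.log(normalizeSpeed(3));      // above the maximum, clamped
console.log(normalizeSpeed('fast')); // not a number, default used
```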
Pitch Adjustment
// Example: Adjust voice pitch
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a higher pitch.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      pitch: 1.5 // Higher pitch
    }
  })
});
Parameter | Type | Range | Default | Description |
---|---|---|---|---|
pitch | float | -10.0 - 10.0 | 0.0 | Voice pitch adjustment. Positive values increase pitch, negative values decrease pitch. |
Emotion Control
Some models support emotional expression:
// Example: Apply emotion to speech
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a happy emotion.',
    voice: 'voice_id', // Placeholder: replace with a real voice ID
    model: 'model_with_emotion_support', // Placeholder: replace with a model that supports emotion
    extended: {
      emotion: 'happy'
    }
  })
});
Parameter | Type | Options | Default | Description |
---|---|---|---|---|
emotion | string | 'happy', 'sad', 'angry', 'neutral', etc. | 'neutral' | Emotional tone to apply to the speech |
Model-Specific Parameters
Some models and providers support parameters of their own, which are passed in the model_specific object of the request body:
AWS Polly
// Example: AWS Polly-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses AWS Polly-specific parameters.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    model_specific: {
      engine: 'neural', // 'neural' or 'standard'
      voice_style: 'newscaster' // For specific voices that support styles
    }
  })
});
Parameter | Type | Options | Default | Description |
---|---|---|---|---|
engine | string | 'neural', 'standard' | Depends on voice | The Polly engine to use |
voice_style | string | 'newscaster', etc. | - | Style for voices that support it |
Google Cloud TTS
// Example: Google Cloud TTS-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Google Cloud TTS-specific parameters.',
    voice: 'google_en-US-Neural2-F',
    model: 'google_neural2',
    model_specific: {
      speaking_rate: 0.85, // Range: 0.25 to 4.0
      pitch: 2.0, // Range: -20.0 to 20.0
      volume_gain_db: 3.0 // Volume adjustment in dB
    }
  })
});
Parameter | Type | Range | Default | Description |
---|---|---|---|---|
speaking_rate | float | 0.25 - 4.0 | 1.0 | Speaking rate |
pitch | float | -20.0 - 20.0 | 0.0 | Voice pitch |
volume_gain_db | float | -96.0 - 16.0 | 0.0 | Volume adjustment in dB |
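Since volume_gain_db is expressed in decibels, a logarithmic unit, it can be easier to reason about as a linear amplitude factor. The standard relation is factor = 10^(dB / 20). The sketch below applies it; the function names are hypothetical, and it illustrates the unit itself rather than how any provider applies the gain internally.

```javascript
// Convert a gain in decibels to a linear amplitude factor: 10^(dB / 20).
function dbToAmplitude(db) {
  return Math.pow(10, db / 20);
}

// Inverse: convert a linear amplitude factor back to decibels.
function amplitudeToDb(factor) {
  return 20 * Math.log10(factor);
}

console.log(dbToAmplitude(0).toFixed(3));  // 1.000 (no change)
console.log(dbToAmplitude(3).toFixed(3));  // 1.413
console.log(dbToAmplitude(-6).toFixed(3)); // 0.501
```

Under this relation, the value of 3.0 used in the example above corresponds to roughly 1.4 times the original amplitude, and each additional 6 dB roughly doubles it.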
Azure Speech Service
// Example: Azure Speech Service-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Azure Speech Service-specific parameters.',
    voice: 'azure_en-US-JennyNeural',
    model: 'azure_neural',
    model_specific: {
      style: 'cheerful', // Styles vary by voice
      style_degree: 1.5, // Intensity of the style (0.5-2.0)
      role: 'YoungAdultFemale' // Role playing for the voice
    }
  })
});
Parameter | Type | Options/Range | Default | Description |
---|---|---|---|---|
style | string | 'cheerful', 'sad', etc. | - | Speaking style |
style_degree | float | 0.5 - 2.0 | 1.0 | Intensity of the style |
role | string | 'YoungAdultFemale', etc. | - | Character role |
Output Format
You can specify the desired output format for the generated audio:
// Example: Specify output format
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This will be returned as a WAV file.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    output_format: 'wav' // Options: 'mp3', 'wav', 'ogg'
  })
});
Parameter | Type | Options | Default | Description |
---|---|---|---|---|
output_format | string | 'mp3', 'wav', 'ogg' | 'mp3' | The audio format of the response |
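When saving or serving the generated audio, the chosen format usually needs a matching file extension and MIME type. The sketch below maps the three documented formats to their standard MIME types and falls back to the documented default of 'mp3'. The helper name `resolveOutputFormat` is hypothetical. This guide does not say whether the endpoint returns raw audio bytes or a URL, so the sketch covers only the bookkeeping and not response handling.

```javascript
// Standard MIME types and file extensions for the three documented formats.
const FORMAT_INFO = {
  mp3: { mimeType: 'audio/mpeg', extension: '.mp3' },
  wav: { mimeType: 'audio/wav', extension: '.wav' },
  ogg: { mimeType: 'audio/ogg', extension: '.ogg' },
};

// Hypothetical helper: resolves a format name, defaulting to 'mp3'.
function resolveOutputFormat(format = 'mp3') {
  const key = String(format).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(FORMAT_INFO, key)) {
    throw new Error(`Unsupported output_format: ${format}`);
  }
  return { outputFormat: key, ...FORMAT_INFO[key] };
}

console.log(resolveOutputFormat('wav'));
console.log(resolveOutputFormat()); // uses the documented default
```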
Best Practices
When using advanced parameters:
- Start with defaults - Begin with default values and adjust incrementally.
- Test extensively - Different voices and texts may respond differently to the same parameters.
- Consider your use case - Choose parameters that match your application needs:
  - For narration, use slower speeds and neutral tones
  - For alerts, use higher pitches and faster speeds
  - For conversational interfaces, use natural-sounding parameters
- Check compatibility - Not all parameters work with all voices or models.
- Combine parameters carefully - Some parameter combinations might produce unexpected results.
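One way to apply these practices is to keep a small table of presets per use case and adjust from there. The sketch below is illustrative only: the preset names and the `extendedFor` helper are hypothetical, and the numbers are starting points borrowed from the storytelling and alert examples in this guide, not values recommended by the API.

```javascript
// Hypothetical presets for the extended object, one per use case.
const USE_CASE_PRESETS = {
  narration: { speed: 0.85, emotion: 'neutral' }, // slower, neutral tone
  alert: { speed: 1.1, pitch: 1.5 },              // faster, higher pitch
  conversational: {},                             // rely on the defaults
};

// Returns the preset for a use case, with optional per-call overrides.
function extendedFor(useCase, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(USE_CASE_PRESETS, useCase)) {
    throw new Error(`Unknown use case: ${useCase}`);
  }
  // Overrides win, so a preset can be adjusted incrementally.
  return { ...USE_CASE_PRESETS[useCase], ...overrides };
}

console.log(extendedFor('alert'));
console.log(extendedFor('narration', { speed: 0.9 }));
```

Because parameters that a model does not support are ignored, a preset such as `emotion: 'neutral'` has no effect on models without emotion support.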
Advanced Examples
Storytelling Voice
// Example: Configure a voice for storytelling
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Once upon a time, in a land far, far away...',
    voice: 'polly_matthew',
    model: 'polly_neural',
    extended: {
      speed: 0.85, // Slightly slower for storytelling
      pitch: -1.0 // Slightly deeper voice
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});
Alert or Notification Voice
// Example: Configure a voice for alerts
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Alert! System maintenance will begin in 5 minutes.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.1, // Slightly faster
      pitch: 1.5 // Higher pitch for attention
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});
Parameter Validation
The API validates all parameters:
- Invalid parameter values will result in a validation error
- Unknown parameters will be ignored
- If a parameter is not supported by the selected model, it will be ignored
For example, if you provide a style parameter to a voice that doesn't support styles, the API ignores that parameter and continues processing.
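The rules above can be mirrored on the client to catch problems before a request is sent. The following is a minimal sketch, assuming the ranges from the Extended Parameters tables in this guide. `checkExtended` and `EXTENDED_SPEC` are hypothetical names, and the server's actual validation may differ in detail.

```javascript
// Types and ranges copied from the Extended Parameters tables in this guide.
const EXTENDED_SPEC = {
  speed: { type: 'number', min: 0.5, max: 2.0 },
  pitch: { type: 'number', min: -10.0, max: 10.0 },
  emotion: { type: 'string' },
};

// Hypothetical client-side check: invalid values are reported as errors,
// unknown parameter names are set aside as ignored.
function checkExtended(extended = {}) {
  const accepted = {};
  const ignored = [];
  const errors = [];
  for (const [name, value] of Object.entries(extended)) {
    if (!Object.prototype.hasOwnProperty.call(EXTENDED_SPEC, name)) {
      ignored.push(name); // unknown parameters are ignored, not rejected
      continue;
    }
    const spec = EXTENDED_SPEC[name];
    const isNumber = spec.type === 'number';
    if (typeof value !== spec.type || (isNumber && !Number.isFinite(value))) {
      errors.push(`${name}: expected ${spec.type}`);
    } else if (isNumber && (value < spec.min || value > spec.max)) {
      errors.push(`${name}: ${value} is outside ${spec.min} to ${spec.max}`);
    } else {
      accepted[name] = value;
    }
  }
  return { accepted, ignored, errors };
}

const result = checkExtended({ speed: 3, pitch: -1, tempo: 2 });
console.log(result);
```

In this usage example, `pitch` is accepted, `tempo` is listed as ignored because it is not a known parameter, and `speed` is reported as an error because 3 lies outside the documented range.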
Conclusion
Advanced parameters provide powerful customization options for your text-to-speech applications. By combining different parameters, you can create unique and engaging voice experiences tailored to your specific needs.
For more information about specific model capabilities and parameters, refer to the API Reference and Voice Selection Guide.