
Advanced Parameters Guide

This guide provides detailed information about the advanced parameters available in the Uberduck API text-to-speech endpoint. These parameters allow you to fine-tune the speech output to meet your specific requirements.

Parameter Categories

The Uberduck API organizes parameters into three categories:

  1. Core Parameters - Essential parameters that apply to every text-to-speech request
  2. Extended Parameters - Common parameters supported by many different models
  3. Model-Specific Parameters - Parameters that are unique to specific models or providers

Core Parameters

These parameters apply to every text-to-speech request. text and voice are required; model is optional:

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text to convert to speech |
| voice | string | Yes | The voice ID to use |
| model | string | No | The model ID to use (if not specified, a default compatible model will be selected) |
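
The core parameters can be assembled with a small helper. The sketch below is illustrative and not part of the Uberduck API: buildCoreBody is a hypothetical name, and treating an empty string as missing is an assumption that goes beyond what the table states.

```javascript
// Hypothetical helper (not part of the Uberduck API): builds the core
// request body from the three parameters in the table above.
function buildCoreBody({ text, voice, model } = {}) {
  // Assumption: an empty string counts as a missing value.
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text is required and must be a non-empty string');
  }
  if (typeof voice !== 'string' || voice.length === 0) {
    throw new TypeError('voice is required and must be a non-empty string');
  }
  const body = { text, voice };
  // model is optional; when omitted, the API selects a default compatible model.
  if (model !== undefined) {
    body.model = model;
  }
  return body;
}

// Usage: pass the result to JSON.stringify as the fetch body.
// const body = buildCoreBody({ text: 'Hello', voice: 'polly_joanna' });
```

Leaving model out of the body, rather than sending null, keeps the request aligned with the documented behavior of falling back to a default compatible model.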

Extended Parameters

Extended parameters are common across many different models:

Speed Control

// Example: Adjust speech speed
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken at a faster rate.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.5 // 50% faster than normal
    }
  })
});

| Parameter | Type | Range | Default | Description |
| --- | --- | --- | --- | --- |
| speed | float | 0.5 - 2.0 | 1.0 | Speech rate multiplier. Values > 1 increase speed, values < 1 decrease speed. |
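
Since speed is a multiplier, a "percent faster or slower" figure has to be converted before it is sent. The following is a minimal sketch of that arithmetic; speedFromPercent is a hypothetical helper, and clamping to the documented range on the client is a design assumption rather than API behavior.

```javascript
// Documented bounds for the extended `speed` parameter.
const MIN_SPEED = 0.5;
const MAX_SPEED = 2.0;

// Hypothetical helper: converts "percent faster/slower than normal" into a
// speed multiplier, clamped to the documented range.
function speedFromPercent(percentChange) {
  if (!Number.isFinite(percentChange)) {
    throw new TypeError('percentChange must be a finite number');
  }
  const multiplier = 1 + percentChange / 100;
  return Math.min(MAX_SPEED, Math.max(MIN_SPEED, multiplier));
}

// speedFromPercent(50) gives 1.5, matching the example above.
// speedFromPercent(-25) gives 0.75; speedFromPercent(200) is clamped to 2.0.
```

Clamping keeps a request inside the documented range without a round trip; if silent adjustment is undesirable, the same bounds can be used to reject the value instead.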

Pitch Adjustment

// Example: Adjust voice pitch
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a higher pitch.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      pitch: 1.5 // Higher pitch
    }
  })
});

| Parameter | Type | Range | Default | Description |
| --- | --- | --- | --- | --- |
| pitch | float | -10.0 - 10.0 | 0.0 | Voice pitch adjustment. Positive values increase pitch, negative values decrease pitch. |

Emotion Control

Some models support emotional expressions:

// Example: Apply emotion to speech
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a happy emotion.',
    voice: 'voice_id',
    model: 'model_with_emotion_support',
    extended: {
      emotion: 'happy'
    }
  })
});

| Parameter | Type | Options | Default | Description |
| --- | --- | --- | --- | --- |
| emotion | string | 'happy', 'sad', 'angry', 'neutral', etc. | 'neutral' | Emotional tone to apply to the speech |
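
Because only some models support emotions, client code may want to fall back to the default before sending a request. This is a sketch under stated assumptions: resolveEmotion is a hypothetical helper, and the list of supported emotions must be supplied by the caller, since this guide does not enumerate them per model.

```javascript
// Default documented in the table above.
const DEFAULT_EMOTION = 'neutral';

// Hypothetical helper: returns the requested emotion if the caller-supplied
// list says the chosen model supports it, otherwise the documented default.
function resolveEmotion(requested, supportedEmotions) {
  if (typeof requested !== 'string') {
    return DEFAULT_EMOTION;
  }
  // Normalize casing and whitespace so 'Happy ' matches 'happy'.
  const normalized = requested.trim().toLowerCase();
  return supportedEmotions.includes(normalized) ? normalized : DEFAULT_EMOTION;
}

// Usage (the supported list here is made up for illustration):
// resolveEmotion('Happy', ['happy', 'sad', 'neutral']) returns 'happy'
// resolveEmotion('excited', ['happy', 'sad', 'neutral']) returns 'neutral'
```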

Model-Specific Parameters

Different models and providers support unique parameters:

AWS Polly

// Example: AWS Polly-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses AWS Polly-specific parameters.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    model_specific: {
      engine: 'neural', // 'neural' or 'standard'
      voice_style: 'newscaster' // For specific voices that support styles
    }
  })
});

| Parameter | Type | Options | Default | Description |
| --- | --- | --- | --- | --- |
| engine | string | 'neural', 'standard' | Depends on voice | The Polly engine to use |
| voice_style | string | 'newscaster', etc. | - | Style for voices that support it |

Google Cloud TTS

// Example: Google Cloud TTS-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Google Cloud TTS-specific parameters.',
    voice: 'google_en-US-Neural2-F',
    model: 'google_neural2',
    model_specific: {
      speaking_rate: 0.85, // Range: 0.25 to 4.0
      pitch: 2.0, // Range: -20.0 to 20.0
      volume_gain_db: 3.0 // Volume adjustment in dB
    }
  })
});

| Parameter | Type | Range | Default | Description |
| --- | --- | --- | --- | --- |
| speaking_rate | float | 0.25 - 4.0 | 1.0 | Speaking rate |
| pitch | float | -20.0 - 20.0 | 0.0 | Voice pitch |
| volume_gain_db | float | -96.0 - 16.0 | 0.0 | Volume adjustment in dB |
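
Because volume_gain_db is logarithmic, it can help to see what a dB value means as a linear amplitude multiplier. The sketch below uses the standard amplitude relation, ratio = 10^(dB / 20); the helper names are illustrative, and how the provider applies the gain internally is outside the scope of this guide.

```javascript
// Converts a gain in decibels to a linear amplitude ratio: 10^(dB / 20).
function dbToAmplitudeRatio(gainDb) {
  return Math.pow(10, gainDb / 20);
}

// Inverse: converts a linear amplitude ratio (must be > 0) to decibels.
function amplitudeRatioToDb(ratio) {
  if (!(ratio > 0)) {
    throw new RangeError('ratio must be greater than 0');
  }
  return 20 * Math.log10(ratio);
}

// 0 dB leaves the amplitude unchanged (ratio 1).
// +6 dB roughly doubles the amplitude; -6 dB roughly halves it.
```

Small dB changes are therefore larger than they look, which is one reason to adjust volume_gain_db in small steps.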

Azure Speech Service

// Example: Azure Speech Service-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Azure Speech Service-specific parameters.',
    voice: 'azure_en-US-JennyNeural',
    model: 'azure_neural',
    model_specific: {
      style: 'cheerful', // Styles vary by voice
      style_degree: 1.5, // Emphasis of the style (0.5-2.0)
      role: 'YoungAdultFemale' // Role playing for the voice
    }
  })
});

| Parameter | Type | Options/Range | Default | Description |
| --- | --- | --- | --- | --- |
| style | string | 'cheerful', 'sad', etc. | - | Speaking style |
| style_degree | float | 0.5 - 2.0 | 1.0 | Intensity of the style |
| role | string | 'YoungAdultFemale', etc. | - | Character role |

Output Format

You can specify the desired output format for the generated audio:

// Example: Specify output format
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This will be returned as a WAV file.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    output_format: 'wav' // Options: 'mp3', 'wav', 'ogg'
  })
});

| Parameter | Type | Options | Default | Description |
| --- | --- | --- | --- | --- |
| output_format | string | 'mp3', 'wav', 'ogg' | 'mp3' | The audio format of the response |
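
When saving or serving the generated audio, client code usually needs a file extension and MIME type that match the requested format. The mapping below is a convenience sketch for the three documented options, using the MIME types commonly associated with each format; describeOutputFormat is a hypothetical helper, and nothing here describes what the API itself returns.

```javascript
// Documented output_format options mapped to a file extension and a commonly
// used MIME type. This table lives in client code; it is not API output.
const OUTPUT_FORMATS = new Map([
  ['mp3', { extension: '.mp3', mimeType: 'audio/mpeg' }],
  ['wav', { extension: '.wav', mimeType: 'audio/wav' }],
  ['ogg', { extension: '.ogg', mimeType: 'audio/ogg' }]
]);

// Hypothetical helper: resolves format details, defaulting to 'mp3' as the
// table above documents.
function describeOutputFormat(format = 'mp3') {
  const info = OUTPUT_FORMATS.get(format);
  if (!info) {
    const options = Array.from(OUTPUT_FORMATS.keys()).join(', ');
    throw new RangeError(`Unsupported output_format "${format}"; expected one of: ${options}`);
  }
  return { format, ...info };
}

// Usage: const { extension } = describeOutputFormat('wav'); // '.wav'
```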

Best Practices

When using advanced parameters:

  1. Start with defaults - Begin with default values and adjust incrementally.
  2. Test extensively - Different voices and texts may respond differently to the same parameters.
  3. Consider your use case - Choose parameters that match your application needs:
    • For narration, use slower speeds and neutral tones
    • For alerts, use higher pitches and faster speeds
    • For conversational interfaces, use natural-sounding parameters
  4. Check compatibility - Not all parameters work with all voices or models.
  5. Combine parameters carefully - Some parameter combinations might produce unexpected results.
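
One way to apply these practices is to keep a small set of named presets and adjust them incrementally. The sketch below is illustrative only: extendedFor and EXTENDED_PRESETS are hypothetical names, the narration and alert values are taken from the storytelling and alert examples in this guide, and the conversational preset simply uses the documented defaults.

```javascript
// Illustrative starting points, not recommended values from the API.
const EXTENDED_PRESETS = {
  narration: { speed: 0.85, pitch: -1.0 },     // slower, slightly deeper
  alert: { speed: 1.1, pitch: 1.5 },           // faster, higher pitch
  conversational: { speed: 1.0, pitch: 0.0 }   // documented defaults
};

// Hypothetical helper: returns a copy of the preset with per-request
// overrides applied, so presets can be tuned one parameter at a time.
function extendedFor(useCase, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(EXTENDED_PRESETS, useCase)) {
    throw new RangeError(`Unknown use case: ${useCase}`);
  }
  return { ...EXTENDED_PRESETS[useCase], ...overrides };
}

// Usage: start from the narration preset and nudge only the speed.
// const extended = extendedFor('narration', { speed: 0.9 });
```

Returning a fresh object on each call means an override for one request cannot leak into the shared preset.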

Advanced Examples

Storytelling Voice

// Example: Configure a voice for storytelling
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Once upon a time, in a land far, far away...',
    voice: 'polly_matthew',
    model: 'polly_neural',
    extended: {
      speed: 0.85, // Slightly slower for storytelling
      pitch: -1.0 // Slightly deeper voice
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});

Alert or Notification Voice

// Example: Configure a voice for alerts
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Alert! System maintenance will begin in 5 minutes.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.1, // Slightly faster
      pitch: 1.5 // Higher pitch for attention
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});

Parameter Validation

The API handles parameters as follows:

  • Invalid parameter values will result in a validation error
  • Unknown parameters will be ignored
  • If a parameter is not supported by the selected model, it will be ignored

For example, if you provide a style parameter to a voice that doesn't support styles, the API will ignore that parameter and continue processing.
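
Since out-of-range values produce a validation error, it can be useful to check the numeric extended parameters before sending a request. This is a minimal client-side sketch using the ranges documented in this guide; validateExtended is a hypothetical helper, and it does not reproduce the API's own validation logic or error format.

```javascript
// Documented ranges for the numeric extended parameters.
const EXTENDED_RANGES = {
  speed: { min: 0.5, max: 2.0 },
  pitch: { min: -10.0, max: 10.0 }
};

// Hypothetical pre-flight check: returns a list of problems (empty if none).
function validateExtended(extended = {}) {
  const problems = [];
  for (const [name, { min, max }] of Object.entries(EXTENDED_RANGES)) {
    const value = extended[name];
    if (value === undefined) {
      continue; // Omitted parameters fall back to the API defaults.
    }
    if (typeof value !== 'number' || Number.isNaN(value)) {
      problems.push(`${name} must be a number`);
    } else if (value < min || value > max) {
      problems.push(`${name} must be between ${min} and ${max}, got ${value}`);
    }
  }
  return problems;
}

// Usage: only send the request when validateExtended(extended) is empty.
```

Collecting every problem in one pass, instead of throwing on the first, lets a caller report all invalid values at once.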

Conclusion

Advanced parameters provide powerful customization options for your text-to-speech applications. By combining different parameters, you can create unique and engaging voice experiences tailored to your specific needs.

For more information about specific model capabilities and parameters, refer to the API Reference and Voice Selection Guide.