Advanced Parameters Guide

This guide describes the advanced parameters of the Uberduck API text-to-speech endpoint. Use them to fine-tune the speech output for your application.

Parameter Categories

The Uberduck API organizes parameters into three categories:

Core Parameters - Essential parameters accepted by all text-to-speech requests
Extended Parameters - Common parameters supported by many different models
Model-Specific Parameters - Parameters that are unique to specific models or providers

Core Parameters

These parameters apply to all text-to-speech requests. `text` and `voice` are required; `model` is optional:

Parameter	Type	Required	Description
`text`	string	Yes	The text to convert to speech
`voice`	string	Yes	The voice ID to use
`model`	string	No	The model ID to use (if not specified, a default compatible model will be selected)
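
The table above can be mirrored in a small client-side helper. The sketch below is only an illustration: `buildCoreBody` is a hypothetical name, not part of the Uberduck API, and treating an empty string as a missing value is an assumption. It omits `model` when none is given so that the API can select a default compatible model.

```javascript
// Hypothetical helper (not part of the Uberduck API): builds the JSON body
// from the core parameters described in the table above.
function buildCoreBody({ text, voice, model } = {}) {
  // Assumption: an empty string is treated the same as a missing value.
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text is required and must be a non-empty string');
  }
  if (typeof voice !== 'string' || voice.length === 0) {
    throw new TypeError('voice is required and must be a non-empty string');
  }
  const body = { text, voice };
  // model is optional; leave it out so the API can choose a default.
  if (model !== undefined) {
    body.model = model;
  }
  return body;
}

const coreBody = buildCoreBody({ text: 'Hello there.', voice: 'polly_joanna' });
console.log(JSON.stringify(coreBody));
```

The returned object can be passed to `JSON.stringify` as the request `body` in the examples that follow.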

Extended Parameters

Extended parameters are common across many different models:

Speed Control

// Example: Adjust speech speed
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken at a faster rate.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.5  // 50% faster than normal
    }
  })
});

Parameter	Type	Range	Default	Description
`speed`	float	0.5 - 2.0	1.0	Speech rate multiplier. Values > 1 increase speed, values < 1 decrease speed.

Pitch Adjustment

// Example: Adjust voice pitch
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a higher pitch.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      pitch: 1.5  // Higher pitch
    }
  })
});

Parameter	Type	Range	Default	Description
`pitch`	float	-10.0 - 10.0	0.0	Voice pitch adjustment. Positive values increase pitch, negative values decrease pitch.
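
If values for `speed` and `pitch` come from user input, it may help to keep them inside the documented ranges before sending a request. The following is a minimal sketch under that assumption; `EXTENDED_RANGES` and `clampExtended` are hypothetical names, and clamping (rather than rejecting) out-of-range values is a design choice of this example, not documented API behavior.

```javascript
// Ranges copied from the Speed Control and Pitch Adjustment tables.
const EXTENDED_RANGES = {
  speed: [0.5, 2.0],
  pitch: [-10.0, 10.0],
};

// Hypothetical helper: returns a copy of `extended` with numeric speed and
// pitch values limited to their documented ranges. Other keys pass through.
function clampExtended(extended) {
  const result = { ...extended };
  for (const [name, [min, max]] of Object.entries(EXTENDED_RANGES)) {
    if (typeof result[name] === 'number') {
      result[name] = Math.min(max, Math.max(min, result[name]));
    }
  }
  return result;
}

console.log(clampExtended({ speed: 3.0, pitch: -12 }));
// speed is limited to 2 and pitch to -10
```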

Emotion Control

Some models support emotional expressions:

// Example: Apply emotion to speech
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This text will be spoken with a happy emotion.',
    voice: 'voice_id',
    model: 'model_with_emotion_support',
    extended: {
      emotion: 'happy'
    }
  })
});

Parameter	Type	Options	Default	Description
`emotion`	string	'happy', 'sad', 'angry', 'neutral', etc.	'neutral'	Emotional tone to apply to the speech

Model-Specific Parameters

Different models and providers support unique parameters:

AWS Polly

// Example: AWS Polly-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses AWS Polly-specific parameters.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    model_specific: {
      engine: 'neural',  // 'neural' or 'standard'
      voice_style: 'newscaster'  // For specific voices that support styles
    }
  })
});

Parameter	Type	Options	Default	Description
`engine`	string	'neural', 'standard'	Depends on voice	The Polly engine to use
`voice_style`	string	'newscaster', etc.	-	Style for voices that support it

Google Cloud TTS

// Example: Google Cloud TTS-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Google Cloud TTS-specific parameters.',
    voice: 'google_en-US-Neural2-F',
    model: 'google_neural2',
    model_specific: {
      speaking_rate: 0.85,  // Range: 0.25 to 4.0
      pitch: 2.0,  // Range: -20.0 to 20.0
      volume_gain_db: 3.0  // Volume adjustment in dB
    }
  })
});

Parameter	Type	Range	Default	Description
`speaking_rate`	float	0.25 - 4.0	1.0	Speaking rate
`pitch`	float	-20.0 - 20.0	0.0	Voice pitch
`volume_gain_db`	float	-96.0 - 16.0	0.0	Volume adjustment in dB
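
Because `volume_gain_db` is expressed in decibels, its effect is logarithmic rather than linear. As a rough guide, the standard conversion from a decibel gain to an amplitude ratio is 10^(dB / 20). The helper below, `dbToAmplitudeRatio`, is a hypothetical name used only to illustrate that arithmetic.

```javascript
// Converts a gain in decibels to a linear amplitude ratio: 10^(dB / 20).
// Hypothetical helper for illustration; not part of any API.
function dbToAmplitudeRatio(gainDb) {
  return Math.pow(10, gainDb / 20);
}

console.log(dbToAmplitudeRatio(0).toFixed(2));  // 1.00 (no change)
console.log(dbToAmplitudeRatio(3).toFixed(2));  // 1.41
console.log(dbToAmplitudeRatio(6).toFixed(2));  // 2.00 (about twice the amplitude)
```

Under this conversion, the `3.0` used in the example above raises the amplitude by roughly 41%.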

Azure Speech Service

// Example: Azure Speech Service-specific parameters
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This uses Azure Speech Service-specific parameters.',
    voice: 'azure_en-US-JennyNeural',
    model: 'azure_neural',
    model_specific: {
      style: 'cheerful',  // Styles vary by voice
      style_degree: 1.5,  // Emphasis of the style (0.5-2.0)
      role: 'YoungAdultFemale'  // Role playing for the voice
    }
  })
});

Parameter	Type	Options/Range	Default	Description
`style`	string	'cheerful', 'sad', etc.	-	Speaking style
`style_degree`	float	0.5 - 2.0	1.0	Intensity of the style
`role`	string	'YoungAdultFemale', etc.	-	Character role

Output Format

You can specify the desired output format for the generated audio:

// Example: Specify output format
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'This will be returned as a WAV file.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    output_format: 'wav'  // Options: 'mp3', 'wav', 'ogg'
  })
});

Parameter	Type	Options	Default	Description
`output_format`	string	'mp3', 'wav', 'ogg'	'mp3'	The audio format of the response
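
When saving or serving the generated audio, the chosen `output_format` usually determines the file extension and MIME type on the client side. The sketch below maps the three documented options to their standard MIME types. `FORMAT_INFO` and `describeOutputFormat` are hypothetical names, and whether the API's response headers use these exact MIME types is not confirmed by this guide.

```javascript
// Standard MIME types and extensions for the documented output formats.
const FORMAT_INFO = {
  mp3: { mimeType: 'audio/mpeg', extension: '.mp3' },
  wav: { mimeType: 'audio/wav', extension: '.wav' },
  ogg: { mimeType: 'audio/ogg', extension: '.ogg' },
};

// Hypothetical helper: looks up MIME type and extension for an output format.
// Defaults to 'mp3', the documented default for output_format.
function describeOutputFormat(outputFormat = 'mp3') {
  if (!Object.prototype.hasOwnProperty.call(FORMAT_INFO, outputFormat)) {
    throw new RangeError(`Unsupported output_format: ${outputFormat}`);
  }
  return FORMAT_INFO[outputFormat];
}

console.log(describeOutputFormat('wav').mimeType); // audio/wav
```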

Best Practices

When using advanced parameters:

Start with defaults - Begin with default values and adjust incrementally.
Test extensively - Different voices and texts may respond differently to the same parameters.
Consider your use case - Choose parameters that match your application needs:
- For narration, use slower speeds and neutral tones
- For alerts, use higher pitches and faster speeds
- For conversational interfaces, use natural-sounding parameters
Check compatibility - Not all parameters work with all voices or models.
Combine parameters carefully - Some parameter combinations might produce unexpected results.
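
One way to apply the "start with defaults" and "consider your use case" advice is to keep a small table of presets and override individual values as you test. The sketch below assumes that approach. `USE_CASE_PRESETS` and `extendedFor` are hypothetical names, and the numbers are illustrative starting points within the documented ranges, not values defined by the API.

```javascript
// Illustrative starting points only; tune them per voice and per text.
const USE_CASE_PRESETS = {
  narration: { speed: 0.85, pitch: -1.0 },     // slower, slightly deeper
  alert: { speed: 1.1, pitch: 1.5 },           // faster, higher pitch
  conversational: { speed: 1.0, pitch: 0.0 },  // documented defaults
};

// Hypothetical helper: returns the preset for a use case, with any
// overrides applied on top, for use as the `extended` object.
function extendedFor(useCase, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(USE_CASE_PRESETS, useCase)) {
    throw new RangeError(`Unknown use case: ${useCase}`);
  }
  return { ...USE_CASE_PRESETS[useCase], ...overrides };
}

console.log(extendedFor('alert', { speed: 1.2 })); // { speed: 1.2, pitch: 1.5 }
```

Keeping overrides separate from the presets makes it easier to adjust one parameter at a time and compare results.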

Advanced Examples

Storytelling Voice

// Example: Configure a voice for storytelling
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Once upon a time, in a land far, far away...',
    voice: 'polly_matthew',
    model: 'polly_neural',
    extended: {
      speed: 0.85,  // Slightly slower for storytelling
      pitch: -1.0   // Slightly deeper voice
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});

Alert or Notification Voice

// Example: Configure a voice for alerts
const response = await fetch('https://api.uberduck.ai/v1/text-to-speech', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    text: 'Alert! System maintenance will begin in 5 minutes.',
    voice: 'polly_joanna',
    model: 'polly_neural',
    extended: {
      speed: 1.1,   // Slightly faster
      pitch: 1.5    // Higher pitch for attention
    },
    model_specific: {
      engine: 'neural'
    },
    output_format: 'mp3'
  })
});

Parameter Validation

The API performs validation on all parameters:

Invalid parameter values will result in a validation error
Unknown parameters will be ignored
If a parameter is not supported by the selected model, it will be ignored

For example, if you send a `style` parameter with a voice that doesn't support styles, the API ignores that parameter and continues processing the request.
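
A client can apply similar rules before sending a request, so that problems surface early. The following is a minimal sketch, not the API's actual validation logic: `EXTENDED_RULES` and `validateExtended` are hypothetical names, and only `speed` and `pitch` are checked, using the ranges documented in this guide. Out-of-range values are reported as errors, and names the sketch does not know are reported as ignored.

```javascript
// Rules for the extended parameters whose ranges are documented in this guide.
const EXTENDED_RULES = new Map([
  ['speed', (v) => typeof v === 'number' && v >= 0.5 && v <= 2.0],
  ['pitch', (v) => typeof v === 'number' && v >= -10.0 && v <= 10.0],
]);

// Hypothetical helper: sorts parameters into accepted, ignored, and errors.
function validateExtended(extended) {
  const accepted = {};
  const ignored = [];
  const errors = [];
  for (const [name, value] of Object.entries(extended)) {
    const rule = EXTENDED_RULES.get(name);
    if (rule === undefined) {
      ignored.push(name);   // unknown to this sketch: skipped, not rejected
    } else if (!rule(value)) {
      errors.push(`Invalid value for ${name}: ${value}`);
    } else {
      accepted[name] = value;
    }
  }
  return { accepted, ignored, errors };
}

console.log(validateExtended({ speed: 1.5, pitch: 50, warp: 9 }));
// accepted: { speed: 1.5 }, ignored: ['warp'], errors: one entry for pitch
```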

Conclusion

Advanced parameters provide powerful customization options for your text-to-speech applications. By combining different parameters, you can create unique and engaging voice experiences tailored to your specific needs.

For more information about specific model capabilities and parameters, refer to the API Reference and Voice Selection Guide.
