ElevenLabs

WIP

Overview

The Speech to Speech operation of the ElevenLabs n8n node converts input speech into synthesized speech in a selected voice. It is useful for voice cloning, automated voice responses, and transforming spoken content into another voice style or language. Common scenarios include creating dynamic audio content, automating customer support with personalized voices, and generating voiceovers for media.

Practical examples:

  • Converting a recorded message into a professional-sounding voice for podcasts.
  • Generating multilingual voice responses in customer service bots.
  • Cloning a specific voice for accessibility tools.

Properties

Below are the supported input properties for the Speech to Speech operation:

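| Display Name      | Type            | Meaning                                                                                        |
|-------------------|-----------------|------------------------------------------------------------------------------------------------|
| Notice            | notice          | Informational message about the node's beta status and links for more info/support.           |
| Voice ID          | resourceLocator | Specifies the target voice for synthesis. Can be chosen from a list or entered manually by ID. |
| Additional Fields | collection      | Optional settings to customize output and processing (see below).                              |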

Additional Fields options:

  • Binary Name (string): Sets the name of the binary output property (default: data).
  • File Name (string): Sets the output file name (default: voice).
  • Streaming Latency (number): Adjusts latency optimizations (0–4); higher values may reduce quality.
  • Output Format (string): Specifies the audio format (e.g., mp3_44100_128).
  • Model Name or ID (options): Selects the model used for synthesis.
  • Stability (number): Controls voice stability (0–1).
  • Similarity Boost (number): Adjusts similarity to the original voice (0–1).
  • Stitching (boolean): Enables or disables audio stitching (default: true).
  • Style (number): Exaggerates the voice style (0–1).
  • Speaker Boost (boolean): Activates speaker boost (default: false).
  • Seed (number): Sets a fixed seed for reproducibility.
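
The ranges and defaults listed above can be checked before a workflow runs. The following is a minimal sketch, not the node's own code: the `AdditionalFields` interface, its property names, and both helper functions are hypothetical and chosen only for illustration. Only the ranges and default values come from the list above.

```typescript
// Hypothetical shape for the Additional Fields collection. The property names
// are illustrative and may not match the node's internal parameter names.
interface AdditionalFields {
  binaryName?: string;       // default: "data"
  fileName?: string;         // default: "voice"
  streamingLatency?: number; // 0–4
  outputFormat?: string;     // e.g. "mp3_44100_128"
  stability?: number;        // 0–1
  similarityBoost?: number;  // 0–1
  stitching?: boolean;       // default: true
  style?: number;            // 0–1
  speakerBoost?: boolean;    // default: false
  seed?: number;
}

// Returns a list of problems; an empty list means every value is in range.
function validateAdditionalFields(fields: AdditionalFields): string[] {
  const problems: string[] = [];

  const latency = fields.streamingLatency;
  if (latency !== undefined && (latency < 0 || latency > 4)) {
    problems.push("streamingLatency must be between 0 and 4");
  }

  // Stability, Similarity Boost, and Style all share the 0–1 range.
  const unitFields: Array<[string, number | undefined]> = [
    ["stability", fields.stability],
    ["similarityBoost", fields.similarityBoost],
    ["style", fields.style],
  ];
  unitFields.forEach(function (pair) {
    const name = pair[0];
    const value = pair[1];
    if (value !== undefined && (value < 0 || value > 1)) {
      problems.push(name + " must be between 0 and 1");
    }
  });

  return problems;
}

// Fills in the defaults documented above for any field the caller omitted.
function withDefaults(fields: AdditionalFields): AdditionalFields {
  const defaults: AdditionalFields = {
    binaryName: "data",
    fileName: "voice",
    stitching: true,
    speakerBoost: false,
  };
  return { ...defaults, ...fields };
}
```

For example, `validateAdditionalFields({ streamingLatency: 5, style: 1.2 })` reports two problems, while an empty object reports none.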

Output

  • Each output item carries a json object containing metadata about the generated speech.
  • When an audio file is produced, it is available under the specified binary property name (default: data). This binary field contains the synthesized audio in the selected output format (e.g., MP3).
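
A downstream step can look up the audio by its binary property name. The sketch below assumes simplified stand-ins for n8n's item types (`NodeItem` and `BinaryEntry` are hypothetical names, and real items carry more fields); `getAudio` is an illustrative helper, not part of the node.

```typescript
// Simplified stand-ins for n8n's item types (assumption: real items carry more fields).
interface BinaryEntry {
  data: string;      // file content, base64-encoded
  mimeType: string;
  fileName?: string;
}

interface NodeItem {
  json: Record<string, unknown>;
  binary?: Record<string, BinaryEntry>;
}

// Looks up the audio under the configured binary property name ("data" by default)
// and fails loudly when the property is missing.
function getAudio(item: NodeItem, binaryName: string = "data"): BinaryEntry {
  const entry = item.binary ? item.binary[binaryName] : undefined;
  if (entry === undefined) {
    throw new Error('No binary data under "' + binaryName + '"');
  }
  return entry;
}
```

If the Binary Name field was changed in Additional Fields, the same name has to be passed as the second argument.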

Dependencies

  • External Service: Requires access to the ElevenLabs API.
  • API Key: You must configure the elevenLabsApi credential in n8n.
  • Environment: Internet connectivity is required for API requests.
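
To illustrate what these dependencies amount to, the sketch below describes a request against the public ElevenLabs REST API without sending it. It assumes the speech-to-speech endpoint under `https://api.elevenlabs.io/v1` and the `xi-api-key` header; the node's own request code is not shown in this document and may differ. `RequestSketch` and `buildSpeechToSpeechRequest` are hypothetical names.

```typescript
// Assumption: the public ElevenLabs REST API base URL. The node may call it differently.
const BASE_URL = "https://api.elevenlabs.io/v1";

interface RequestSketch {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Describes (but does not send) a speech-to-speech request for a given voice.
// The input audio itself would go in a multipart form body, omitted here.
function buildSpeechToSpeechRequest(
  apiKey: string,
  voiceId: string,
  outputFormat?: string
): RequestSketch {
  let url = BASE_URL + "/speech-to-speech/" + encodeURIComponent(voiceId);
  if (outputFormat !== undefined) {
    url += "?output_format=" + encodeURIComponent(outputFormat);
  }
  return {
    method: "POST",
    url: url,
    headers: { "xi-api-key": apiKey }, // key stored in the elevenLabsApi credential
  };
}
```

Inside n8n the credential supplies the key, so a sketch like this is mainly useful for checking connectivity or a key outside the workflow.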

Troubleshooting

Common issues:

  • Invalid API Key: Ensure your ElevenLabs API credentials are correctly set up in n8n.
  • Voice ID not found: Double-check the Voice ID or select from the provided list.
  • Audio not generated: Verify that all required fields are filled and the input speech is valid.
  • Unsupported output format: Make sure the selected output format is supported by ElevenLabs.

Error messages and resolutions:

  • 401 Unauthorized: Check your API key configuration.
  • 404 Voice Not Found: Confirm the Voice ID exists and is accessible.
  • 400 Bad Request: Review input parameters for missing or invalid values.
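
The status codes above can be turned into actionable hints in a workflow's error branch. This is a minimal sketch; `explainStatus` is a hypothetical helper, and the fallback message for other codes is an assumption rather than documented behavior.

```typescript
// Maps the HTTP status codes listed above to the suggested resolution.
function explainStatus(status: number): string {
  switch (status) {
    case 400:
      return "Bad Request: review input parameters for missing or invalid values.";
    case 401:
      return "Unauthorized: check your API key configuration.";
    case 404:
      return "Voice Not Found: confirm the Voice ID exists and is accessible.";
    default:
      // Not covered by this document; surface the raw code for further investigation.
      return "Unexpected status " + status + ": consult the ElevenLabs API response body.";
  }
}
```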
