ElevenLabs

WIP

Overview

The Speech to Speech operation in the "ElevenLabs" n8n node converts input speech into synthesized speech in a selected voice. This is useful for voice cloning, automated voice responses, or restyling spoken content in another voice. Common scenarios include creating dynamic audio content, automating customer support with personalized voices, and generating voiceovers for media.

Practical examples:

Converting a recorded message into a professional-sounding voice for podcasts.
Generating multilingual voice responses in customer service bots.
Cloning a specific voice for accessibility tools.

Properties

Below are the supported input properties for the Speech to Speech operation:

Display Name	Type	Meaning
Notice	notice	Informational message about the node's beta status and links for more info/support.
Voice ID	resourceLocator	Specifies the target voice for synthesis. Can be chosen from a list or entered manually by ID.
Additional Fields	collection	Optional settings to customize output and processing (see below).

Additional Fields options:

Binary Name (string): Sets the name of the binary output property (default: data).
File Name (string): Sets the output file name (default: voice).
Streaming Latency (number): Sets the latency optimization level (0–4); higher values favor lower latency and may reduce quality.
Output Format (string): Specifies the audio format (e.g., mp3_44100_128).
Model Name or ID (options): Selects the model used for synthesis.
Stability (number): Controls voice stability (0–1).
Similarity Boost (number): Adjusts similarity to the original voice (0–1).
Stitching (boolean): Enables or disables audio stitching (default: true).
Style (number): Exaggerates the voice style (0–1).
Speaker Boost (boolean): Activates speaker boost (default: false).
Seed (number): Sets a fixed seed for reproducibility.
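
The defaults and ranges listed above can be sketched as a small helper that fills in defaults and checks values before a run. This is a minimal sketch, not the node's actual code: the AdditionalFields interface, its camelCase property names, and the applyDefaults and validateFields functions are hypothetical names chosen for illustration. Only the defaults and ranges come from the list above.

```typescript
// Hypothetical shape for the Additional Fields collection. Property names are
// illustrative and may not match the node's internal parameter names.
interface AdditionalFields {
  binaryName?: string;
  fileName?: string;
  streamingLatency?: number;
  outputFormat?: string;
  stability?: number;
  similarityBoost?: number;
  stitching?: boolean;
  style?: number;
  speakerBoost?: boolean;
  seed?: number;
}

// Applies the documented defaults; fields without a documented default stay unset.
function applyDefaults(fields: AdditionalFields): AdditionalFields {
  return {
    binaryName: "data",
    fileName: "voice",
    stitching: true,
    speakerBoost: false,
    ...fields,
  };
}

// Returns a list of problems; an empty list means every provided value
// falls inside its documented range.
function validateFields(fields: AdditionalFields): string[] {
  const problems: string[] = [];
  const inRange = (value: number | undefined, min: number, max: number): boolean =>
    value === undefined || (value >= min && value <= max);

  if (!inRange(fields.streamingLatency, 0, 4)) {
    problems.push("Streaming Latency must be between 0 and 4");
  }
  if (!inRange(fields.stability, 0, 1)) {
    problems.push("Stability must be between 0 and 1");
  }
  if (!inRange(fields.similarityBoost, 0, 1)) {
    problems.push("Similarity Boost must be between 0 and 1");
  }
  if (!inRange(fields.style, 0, 1)) {
    problems.push("Style must be between 0 and 1");
  }
  return problems;
}

// Usage: only Stability is set explicitly; the rest comes from the defaults.
const checkedFields = applyDefaults({ stability: 0.5 });
const fieldProblems = validateFields(checkedFields);
```

Returning a list of problems rather than throwing on the first one lets a workflow report every out-of-range value at once.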

Output

The node outputs a JSON object containing metadata about the generated speech.
When audio is produced, it is attached as binary data under the configured binary property name (default: data). This binary field contains the synthesized audio in the selected output format (e.g., MP3).
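
To make the output shape concrete, the sketch below models one output item and looks up the audio under the configured binary property name. It assumes n8n's usual item layout, in which each binary entry carries base64-encoded data together with a mimeType and an optional fileName. The getAudio helper, the empty json object, and the voice.mp3 file name are hypothetical, since this page does not specify which metadata fields the node returns.

```typescript
// Simplified model of an n8n item; the real interfaces carry more optional fields.
interface BinaryEntry {
  data: string; // base64-encoded file contents
  mimeType: string;
  fileName?: string;
}

interface OutputItem {
  json: Record<string, unknown>;
  binary?: Record<string, BinaryEntry>;
}

// Looks up the audio under the given binary property name (default: "data").
function getAudio(item: OutputItem, binaryName: string = "data"): BinaryEntry {
  const entry = item.binary?.[binaryName];
  if (!entry) {
    throw new Error(`No binary property named "${binaryName}" on this item`);
  }
  return entry;
}

// Illustrative item only: the json metadata is left empty because its fields
// are not documented here, and the base64 string is a short placeholder.
const exampleItem: OutputItem = {
  json: {},
  binary: {
    data: { data: "SUQz", mimeType: "audio/mpeg", fileName: "voice.mp3" },
  },
};

const audioEntry = getAudio(exampleItem);
```

If Binary Name is changed in Additional Fields, the same name has to be passed as the second argument, otherwise the lookup fails.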

Dependencies

External Service: Requires access to the ElevenLabs API.
API Key: You must configure the elevenLabsApi credential in n8n.
Environment: Internet connectivity is required for API requests.
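
For orientation, the sketch below assembles the kind of HTTP request this operation depends on, without sending anything over the network. It assumes the node calls the public ElevenLabs endpoint POST /v1/speech-to-speech/{voice_id}, passes the key in an xi-api-key header, and sends output_format and optimize_streaming_latency as query parameters. The node's real implementation may differ, and buildRequest and its option names are hypothetical.

```typescript
// Hypothetical option names; only a subset of the node's fields is shown.
interface RequestOptions {
  voiceId: string;
  apiKey: string;
  outputFormat?: string; // e.g. "mp3_44100_128"
  streamingLatency?: number; // 0–4
}

interface RequestDescription {
  method: "POST";
  url: string;
  headers: Record<string, string>;
}

// Builds a description of the request; it does not perform the call.
function buildRequest(opts: RequestOptions): RequestDescription {
  const query: string[] = [];
  if (opts.outputFormat !== undefined) {
    query.push(`output_format=${encodeURIComponent(opts.outputFormat)}`);
  }
  if (opts.streamingLatency !== undefined) {
    query.push(`optimize_streaming_latency=${opts.streamingLatency}`);
  }

  // Assumed base URL of the public ElevenLabs API.
  const base = `https://api.elevenlabs.io/v1/speech-to-speech/${encodeURIComponent(opts.voiceId)}`;

  return {
    method: "POST",
    url: query.length > 0 ? `${base}?${query.join("&")}` : base,
    headers: { "xi-api-key": opts.apiKey },
  };
}

// Usage with placeholder values.
const exampleRequest = buildRequest({
  voiceId: "YOUR_VOICE_ID",
  apiKey: "YOUR_API_KEY",
  outputFormat: "mp3_44100_128",
});
```

The input audio and voice settings would travel in the request body, which is omitted here. Inside n8n the elevenLabsApi credential supplies the key, so it should not be hard-coded in a workflow.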

Troubleshooting

Common issues:

Invalid API Key: Ensure your ElevenLabs API credentials are correctly set up in n8n.
Voice ID not found: Double-check the Voice ID or select from the provided list.
Audio not generated: Verify that all required fields are filled and the input speech is valid.
Unsupported output format: Make sure the selected output format is supported by ElevenLabs.

Error messages and resolutions:

401 Unauthorized: Check your API key configuration.
404 Voice Not Found: Confirm the Voice ID exists and is accessible.
400 Bad Request: Review input parameters for missing or invalid values.
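
When failures are handled downstream, for example in an error-handling branch of a workflow, the status codes above can be encoded as a simple lookup. This is a minimal sketch: resolutionFor and the RESOLUTIONS table are hypothetical helpers, and the hint texts are copied from the list above.

```typescript
// Status codes and hints taken from the list above.
const RESOLUTIONS: Record<number, string> = {
  400: "Review input parameters for missing or invalid values.",
  401: "Check your API key configuration.",
  404: "Confirm the Voice ID exists and is accessible.",
};

// Returns the documented hint, or a generic message for undocumented codes.
function resolutionFor(status: number): string {
  return RESOLUTIONS[status] ?? `No documented resolution for status ${status}.`;
}

const unauthorizedHint = resolutionFor(401);
```

Undocumented codes fall through to a generic message instead of throwing, so the lookup is safe to call on any status.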