Zalo User Interact

Send messages and interact with Zalo User

Overview

This node provides Text-to-Speech (TTS) functionality: it converts input text into spoken audio using a selectable voice and adjustable speech parameters. It is useful wherever speech must be generated automatically, such as voice messages, accessibility features, or audio content.

Typical use cases include:

  • Generating audio announcements from text.
  • Creating voiceovers for videos or presentations.
  • Automating customer service responses with natural-sounding voices.
  • Accessibility tools that read text aloud.

For example, a user can input a Vietnamese sentence and select a Vietnamese male neural voice to generate an audio file of the spoken text, adjusting speed, pitch, and volume to suit their needs.

Properties

Name    Meaning
Text    The text string to be converted into speech.
Voice   The voice used for TTS. Options are loaded dynamically and represent different voice models (e.g., "vi-VN-NamMinhNeural").
Rate    Speech rate adjustment. "0%" is normal speed; positive percentages speed the speech up and negative percentages slow it down.
Volume  Volume adjustment of the speech output. "0%" is the default volume; positive percentages raise it and negative percentages lower it.
Pitch   Pitch adjustment of the voice. "0Hz" is the default pitch; positive values raise it and negative values lower it.
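
The value formats in the table above can be checked before a request is sent. The following is a minimal sketch in Python, assuming that rate and volume are signed integer percentages (such as "0%" or "+10%") and that pitch is a signed integer in hertz (such as "0Hz" or "-5Hz"). The exact grammar accepted by the underlying TTS service is not documented here, and the function name `validate_tts_params` is hypothetical.

```python
import re

# Assumed formats, inferred only from the examples "0%" and "0Hz" in the table.
PERCENT_RE = re.compile(r"[+-]?\d+%")
HERTZ_RE = re.compile(r"[+-]?\d+Hz")


def validate_tts_params(text, voice, rate="0%", volume="0%", pitch="0Hz"):
    """Return a list of problems found; an empty list means the input looks valid."""
    problems = []
    if not text or not text.strip():
        problems.append("text must not be empty")
    if not voice:
        problems.append("voice must be selected")
    if not PERCENT_RE.fullmatch(rate):
        problems.append(f"rate {rate!r} is not a percentage such as '+10%'")
    if not PERCENT_RE.fullmatch(volume):
        problems.append(f"volume {volume!r} is not a percentage such as '-20%'")
    if not HERTZ_RE.fullmatch(pitch):
        problems.append(f"pitch {pitch!r} is not a hertz value such as '-5Hz'")
    return problems


# Example: a Vietnamese sentence with a slightly faster rate passes the checks.
print(validate_tts_params("Xin chào", "vi-VN-NamMinhNeural", rate="+10%"))
```

Returning a list of problems rather than raising on the first one lets a workflow report every invalid field in a single pass.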

Output

The node outputs an item that carries the generated speech. The key output is a binary field holding the audio file of the synthesized speech. This binary data can be saved or passed to other nodes for further processing, such as uploading to storage or playback.

The output structure includes:

  • json: Metadata about the operation result.
  • binary: Contains the audio data of the generated speech, typically in a standard audio format (e.g., WAV or MP3).
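
As a rough illustration of that structure, the sketch below builds and reads back a dictionary shaped like an n8n output item, in which a binary property carries base64-encoded data together with a MIME type and file name. The property name `data`, the metadata keys under `json`, and the MP3 format are assumptions made for illustration; the node's actual field names are not given in this documentation.

```python
import base64


def make_item(audio_bytes, voice):
    """Build a dict shaped like an n8n item: JSON metadata plus one binary property."""
    return {
        "json": {"voice": voice, "size": len(audio_bytes)},  # hypothetical metadata keys
        "binary": {
            "data": {  # assumed binary property name
                "data": base64.b64encode(audio_bytes).decode("ascii"),
                "mimeType": "audio/mpeg",  # assumed MP3 output
                "fileName": "speech.mp3",  # hypothetical file name
            }
        },
    }


def read_audio(item, prop="data"):
    """Decode the base64 payload of a binary property back into raw bytes."""
    return base64.b64decode(item["binary"][prop]["data"])


# Placeholder bytes stand in for real audio returned by the TTS service.
item = make_item(b"\x00\x01fake-audio", "vi-VN-NamMinhNeural")
audio = read_audio(item)
```

A downstream step that uploads or stores the audio would work on the decoded bytes, while the `json` part remains available for routing or logging.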

Dependencies

  • Requires access to an external Text-to-Speech API service that supports multiple voices and speech parameter adjustments.
  • Needs an API authentication token or API key credential configured in n8n to authorize requests to the TTS service.
  • Uses Node.js modules for HTTP requests and buffer handling.
  • May require filesystem access if temporary files are created during processing.

Troubleshooting

  • Common Issues:

    • Invalid or missing API credentials will cause authentication errors.
    • Unsupported voice names or invalid parameter formats may lead to request failures.
    • Network issues can prevent communication with the TTS service.
    • Large text inputs might exceed API limits or cause timeouts.
  • Error Messages:

    • Authentication errors: Check that the API key or token is correctly set up in n8n credentials.
    • Parameter validation errors: Ensure that text is provided and voice/rate/volume/pitch values conform to expected formats.
    • Timeout or network errors: Verify internet connectivity and API endpoint availability.
  • Resolution Tips:

    • Validate all input properties before execution.
    • Use smaller text chunks if large texts fail.
    • Confirm voice options by loading available voices via the node's voice selection method.
    • Review API documentation for limits and supported parameters.
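
The tip about smaller text chunks can be sketched as follows. This is a minimal Python example that splits text at sentence boundaries so that each chunk stays under a character limit. The function name `split_text` and the default limit of 500 characters are assumptions; the real limit depends on the TTS service in use.

```python
import re


def split_text(text, max_chars=500):
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    # Break after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


# Each chunk can then be sent through the node as a separate item.
for chunk in split_text("One. Two. Three.", max_chars=9):
    print(chunk)
```

Splitting on sentence boundaries keeps pauses natural when the resulting audio segments are played back to back.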

Note: The source code was heavily obfuscated, but the core logic and property usage were extracted based on static analysis and the provided property definitions.