PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Actions: 80

Overview

The node converts PDF documents to Word format (.docx). The source PDF can be supplied as binary data from a previous node, as a base64-encoded string, or as a direct URL to the file. The conversion can be customized with an output file name, a quality setting (Draft or Quality), an OCR language for scanned PDFs, and advanced options such as polling retries and whether OCR is applied.

This node is beneficial in scenarios where automated workflows require extracting editable text and formatting from PDFs, such as document editing, content repurposing, or archiving scanned documents in an editable format. For example, it can be used to convert contracts received as PDFs into Word documents for further editing or to extract text from scanned reports using OCR.

Properties

Name	Meaning
Input Data Type	Choose how to provide the PDF file to convert. Options: Binary Data (from previous node), Base64 String (PDF content encoded as base64), URL (link to PDF file).
Input Binary Field	Name of the binary property containing the PDF file when using Binary Data input type (usually "data").
Base64 PDF Content	Base64 encoded string of the PDF document content, used when Input Data Type is Base64 String.
PDF URL	URL pointing to the PDF file to convert, used when Input Data Type is URL.
Output File Name	Desired name for the output Word document file (e.g., "converted_document.docx").
Document Name	Name of the source PDF file for reference purposes (e.g., "original-file.pdf").
Quality Type	Conversion quality setting. Options: Draft (faster, suitable for simple PDFs with clear text), Quality (slower but more accurate, better for complex layouts).
OCR Language	Language used for Optical Character Recognition on images or scanned PDFs. Supported languages include Arabic, Chinese (Simplified/Traditional), Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Portuguese, Russian, Spanish, Swedish.
Advanced Options	Collection of additional settings: • Custom Profiles: JSON string to adjust custom API properties. • Max Retries: Maximum polling attempts for async processing. • Merge All Sheets: Combine multiple pages into one flow. • Preserve Output Format: Keep original formatting if possible. • Retry Delay (Seconds): Base delay between polling attempts. • Use OCR When Needed: Enable OCR for scanned PDFs.
Binary Data Output Name	Custom name for the binary data field in the node's output (default is "data").
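
The relationship between Input Data Type and the field it requires can be checked before a workflow runs. The following is a minimal sketch, not the node's own validation logic: the parameter keys (`inputDataType`, `binaryPropertyName`, `base64Content`, `pdfUrl`, `qualityType`) and the function `validate_params` are hypothetical names chosen for illustration, and the node's internal names may differ.

```python
import base64
import binascii
from urllib.parse import urlparse

# Hypothetical parameter keys; the node's real internal names may differ.
REQUIRED_FIELD = {
    "binaryData": "binaryPropertyName",
    "base64": "base64Content",
    "url": "pdfUrl",
}


def validate_params(params: dict) -> list:
    """Return a list of problems found in a parameter dict (empty if none)."""
    input_type = params.get("inputDataType")
    if input_type not in REQUIRED_FIELD:
        return [f"unknown input data type: {input_type!r}"]

    field = REQUIRED_FIELD[input_type]
    value = params.get(field)
    if not value:
        return [f"{field} is required when input data type is {input_type}"]

    problems = []
    if input_type == "base64":
        try:
            raw = base64.b64decode(value, validate=True)
        except (binascii.Error, ValueError):
            problems.append("base64Content is not valid base64")
        else:
            # PDF files normally start with the signature "%PDF-".
            if not raw.startswith(b"%PDF-"):
                problems.append("decoded content does not look like a PDF")
    elif input_type == "url":
        parsed = urlparse(value)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append("pdfUrl must be an absolute http(s) URL")

    # Quality Type offers exactly two options: Draft and Quality.
    if params.get("qualityType", "Draft") not in ("Draft", "Quality"):
        problems.append("qualityType must be Draft or Quality")
    return problems
```

An empty list means the parameters are internally consistent. It does not guarantee that the URL is reachable or that the service accepts the file.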

Output

The node outputs the converted Word document as binary data under the specified binary data output name (default "data"). The output includes the Word file content ready for downstream nodes to use, such as saving to disk, sending via email, or further processing. The JSON output may also contain metadata about the conversion process or errors if any occurred.
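
As an illustration of consuming this output downstream, the sketch below decodes the Word file from an item and checks that it is a ZIP container, which every .docx file is. It assumes the item has the shape `{"json": {...}, "binary": {"data": {"data": "<base64>"}}}` with the file content stored as a base64 string. That shape is an assumption about how the item reaches your code, and `extract_docx` is a hypothetical helper, not part of the node.

```python
import base64

# .docx files are ZIP archives, which begin with these four bytes.
DOCX_MAGIC = b"PK\x03\x04"


def extract_docx(item: dict, binary_name: str = "data") -> bytes:
    """Decode the Word file from an item and sanity-check its signature."""
    entry = item.get("binary", {}).get(binary_name)
    if entry is None:
        raise KeyError(f"no binary field named {binary_name!r} in item")
    content = base64.b64decode(entry["data"])
    if not content.startswith(DOCX_MAGIC):
        raise ValueError("binary content is not a ZIP container, so not a .docx")
    return content
```

If Binary Data Output Name was changed from its default, pass the same name as `binary_name`.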

Dependencies

- Access to the PDF4me API, which performs the PDF-to-Word conversion.
- API authentication (an API key or token) configured in n8n credentials.
- Network access to fetch the PDF file when using the URL input type.
- OCR is optional and depends on the languages and features the service supports.

Troubleshooting

Common Issues:
- Incorrect input data type or missing binary property name can cause failures.
- Invalid or inaccessible PDF URLs will result in download errors.
- Large or complex PDFs might require increasing max retries or retry delay.
- OCR may fail if the selected language does not match the document's language.

Error Messages:
- "File not found" or "Unable to download PDF" indicates a problem with the URL.
- "Invalid base64 content" suggests a malformed input string.
- "Conversion failed" may indicate unsupported PDF features or a service error.

Resolutions:
- Verify input data matches the selected input type.
- Check network connectivity and URL validity.
- For large or complex PDFs, increase Max Retries or Retry Delay; for scanned PDFs, enable Use OCR When Needed.
- Ensure the correct OCR language is selected for scanned documents.
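
To see why raising Max Retries or Retry Delay helps with slow conversions, consider a generic polling loop. This is a sketch under stated assumptions, not the node's implementation: `poll_until_done`, the `check` callback, and the default values are hypothetical, and the delay here is fixed, whereas the node describes Retry Delay as a base delay and may scale it between attempts.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[bytes]],
    max_retries: int = 10,       # hypothetical default
    retry_delay: float = 3.0,    # hypothetical default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call check() up to max_retries times, sleeping between attempts."""
    for attempt in range(1, max_retries + 1):
        result = check()
        if result is not None:
            return result
        # No need to wait after the final attempt.
        if attempt < max_retries:
            sleep(retry_delay)
    raise TimeoutError(f"conversion not finished after {max_retries} attempts")
```

In this sketch the longest possible wait is `(max_retries - 1) * retry_delay` seconds, so a conversion that needs longer than that fails with a timeout even though the service may still finish the job.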

Links and References

PDF4me API Documentation
General information on OCR languages and support can be found in the API docs linked above.