Overview
The node provides functionality to convert PDF documents into Word format (.docx). It supports multiple input methods for the source PDF, including binary data from a previous node, base64 encoded strings, or a direct URL to the PDF file. The conversion process can be customized with options such as output file naming, quality settings (draft or high quality), OCR language selection for scanned PDFs, and advanced options like retry behavior and OCR usage.
This node is beneficial in scenarios where automated workflows require extracting editable text and formatting from PDFs, such as document editing, content repurposing, or archiving scanned documents in an editable format. For example, it can be used to convert contracts received as PDFs into Word documents for further editing or to extract text from scanned reports using OCR.
Properties
| Name | Meaning |
|---|---|
| Input Data Type | Choose how to provide the PDF file to convert. Options: Binary Data (from previous node), Base64 String (PDF content encoded as base64), URL (link to PDF file). |
| Input Binary Field | Name of the binary property containing the PDF file when using Binary Data input type (usually "data"). |
| Base64 PDF Content | Base64 encoded string of the PDF document content, used when Input Data Type is Base64 String. |
| PDF URL | URL pointing to the PDF file to convert, used when Input Data Type is URL. |
| Output File Name | Desired name for the output Word document file (e.g., "converted_document.docx"). |
| Document Name | Name of the source PDF file for reference purposes (e.g., "original-file.pdf"). |
| Quality Type | Conversion quality setting. Options: Draft (faster, suitable for simple PDFs with clear text), Quality (slower but more accurate, better for complex layouts). |
| OCR Language | Language used for Optical Character Recognition on images or scanned PDFs. Supported languages include Arabic, Chinese (Simplified/Traditional), Danish, Dutch, English, Finnish, French, German, Italian, Japanese, Korean, Norwegian, Portuguese, Russian, Spanish, Swedish. |
| Advanced Options | Collection of additional settings: • Custom Profiles: JSON string to adjust custom API properties. • Max Retries: Maximum polling attempts for async processing. • Merge All Sheets: Combine multiple pages into one flow. • Preserve Output Format: Keep original formatting if possible. • Retry Delay (Seconds): Base delay between polling attempts. • Use OCR When Needed: Enable OCR for scanned PDFs. |
| Binary Data Output Name | Custom name for the binary data field in the node's output (default is "data"). |
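The three Input Data Type options can be illustrated with a small pre-flight check. The sketch below is not the node's own code: the function name `build_pdf_input`, the option keys (`binaryData`, `base64`, `url`), and the returned dictionary shape are all hypothetical, chosen only to show how each input type might be validated before a conversion request is made.

```python
import base64
from typing import Optional
from urllib.parse import urlparse


def build_pdf_input(
    input_type: str,
    *,
    binary: Optional[bytes] = None,
    base64_content: Optional[str] = None,
    url: Optional[str] = None,
) -> dict:
    """Normalize the PDF source for the chosen input type (hypothetical helper)."""
    if input_type == "binaryData":
        if not binary:
            raise ValueError("Binary Data selected, but no binary content was provided")
        # Binary content is encoded so it can travel inside a JSON payload.
        return {"source": "base64", "content": base64.b64encode(binary).decode("ascii")}

    if input_type == "base64":
        if not base64_content:
            raise ValueError("Base64 String selected, but the content is empty")
        # validate=True rejects characters outside the base64 alphabet;
        # a malformed string raises binascii.Error, a subclass of ValueError.
        base64.b64decode(base64_content, validate=True)
        return {"source": "base64", "content": base64_content}

    if input_type == "url":
        parsed = urlparse(url or "")
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"Not a usable PDF URL: {url!r}")
        return {"source": "url", "content": url}

    raise ValueError(f"Unknown input data type: {input_type!r}")
```

Checking the input before the request is sent turns the most common configuration mistakes (an empty binary property, a malformed base64 string, a URL without a scheme) into immediate, descriptive errors instead of failures reported by the remote service.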
Output
The node outputs the converted Word document as binary data under the specified binary data output name (default "data"). The output includes the Word file content ready for downstream nodes to use, such as saving to disk, sending via email, or further processing. The JSON output may also contain metadata about the conversion process or errors if any occurred.
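As an illustration of how a downstream step might consume this output, the sketch below assumes the item follows the usual n8n layout: a `json` object plus a `binary` object whose entry (here under the default name "data") holds a base64-encoded `data` string and a `fileName`. The helper `save_word_output` and the fallback file name are hypothetical; the actual fields returned by the node may differ.

```python
import base64
from pathlib import Path


def save_word_output(item: dict, directory: str, binary_field: str = "data") -> Path:
    """Decode the converted document from an item and write it to a directory."""
    entry = item["binary"][binary_field]
    # Fall back to a generic name if the item carries no file name (assumption).
    file_name = entry.get("fileName", "converted_document.docx")
    # Path(...).name drops any directory components from the file name.
    target = Path(directory) / Path(file_name).name
    target.write_bytes(base64.b64decode(entry["data"]))
    return target
```

If the Binary Data Output Name property is changed from its default, the same name would be passed as `binary_field`.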
Dependencies
- Requires access to the PDF4me API, the external service that performs the PDF-to-Word conversion.
- Needs proper API authentication configured in n8n credentials (an API key or token).
- Network access to fetch PDF files if using URL input type.
- Optional OCR capabilities depend on supported languages and service features.
Troubleshooting
- Common Issues:
  - Incorrect input data type or missing binary property name can cause failures.
  - Invalid or inaccessible PDF URLs will result in download errors.
  - Large or complex PDFs might require increasing max retries or retry delay.
  - OCR may fail if the selected language does not match the document's language.
- Error Messages:
  - "File not found" or "Unable to download PDF" indicates URL issues.
  - "Invalid base64 content" suggests malformed input string.
  - "Conversion failed" may indicate unsupported PDF features or service errors.
- Resolutions:
  - Verify input data matches the selected input type.
  - Check network connectivity and URL validity.
  - Adjust advanced options like retries and OCR usage.
  - Ensure correct OCR language is selected for scanned documents.
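To see why raising Max Retries or Retry Delay helps with large PDFs, it can be useful to picture the polling loop those options control. The following is a minimal sketch, not the node's implementation: `poll_until_done`, the `check` callback, the default values, and the fixed (non-growing) delay are assumptions made for illustration.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    check: Callable[[], Optional[bytes]],
    max_retries: int = 10,
    retry_delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call `check` until it returns a result or the attempts run out.

    `check` stands in for one status request to the conversion service:
    it returns the converted file when ready, or None while still processing.
    """
    for attempt in range(1, max_retries + 1):
        result = check()
        if result is not None:
            return result
        # Wait before the next attempt, but not after the final one.
        if attempt < max_retries:
            sleep(retry_delay)
    raise TimeoutError(f"Conversion not finished after {max_retries} polling attempts")
```

In this sketch the waiting time before giving up is roughly `(max_retries - 1) * retry_delay` seconds plus the time spent in the status requests, so a conversion that needs longer than that calls for increasing one of the two values.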
Links and References
- PDF4me API Documentation
- General information on OCR languages and support can be found in the API docs linked above.