PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Actions: 80

Overview

The node provides a "Parse Document" operation that extracts structured data from documents using a specified parsing configuration. It supports multiple input methods for the document, including binary data from previous nodes, base64-encoded strings, or URLs pointing to the document file. The parsed output can be returned either as JSON data or as a text file.

This node is beneficial in scenarios where automated extraction of information from PDFs or other document formats is needed, such as invoice processing, contract analysis, or form data extraction. For example, a user could upload an invoice PDF and use this node to parse key fields like invoice number, date, and total amount into JSON for further workflow automation.
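For the Base64 String input method mentioned above, the document bytes have to be encoded before they reach the node. The following is a minimal Python sketch of that step. It assumes the node accepts standard base64 text without a data-URI prefix, which this page does not confirm, and the function names are illustrative only.

```python
import base64


def encode_document(content: bytes) -> str:
    """Return standard base64 text for raw document bytes."""
    return base64.b64encode(content).decode("ascii")


def decode_document(encoded: str) -> bytes:
    """Inverse of encode_document; validate=True rejects non-base64 characters."""
    return base64.b64decode(encoded, validate=True)


# Stand-in bytes; in practice these would be read from the document file.
sample = b"%PDF-1.7 sample"
encoded = encode_document(sample)
```

Round-tripping the value through decode_document before pasting it into Base64 Document Content is a cheap way to catch truncated or corrupted strings.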

Properties

Name	Meaning
Input Data Type	Choose how to provide the document to parse. Options: Binary Data (document file from previous node), Base64 String (base64 encoded document content), URL (link to document file).
Input Binary Field	Name of the binary property containing the document file (used only if Input Data Type is Binary Data).
Base64 Document Content	Base64 encoded content of the document (used only if Input Data Type is Base64 String).
Document URL	URL to the document file to parse (used only if Input Data Type is URL).
Document Name	Name of the source document file for reference (e.g., "original-document.pdf").
Parse ID	GUID of the parse configuration to use; also serves as the Template ID for parsing rules.
Output Format	Format for the parsed document output. Options: JSON (parsed data as JSON), Text File (parsed data as a text file).
Output File Name	Name for the output file when Output Format is set to Text File (e.g., "my-parsed-document.txt").
Advanced Options	Collection of additional options. Currently supports "Custom Profiles", a JSON value that sets custom properties on the API call (for example, extra parsing options described in the PDF4me Developer Profiles documentation).
Binary Data Output Name	Custom name for the binary data field in the node's output (default is "data").
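The conditional properties in the table can be summarized as a small check: each Input Data Type depends on exactly one source property. The sketch below is an illustration, not the node's own validation logic. It uses the display names from the table as dictionary keys (the node's internal parameter names may differ) and assumes Parse ID is always required.

```python
# Which source property each Input Data Type depends on, per the table above.
# Keys and values are display names; internal parameter keys may differ.
REQUIRED_BY_INPUT_TYPE = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 Document Content",
    "URL": "Document URL",
}


def missing_properties(params: dict) -> list:
    """Return the names of properties that are needed but empty or absent."""
    input_type = params.get("Input Data Type")
    if input_type not in REQUIRED_BY_INPUT_TYPE:
        return ["Input Data Type"]
    # Assumption: Parse ID is mandatory for every input type.
    required = [REQUIRED_BY_INPUT_TYPE[input_type], "Parse ID"]
    return [name for name in required if not params.get(name)]
```

For example, a configuration with Input Data Type set to URL but an empty Document URL would be reported as missing "Document URL".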

Output

json: Contains the parsed document data. If the output format is JSON, this will be structured data extracted from the document according to the parse configuration.
binary: If the output format is a text file, the parsed content is provided as binary data with the specified output file name. This allows downstream nodes to handle the parsed text as a downloadable or storable file.
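A downstream consumer has to handle both output shapes. The sketch below shows one way to do that in Python. It assumes the item layout commonly used by n8n, where a binary property holds base64 text under a "data" key, and it assumes the text file is UTF-8 encoded. Neither detail is stated on this page, and the function name is hypothetical.

```python
import base64


def read_parsed_output(item: dict, binary_name: str = "data"):
    """Return ("text", str) if the item carries a text file, else ("json", dict).

    binary_name corresponds to the Binary Data Output Name property.
    """
    binary = item.get("binary") or {}
    if binary_name in binary:
        # Assumption: the file content is base64 text under the "data" key.
        raw = base64.b64decode(binary[binary_name]["data"])
        return "text", raw.decode("utf-8")
    return "json", item.get("json", {})
```

If Binary Data Output Name was changed from its default, the same name has to be passed as binary_name.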

Dependencies

Access to the PDF4me API, which uses the provided Parse ID (the parse configuration GUID) to interpret the document.
Valid API credentials for the parsing service, configured in n8n.
Network access to the document's location when Input Data Type is URL.

Troubleshooting

Common issues:
- Incorrect or missing Parse ID may cause parsing failures or unexpected results.
- Providing an invalid or inaccessible URL will result in errors fetching the document.
- Mismatch between Input Data Type and provided data (e.g., selecting Binary Data but no binary input present) will cause errors.
- Choosing an Output Format that downstream nodes do not expect (for example, Text File when the next node reads JSON fields), or an unexpected output file name, can break downstream processing.
Error messages:
- Errors related to document retrieval (e.g., network errors, 404 not found) indicate problems accessing the document URL.
- Parsing errors often relate to invalid parse configurations or unsupported document formats.
- Authentication errors suggest missing or invalid API credentials.
Resolutions:
- Verify the Parse ID is correct and corresponds to a valid parsing template.
- Ensure the document URL is accessible and correct.
- Confirm the input data matches the selected Input Data Type.
- Check API credentials and permissions in n8n settings.
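Some of these checks can be run before the node executes. The sketch below is a minimal Python pre-flight check under stated assumptions: Parse ID is treated as a GUID in the usual hyphenated form, and a document URL is treated as usable only if it is an absolute http or https URL. It does not contact the parsing service or test whether the URL is reachable, and all function names are illustrative.

```python
import uuid
from urllib.parse import urlparse


def is_guid(value) -> bool:
    """True if value is a GUID in the hyphenated 36-character form."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (ValueError, AttributeError, TypeError):
        return False


def is_fetchable_url(value: str) -> bool:
    """True if value is an absolute http(s) URL with a host part."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def preflight(parse_id, input_type: str, document_url: str = "") -> list:
    """Return a list of human-readable problems; empty means no issue found."""
    problems = []
    if not is_guid(parse_id):
        problems.append("Parse ID is not a valid GUID")
    if input_type == "URL" and not is_fetchable_url(document_url):
        problems.append("Document URL is not an absolute http(s) URL")
    return problems
```

A passing check only rules out malformed values. A well-formed Parse ID can still point to a missing template, and a well-formed URL can still return a 404.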

Links and References

PDF4me Developer Profiles Documentation: for configuring custom profiles in Advanced Options.
General API documentation for the external parsing service (not included here, but typically available from the service provider).