PDF-LIB

Perform operations on PDF files (get info, split)

Actions

- Get PDF Info
- Split PDF

Overview

This node performs operations on PDF files, specifically extracting information and splitting PDFs into smaller chunks. It is useful when you need to analyze PDF documents or divide large PDFs into manageable parts for further processing or distribution.

Common scenarios include:

- Extracting the total number of pages from a PDF to decide subsequent workflow steps.
- Splitting a large PDF into smaller files, each containing a specified number of pages, for easier handling or sending via email.

For example, you might use this node to split a 100-page report into 10 separate PDFs with 10 pages each, or to get the page count of an uploaded PDF before deciding whether to archive or process it.

Properties

Name	Meaning
Binary Property	The name of the binary property that contains the PDF file to be processed.
Chunk Size	Number of pages per chunk when splitting the PDF. Only applicable for the "Split PDF" operation.

Output

The output JSON structure depends on the selected operation:

Get PDF Info:
- pageCount: Number of pages in the PDF.
- operation: The string "getInfo".
- fileName: Original file name of the PDF or "unknown.pdf" if not available.
Split PDF:
- count: Number of PDF chunks created.
- pageRanges: Array of strings indicating the page ranges for each chunk (e.g., "1-5", "6-10").
- operation: The string "split".
- originalFileName: Original file name of the PDF or "unknown.pdf" if not available.
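
As a rough illustration of the Split PDF fields, the sketch below derives `count` and `pageRanges` from a page count and a chunk size. The names `planSplit` and `SplitSummary` are hypothetical and not taken from the node's source. The sketch assumes 1-based, inclusive ranges in the "1-5", "6-10" style shown above, and it only plans the chunks without reading any PDF data.

```typescript
// Hypothetical type mirroring the JSON fields documented for "Split PDF".
interface SplitSummary {
  count: number;
  pageRanges: string[];
  operation: "split";
  originalFileName: string;
}

// Sketch only: computes the page ranges a split would produce.
function planSplit(pageCount: number, chunkSize: number, fileName?: string): SplitSummary {
  // Guard against values that would make the loop below misbehave.
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error("chunkSize must be a whole number of at least 1");
  }
  const pageRanges: string[] = [];
  for (let start = 1; start <= pageCount; start += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    const end = Math.min(start + chunkSize - 1, pageCount);
    pageRanges.push(`${start}-${end}`);
  }
  return {
    count: pageRanges.length,
    pageRanges,
    operation: "split",
    originalFileName: fileName ?? "unknown.pdf",
  };
}
```

Under these assumptions, `planSplit(100, 10)` describes ten chunks, which matches the 100-page report example in the overview.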

Additionally, for the split operation, the node outputs binary data for each chunk under keys like pdf1, pdf2, etc. Each binary entry includes:

- data: Base64 encoded PDF chunk.
- fileName: Generated file name such as split_1.pdf.
- mimeType: Always "application/pdf".
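
To make the binary layout concrete, here is a minimal sketch that assembles the `pdf1`, `pdf2`, ... entries from chunks that are already Base64 encoded. `BinaryEntry` and `buildSplitBinary` are hypothetical names used only for illustration. The node's real binary objects may carry additional fields not listed here.

```typescript
// Hypothetical type covering only the three fields documented above.
interface BinaryEntry {
  data: string; // Base64 encoded PDF chunk
  fileName: string;
  mimeType: "application/pdf";
}

// Sketch only: maps each encoded chunk to a key of the form pdf<N>, starting at 1.
function buildSplitBinary(base64Chunks: string[]): Record<string, BinaryEntry> {
  const binary: Record<string, BinaryEntry> = {};
  base64Chunks.forEach((data, index) => {
    const n = index + 1;
    binary[`pdf${n}`] = {
      data,
      fileName: `split_${n}.pdf`,
      mimeType: "application/pdf",
    };
  });
  return binary;
}
```

A downstream node would then read a chunk by setting its Binary Property to the matching key, for example `pdf2` for the second chunk.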

Dependencies

- Uses the pdf-lib library (bundled internally) to load, read, and manipulate PDF files.
- Reads PDF files either from the local filesystem path indicated by the binary data metadata or directly from the binary data buffer.
- Requires the input item to contain binary data with the PDF file under the specified binary property.
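
The two read paths described above can be thought of as an ordered list of attempts. The following sketch shows that pattern in isolation. `loadWithFallback` is a hypothetical helper, the loaders are placeholders supplied by the caller, and nothing here is taken from the node's actual source.

```typescript
// Sketch only: tries each loader in order and returns the first successful result.
// If every loader fails, the collected messages are combined into one error.
async function loadWithFallback<T>(loaders: Array<() => Promise<T>>): Promise<T> {
  const failures: string[] = [];
  for (const loader of loaders) {
    try {
      return await loader();
    } catch (err) {
      failures.push(err instanceof Error ? err.message : String(err));
    }
  }
  throw new Error(`Failed to load PDF: ${failures.join("; ")}`);
}
```

In a real node, the first loader might read bytes from the filesystem path and the second might decode the Base64 buffer, with each passing its bytes to pdf-lib's `PDFDocument.load`. That wiring is an assumption and is left out to keep the sketch self-contained.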

Troubleshooting

- No binary data property found: If the specified binary property does not exist on the input item, the node will throw an error. Ensure the binary property name exactly matches the one containing the PDF.
- Failed to load PDF: The node attempts to load the PDF first from the filesystem path and then from the binary buffer. Errors here may indicate corrupted PDF data or incorrect binary property configuration.
- Chunk size issues: Setting a chunk size less than 1 or larger than the total page count may cause unexpected results. Choose a whole number between 1 and the PDF's page count.
- Continue on Fail: If enabled, errors for individual items will be returned in the output JSON under an error field instead of stopping execution.
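
The chunk size and Continue on Fail notes above can be sketched together. `assertChunkSize` and `processItems` are hypothetical helpers written for this illustration, and the error wording is invented. The node's own validation and messages may differ.

```typescript
interface ItemResult {
  json: Record<string, unknown>;
}

// Hypothetical guard: rejects chunk sizes that are not whole numbers of at least 1.
function assertChunkSize(chunkSize: number): void {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new Error(`Chunk size must be a whole number of at least 1, got ${chunkSize}`);
  }
}

// Sketch only: runs `work` per item. With continueOnFail enabled, a failure becomes
// an { error } entry in the output; otherwise the first failure stops execution.
function processItems<T>(
  items: T[],
  work: (item: T) => Record<string, unknown>,
  continueOnFail: boolean,
): ItemResult[] {
  const results: ItemResult[] = [];
  for (const item of items) {
    try {
      results.push({ json: work(item) });
    } catch (err) {
      if (!continueOnFail) throw err;
      const message = err instanceof Error ? err.message : String(err);
      results.push({ json: { error: message } });
    }
  }
  return results;
}
```

With `continueOnFail` set to `true`, an item that fails validation yields a result whose JSON contains only the `error` field, and later items are still processed.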

Links and References

- pdf-lib GitHub repository: The underlying library used for PDF manipulation.
- n8n Documentation: For general guidance on working with binary data and custom nodes.