PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Actions: 80

Overview

The "Extract Pages From PDF" operation extracts specific pages from a PDF document and creates a new PDF containing only those pages. Use it to isolate parts of a large PDF, such as chapters, sections, or individual pages, for sharing, archiving, or further processing.

Common scenarios include:

  • Extracting invoice pages from a multi-page PDF report.
  • Creating a summary document by selecting key pages.
  • Splitting a large PDF into smaller documents based on page ranges.

Users can provide the source PDF in three ways: as binary data from a previous node, as a Base64-encoded string, or via a URL pointing to the PDF file.
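
For the Base64 option, the PDF bytes have to be converted to text before they reach the node. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it assumes the node accepts plain Base64 text without a data-URI prefix, which this page does not confirm.

```python
import base64


def pdf_bytes_to_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as plain Base64 text (no data-URI prefix)."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def base64_to_pdf_bytes(encoded: str) -> bytes:
    """Decode Base64 text back to bytes, rejecting non-Base64 characters."""
    return base64.b64decode(encoded, validate=True)


# Placeholder bytes for illustration only; this is not a complete PDF.
sample = b"%PDF-1.4\n% placeholder content\n"
encoded = pdf_bytes_to_base64(sample)
```

A quick sanity check on real files: every PDF begins with the bytes `%PDF-`, so its Base64 form begins with `JVBERi0`.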

Properties

Name: Meaning

Input Data Type: Choose how to provide the PDF file to extract pages from. Options:
  • Binary Data (from previous node)
  • Base64 String (Base64-encoded PDF content)
  • URL (link to PDF file)

Input Binary Field: Name of the binary property that contains the PDF file (usually "data" for file uploads). Required if Input Data Type is Binary Data.

Base64 PDF Content: Base64-encoded PDF document content. Required if Input Data Type is Base64 String.

PDF URL: URL to the PDF file to extract pages from. Required if Input Data Type is URL.

Document Name: Name of the output PDF document after extraction. Defaults to "output.pdf".

Page Numbers: Page numbers to extract from the PDF. Supports single pages (e.g., "1"), multiple pages separated by commas (e.g., "1,3,5"), or ranges (e.g., "2-4").

Output Binary Field Name: Name of the binary property where the output PDF file will be stored. Defaults to "data".

Advanced Options: Optional JSON string to specify custom profiles or additional API options for the extraction process. Useful for advanced users who want to customize the behavior according to external API documentation.
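
The Page Numbers format can be checked before a workflow runs. The sketch below expands a page string into a list of page numbers. `parse_page_numbers` is a hypothetical helper, not part of the node. It assumes pages are 1-based and that commas and ranges can be mixed (e.g., "1,3-5"); this page shows the two forms only separately, so treat mixing as an assumption.

```python
def parse_page_numbers(spec: str) -> list[int]:
    """Expand a string such as "1,3,5" or "2-4" into a list of page numbers."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"Empty entry in page numbers: {spec!r}")
        if "-" in part:
            # A range such as "2-4" covers both endpoints.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError(f"Invalid range: {part!r}")
            pages.extend(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError(f"Invalid page number: {part!r}")
            pages.append(page)
    return pages
```

Malformed input such as "1,,3" or "4-2" raises `ValueError`, which makes formatting mistakes visible before any request is sent.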

Output

The node outputs the extracted pages as a new PDF file in binary format. The output is stored in the specified binary property (default "data") with the filename set to the provided Document Name (default "output.pdf").

The json output field typically contains metadata about the operation or the file, but the main content is the binary PDF data representing the extracted pages.

Dependencies

  • Requires access to the PDF4me API, an external PDF processing service.
  • Users must configure appropriate API credentials or authentication tokens within n8n to enable communication with the PDF processing service.
  • Internet access may be required if providing the PDF via URL or if the API is cloud-based.

Troubleshooting

  • Invalid Page Numbers: If the page numbers string is malformed or references pages outside the PDF's range, the operation may fail. Ensure page numbers are correctly formatted and valid.
  • Missing Input Data: Providing incorrect input data type or missing the corresponding input field (binary property, base64 content, or URL) will cause errors. Verify that the input matches the selected Input Data Type.
  • API Authentication Errors: Failure to authenticate with the external PDF service will result in errors. Check API keys or tokens and ensure they are correctly configured in n8n.
  • Network Issues: When using URLs or cloud APIs, network connectivity problems can cause failures. Confirm internet access and URL validity.
  • Output Binary Field Conflicts: If the output binary field name conflicts with existing fields, it might overwrite data unintentionally. Use unique names if necessary.
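
The first two checks above can be run ahead of time. This is a rough sketch under stated assumptions: `find_problems` is a hypothetical helper, the dictionary keys are the display names from the Properties section (the node's internal parameter names are not documented here), and the caller is assumed to already know the PDF's page count.

```python
# Maps each Input Data Type to the field it requires (display names, assumed).
REQUIRED_FIELD = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 PDF Content",
    "URL": "PDF URL",
}


def find_problems(input_data_type: str, params: dict,
                  pages: list[int], page_count: int) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems: list[str] = []
    required = REQUIRED_FIELD.get(input_data_type)
    if required is None:
        problems.append(f"Unknown Input Data Type: {input_data_type!r}")
    elif not params.get(required):
        problems.append(
            f"{required} is required when Input Data Type is {input_data_type}"
        )
    # Pages are assumed to be 1-based, matching the examples on this page.
    out_of_range = [p for p in pages if p < 1 or p > page_count]
    if out_of_range:
        problems.append(f"Pages outside 1-{page_count}: {out_of_range}")
    return problems
```

For example, `find_problems("Base64 String", {}, [7], 5)` reports both the missing Base64 content and the out-of-range page 7.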
