PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Overview

The "Split PDF By Text" operation splits a PDF document into multiple smaller PDFs at each occurrence of a specified text string. It is useful for dividing a large PDF into sections or chapters that begin with a known keyword or phrase.

Common scenarios include:

  • Splitting a contract or report into separate files by section headers.
  • Dividing a multi-page invoice batch where each invoice starts with a unique identifier text.
  • Extracting individual pages or page ranges from a PDF based on textual markers.

For example, if you have a PDF containing multiple invoices concatenated together, and each invoice starts with the text "Invoice Number:", you can use this operation to split the PDF at every occurrence of that phrase, creating separate PDF files for each invoice.
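The splitting behavior in this example can be pictured with a small local sketch. It does not call PDF4me and is not the service's implementation; it assumes the text of each page has already been extracted into a list of strings. The function name `split_ranges` and the `mode` values are hypothetical stand-ins for the node's Before/After option.

```python
from typing import List, Tuple


def split_ranges(page_texts: List[str], marker: str, mode: str = "before") -> List[Tuple[int, int]]:
    """Return 1-based inclusive (first_page, last_page) ranges, one per segment.

    mode="before": a new segment starts on every page containing the marker.
    mode="after":  a segment ends on every page containing the marker.
    """
    if mode not in ("before", "after"):
        raise ValueError("mode must be 'before' or 'after'")

    total = len(page_texts)
    if total == 0:
        return []

    # Pages (1-based) whose text contains the marker.
    hits = [i + 1 for i, text in enumerate(page_texts) if marker in text]

    # Collect the first page of every segment; page 1 always starts one.
    starts = {1}
    for page in hits:
        start = page if mode == "before" else page + 1
        if 1 <= start <= total:
            starts.add(start)

    ordered = sorted(starts)
    # Each segment runs up to the page before the next segment starts.
    ends = [s - 1 for s in ordered[1:]] + [total]
    return list(zip(ordered, ends))


# Hypothetical page texts for a batch of three invoices.
pages = [
    "Invoice Number: 1001 ...",
    "continued line items",
    "Invoice Number: 1002 ...",
    "Invoice Number: 1003 ...",
    "terms and conditions",
]
ranges = split_ranges(pages, "Invoice Number:", mode="before")
```

With these sample pages, splitting before each marker page groups pages 1 to 2, page 3, and pages 4 to 5 into separate segments, one per invoice.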

Properties

  • Input Data Type: Choose how to provide the PDF data:
      - Base64 String: Provide PDF content as a base64 encoded string.
      - Binary Data: Use PDF file from previous nodes.
      - URL: Provide URL to PDF file.
  • Binary Property Name: The name of the binary property containing the PDF file (used only if Input Data Type is "Binary Data").
  • Base64 Content: Base64 encoded PDF content (used only if Input Data Type is "Base64 String").
  • PDF URL: URL to the PDF file (used only if Input Data Type is "URL").
  • Text to Search: The text string to search for within the PDF to determine where to split the document.
  • Split Text Page: Defines where to split relative to the page containing the searched text:
      - After: Split after the page containing the text.
      - Before: Split before the page containing the text.
  • File Naming: File naming convention for the resulting split files:
      - Name As Per Order: Files are named according to their order in the split sequence.
      - Name As Per Page: Files are named according to the page number.
  • Output Binary Field Name: The name of the binary property where the output PDF files will be stored.
  • Advanced Options: Custom JSON profiles that pass extra options to the API call. Additional parameters can be specified as described in the external API documentation to customize the splitting behavior further.
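When Input Data Type is set to "Base64 String", the PDF bytes have to be encoded before they reach the Base64 Content field. A minimal sketch using only the Python standard library is shown here; the helper name `pdf_to_base64` and the `%PDF-` signature check are illustrative assumptions, not part of the node.

```python
import base64


def pdf_to_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as a base64 string."""
    # PDF files normally begin with the "%PDF-" signature; reject obviously wrong input early.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not look like a PDF (missing %PDF- header)")
    return base64.b64encode(pdf_bytes).decode("ascii")


# Placeholder bytes used only to demonstrate the round trip (not a complete PDF).
sample = b"%PDF-1.7\n% placeholder content\n"
encoded = pdf_to_base64(sample)
decoded = base64.b64decode(encoded)
```

The resulting string can be pasted or mapped into Base64 Content; decoding it again should return the original bytes unchanged.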

Output

The node outputs an array of items, each representing one of the split PDF files. Each item contains:

  • A json object with metadata about the split file (such as page range or order).
  • A binary property (name configurable via "Output Binary Field Name") containing the PDF file data of the split segment.

The output therefore consists of multiple PDFs, one for each segment created by splitting at the occurrences of the search text.
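A downstream step might gather the split files roughly as follows. This is a sketch under assumptions: the item layout (a `json` object plus a `binary` object keyed by the output field name, holding base64 `data` and an optional `fileName`) mirrors the description above, but the exact metadata keys returned by the node are not documented here, so `order`, `collect_pdfs`, and the fallback file name are hypothetical.

```python
import base64
from typing import Any, Dict, List, Tuple


def collect_pdfs(items: List[Dict[str, Any]], binary_field: str = "data") -> List[Tuple[str, bytes]]:
    """Return (file name, PDF bytes) for every item that carries the binary field."""
    results = []
    for index, item in enumerate(items, start=1):
        binary = item.get("binary", {}).get(binary_field)
        if binary is None:
            continue  # Item has no PDF under the expected field name.
        # Fall back to a generated name when the item does not provide one.
        name = binary.get("fileName", f"split_{index}.pdf")
        results.append((name, base64.b64decode(binary["data"])))
    return results


# Hypothetical items shaped like the output described above.
example_items = [
    {
        "json": {"order": 1},
        "binary": {"data": {"fileName": "part_1.pdf",
                            "data": base64.b64encode(b"%PDF-1.7 A").decode("ascii")}},
    },
    {
        "json": {"order": 2},
        "binary": {"data": {"data": base64.b64encode(b"%PDF-1.7 B").decode("ascii")}},
    },
]
pdfs = collect_pdfs(example_items)
```

If "Output Binary Field Name" is changed in the node, the same value would need to be passed as `binary_field`.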

Dependencies

  • Requires access to the PDF4me API, the external PDF processing service that performs the split.
  • An API key or authentication token must be configured in n8n to authorize requests to this service.
  • Network access to fetch PDFs if using URL input type.

Troubleshooting

  • Issue: No splits occur even though the text exists in the PDF.
    Cause: The text to search might not exactly match the text in the PDF (case sensitivity, whitespace, or formatting differences).
    Solution: Verify the exact text string and try adjusting it or testing with simpler text.

  • Issue: Node fails with authentication errors.
    Cause: Missing or invalid API credentials.
    Solution: Check that the API key or authentication token is correctly set up in n8n credentials.

  • Issue: PDF URL input fails to download the file.
    Cause: URL is inaccessible, requires authentication, or is incorrect.
    Solution: Confirm the URL is publicly accessible or provide the PDF via binary or base64 input instead.

  • Error messages typically relate to invalid input data, missing required properties, or API errors returned from the external service. Review error details and ensure all required fields are correctly provided.
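For the first issue above, it can help to check locally how the search text differs from the text in the PDF. The sketch below assumes the page text has already been extracted with some other tool; `diagnose_match` is a hypothetical helper, and it says nothing about how PDF4me itself matches text.

```python
import re


def diagnose_match(page_text: str, search_text: str) -> str:
    """Classify how search_text relates to page_text."""
    if search_text in page_text:
        return "exact"
    if search_text.lower() in page_text.lower():
        return "case differs"

    def squash(value: str) -> str:
        # Collapse every run of whitespace (spaces, tabs, newlines) to one space
        # and lowercase, so only spacing and case differences are ignored.
        return re.sub(r"\s+", " ", value).strip().lower()

    if squash(search_text) in squash(page_text):
        return "whitespace differs"
    return "not found"


result = diagnose_match("Invoice  Number: 1001", "Invoice Number:")
```

A result of "case differs" or "whitespace differs" suggests adjusting Text to Search to match the PDF exactly; "not found" suggests trying a shorter or simpler phrase.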

Links and References

  • PDF4me API Documentation: for advanced profile options and detailed API capabilities related to PDF splitting and other operations.
