PDF4me

Comprehensive PDF and document processing: generate barcodes, convert files, extract data, manipulate images, and automate workflows with the PDF4me API.

Overview

The Extract Attachment From PDF operation extracts embedded attachments from a PDF document. The PDF file can be provided in one of three ways: as binary data from a previous node, as a base64-encoded string, or via a URL pointing to the PDF file.

Common scenarios where this node is beneficial include:

  • Automatically retrieving and processing embedded files within PDFs, such as images, documents, or other resources attached inside a PDF portfolio.
  • Extracting attachments for further automation workflows like saving them to cloud storage, analyzing their content, or forwarding them via email.
  • Handling invoices, contracts, or reports that contain supplementary files embedded as attachments.

Practical example:

  • A user receives a batch of PDFs containing embedded spreadsheets as attachments. This node can extract those spreadsheets automatically so they can be processed or imported into other systems.

Properties

| Name | Meaning |
| --- | --- |
| Input Data Type | Choose how to provide the PDF file to extract attachments from. Options: Binary Data (from previous node); Base64 String (provide PDF content as a base64-encoded string); URL (provide URL to PDF file). |
| Input Binary Field | Name of the binary property that contains the PDF file when using the Binary Data input type. Usually "data" for file uploads. |
| Base64 PDF Content | Base64-encoded PDF document content. Required if Input Data Type is set to Base64 String. |
| PDF URL | URL to the PDF file to extract attachments from. Required if Input Data Type is set to URL. |
| Document Name | Name of the document used for processing. Defaults to "document.pdf". |
| Advanced Options | Collection of advanced options. Currently supports Custom Profiles: a JSON string to adjust custom properties for API calls, allowing extra options specific to certain APIs. See https://dev.pdf4me.com/apiv2/documentation/ |
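
The dependency between Input Data Type and the other properties can be checked before a workflow runs. The following is a minimal sketch, not part of the node itself: it assumes the parameters are held in a plain dictionary keyed by the display names listed above (the node's internal parameter names may differ), and the helper names `missing_required` and `with_defaults` are hypothetical.

```python
# Property each Input Data Type relies on, per the properties listed above.
# Keys and values are the display names, used here only for illustration.
REQUIRED_BY_INPUT_TYPE = {
    "Binary Data": "Input Binary Field",
    "Base64 String": "Base64 PDF Content",
    "URL": "PDF URL",
}


def missing_required(params):
    """Return the names of properties that are missing or empty."""
    input_type = params.get("Input Data Type")
    if input_type not in REQUIRED_BY_INPUT_TYPE:
        # Unknown or absent input type: nothing else can be checked.
        return ["Input Data Type"]
    required = REQUIRED_BY_INPUT_TYPE[input_type]
    value = params.get(required)
    if isinstance(value, str) and value.strip():
        return []
    return [required]


def with_defaults(params):
    """Return a copy of params with the documented Document Name default."""
    merged = dict(params)
    merged.setdefault("Document Name", "document.pdf")
    return merged
```

For example, `missing_required({"Input Data Type": "URL"})` reports that PDF URL still needs a value.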

Output

The output is an array of items where each item contains a json field with the extracted attachment data from the PDF. The structure typically includes metadata about each attachment and the attachment content itself, which may be provided as binary data.

If the node outputs binary data, it represents the actual extracted attachment files from the PDF, ready for further processing or saving.
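
As an illustration of downstream processing, the sketch below writes extracted attachments to disk. The exact output schema is not documented here, so the field names are assumptions: it supposes each item's json field carries a file name under `fileName` and base64-encoded content under `content`. Both keys are hypothetical and configurable; inspect the node's actual output and adjust them before relying on this.

```python
import base64
from pathlib import Path


def save_attachments(items, out_dir, name_key="fileName", content_key="content"):
    """Write each item's attachment to out_dir and return the written paths.

    name_key and content_key are hypothetical field names, not a documented
    schema; pass the keys that the node's output actually uses.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for index, item in enumerate(items):
        data = item.get("json", {})
        if content_key not in data:
            continue  # metadata-only item, nothing to write
        # Keep only the final path component so a name cannot escape out_dir.
        name = Path(str(data.get(name_key) or "")).name
        if name in ("", ".."):
            name = f"attachment_{index}"
        target = out / name
        target.write_bytes(base64.b64decode(data[content_key]))
        written.append(target)
    return written
```

Items without the content key are skipped rather than treated as errors, since an item may carry metadata only.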

Dependencies

  • Requires access to the PDF4me API, which performs the extraction.
  • Needs proper API authentication configured in n8n credentials (an API key or token).
  • Internet access if using the URL input method to fetch the PDF file.

Troubleshooting

  • Common issues:
    • Providing incorrect or inaccessible URLs will cause failures in fetching the PDF.
    • Incorrect base64 strings or corrupted binary data will result in errors during extraction.
    • Missing or invalid API credentials will prevent successful API calls.
  • Error messages:
    • Errors related to file not found or inaccessible URL: Verify the URL is correct and publicly accessible.
    • Invalid PDF format or corrupted file errors: Ensure the input PDF is valid and not damaged.
    • Authentication errors: Check API key/token validity and permissions.
  • To resolve, verify inputs, ensure credentials are correctly set up, and confirm network connectivity.
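
Some of the input problems above can be caught before the node is called. The following is a rough pre-flight sketch with hypothetical helper names; it does not contact the PDF4me API. It relies on the fact that PDF files begin with the `%PDF-` header, and it assumes that only http and https URLs are acceptable. The header test is a heuristic: a file can pass it and still be damaged further in.

```python
import base64
import binascii
from urllib.parse import urlparse


def check_pdf_base64(content):
    """Return None if content decodes to data starting with %PDF-, else an error string."""
    try:
        raw = base64.b64decode(content, validate=True)
    except (binascii.Error, ValueError) as exc:
        return f"not valid base64: {exc}"
    if not raw.startswith(b"%PDF-"):
        return "decoded data does not start with the %PDF- header"
    return None


def check_pdf_url(url):
    """Return None if url looks like a fetchable http(s) URL, else an error string."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        return "URL scheme must be http or https"
    if not parts.netloc:
        return "URL has no host"
    return None
```

These checks only confirm that the input is well formed. Whether a URL is publicly reachable, or whether the API key is valid, can only be confirmed by the request itself.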
