Overview
This node extracts tables from PDF documents. It is useful when you need to programmatically retrieve tabular data embedded in PDFs for further processing, such as data analysis, reporting, or integration with other systems.
Common scenarios include:
- Extracting invoice line items from PDF invoices.
- Retrieving structured data from reports or forms saved as PDFs.
- Automating data entry by converting PDF tables into JSON or spreadsheet formats.
For example, you can provide a PDF file containing financial statements and extract all tables within it to convert them into structured JSON data for automated accounting workflows.
Properties
| Name | Meaning |
|---|---|
| Input Data Type | How the PDF file is provided: Binary Data (from a previous node), Base64 String (base64-encoded PDF content entered directly), or URL (link to the PDF file). |
| Input Binary Field | Name of the binary property containing the PDF file (default "data"). Used only if Input Data Type is Binary Data. |
| Base64 PDF Content | Base64 encoded string of the PDF document content. Used only if Input Data Type is Base64 String. |
| PDF URL | URL pointing to the PDF file to extract tables from. Used only if Input Data Type is URL. |
| Document Name | Name assigned to the document during processing (default "document.pdf"). Useful for identification or logging purposes. |
| Advanced Options | Optional JSON string specifying custom profiles or additional API options, for example the output data format or other extraction parameters described in the external API documentation. |
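
As a rough illustration of the Base64 String input, the sketch below encodes raw PDF bytes and serializes the Advanced Options as a JSON string. The parameter keys (`inputDataType`, `base64Content`, `docName`, `advancedOptions`) are hypothetical names chosen for this example, not the node's confirmed internal identifiers; only the defaults and the base64/JSON-string requirements come from the table above.

```python
import base64
import json


def pdf_to_base64(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as a base64 ASCII string."""
    return base64.b64encode(pdf_bytes).decode("ascii")


def build_parameters(pdf_bytes: bytes, doc_name: str = "document.pdf", advanced=None) -> dict:
    """Assemble illustrative node parameters for Base64 String input.

    All keys below are hypothetical; check the node UI for the real field names.
    """
    return {
        "inputDataType": "base64",                     # hypothetical key and value
        "base64Content": pdf_to_base64(pdf_bytes),     # hypothetical key
        "docName": doc_name,                           # hypothetical key
        "advancedOptions": json.dumps(advanced or {}), # Advanced Options is a JSON string
    }
```

Passing the Advanced Options through `json.dumps` keeps the value a valid JSON string even when no extra options are set.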
Output
The node outputs an array of JSON objects representing the extracted tables from the PDF. Each object typically contains structured data corresponding to one table found in the document.
If the PDF contains multiple tables, each will be represented separately in the output array.
The output json field includes the parsed table data suitable for downstream automation, such as conversion to spreadsheets or database insertion.
No binary data output is produced by this operation.
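
For downstream conversion to spreadsheets, a minimal sketch of turning the extracted tables into CSV text might look like the following. It assumes each table object carries a `rows` key holding a list of rows, each a list of cell strings. That shape is an assumption made for illustration; the actual structure returned by the service may differ and should be inspected first.

```python
import csv
import io


def tables_to_csv(tables):
    """Return one CSV string per table.

    Assumes (hypothetically) that each table is a dict with a "rows" key
    containing a list of rows, each row a list of cell values.
    """
    results = []
    for table in tables:
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerows(table.get("rows", []))
        results.append(buffer.getvalue())
    return results
```

Each table stays separate in the result, mirroring the one-object-per-table layout of the node output.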
Dependencies
- Requires access to an external PDF processing API service capable of extracting tables from PDFs.
- The node expects proper authentication credentials (e.g., an API key) configured in n8n to communicate with the external service.
- Network access is needed if providing PDF via URL or when the node fetches the PDF content from external sources.
Troubleshooting
Common issues:
- An invalid or inaccessible PDF URL may cause the request to fail.
- Incorrectly base64-encoded PDF content results in errors.
- A missing or incorrect binary property name causes failures when using binary data input.
- Missing or invalid credentials cause API authentication failures.
Error messages:
- Errors related to "file not found" or "unable to download PDF" indicate problems with the URL or network.
- "Invalid PDF format" suggests corrupted or unsupported PDF files.
- Authentication errors require checking API key configuration.
Resolutions:
- Verify the PDF source and ensure accessibility.
- Confirm base64 strings are correctly encoded.
- Double-check the binary property name matches the actual input data.
- Ensure API credentials are set up correctly in n8n.
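
Some of these checks can be run before the node is called. The sketch below is one possible pre-flight validation, not part of the node itself. It confirms that a string decodes as strict base64, that the decoded bytes begin with the `%PDF-` header that well-formed PDF files normally start with, and that a URL is an absolute http(s) address. The function names are made up for this example.

```python
import base64
import binascii
from typing import Optional
from urllib.parse import urlparse


def check_base64_pdf(content: str) -> Optional[str]:
    """Return an error message, or None if content looks like a base64-encoded PDF."""
    try:
        raw = base64.b64decode(content, validate=True)
    except (binascii.Error, ValueError):
        return "content is not valid base64"
    if not raw.startswith(b"%PDF-"):
        return "decoded content does not start with the %PDF- header"
    return None


def check_pdf_url(url: str) -> Optional[str]:
    """Return an error message, or None if url is an absolute http(s) URL."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "URL must be an absolute http(s) address"
    return None
```

These checks only catch malformed input early; they do not confirm that the URL is reachable or that the PDF is otherwise intact.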
Links and References
- PDF4me API Documentation: advanced profile options and API capabilities related to PDF processing and table extraction.