ScrapegraphAI

Turn any webpage into usable data in one shot – ScrapegraphAI explores the website and extracts the content you need.

Overview

The Agentic Scraper Execute operation automates browser interactions on a specified webpage by performing a series of user-defined steps, such as clicking buttons or filling in forms. It can optionally maintain the browser session across multiple requests and use AI-powered extraction to turn the page content into structured data based on user instructions.

The operation suits cases that need both web automation and data extraction: logging into dashboards, navigating multi-step workflows, or extracting information that only appears after interacting with dynamic page elements.

Practical examples:

  • Automating login and navigation through a user dashboard to extract account details.
  • Clicking through a series of filters or menus on an e-commerce site to scrape product data.
  • Extracting structured data like user info, dashboard sections, and remaining credits using AI prompts.

Properties

| Name | Meaning |
| --- | --- |
| URL | The target webpage URL to interact with. |
| Steps | A list of browser interaction steps to perform sequentially. Each step describes an action (e.g., click a button). At least one step is required. |
| Use Session | Whether to maintain the browser session across multiple requests, allowing stateful interactions. |
| Enable AI Extraction | Whether to enable AI-powered data extraction from the page after performing the browser steps. |
| User Prompt | Instructions for what data to extract from the page when AI extraction is enabled. |
| Use Custom Output Schema | Whether to define a custom JSON schema to structure the extracted data output. |
| Output Schema | The JSON schema defining the structure and types of the extracted data when using a custom output schema. |
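
To make the relationship between these properties concrete, here is a minimal Python sketch that assembles them into a single request body. The function name `build_request` and every key name (`url`, `steps`, `use_session`, `ai_extraction`, `user_prompt`, `output_schema`) are assumptions made for illustration; the source does not show the format the node actually sends to the ScrapegraphAI API.

```python
import json
from typing import Any, Dict, List, Optional


def build_request(
    url: str,
    steps: List[str],
    use_session: bool = False,
    ai_extraction: bool = False,
    user_prompt: str = "",
    output_schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble a request body from the node properties.

    All key names are hypothetical and chosen only for this sketch.
    """
    body: Dict[str, Any] = {
        "url": url,
        "steps": steps,
        "use_session": use_session,
        "ai_extraction": ai_extraction,
    }
    # User Prompt and Output Schema only matter when AI extraction is enabled.
    if ai_extraction:
        body["user_prompt"] = user_prompt
        if output_schema is not None:
            body["output_schema"] = output_schema
    return body


request = build_request(
    url="https://example.com/login",
    steps=[
        "Type the username into the username field",
        "Click the Login button",
    ],
    use_session=True,
    ai_extraction=True,
    user_prompt="Extract the user name and the remaining credits",
    output_schema={
        "type": "object",
        "properties": {
            "username": {"type": "string"},
            "remaining_credits": {"type": "number"},
        },
    },
)
print(json.dumps(request, indent=2))
```

The sketch omits the prompt and schema when AI extraction is off, mirroring the table: those two properties apply only "when AI extraction is enabled".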

Output

The node outputs a JSON object containing the results of the browser interaction and optional AI extraction. The exact structure depends on the response from the external scraping API but generally includes:

  • The raw or processed data extracted from the webpage after performing the specified steps.
  • If AI extraction is enabled, the output may include structured data parsed according to the user prompt and optionally validated against the provided JSON schema.

This operation focuses on JSON data extraction. No binary output (such as files or media scraped from the page) is indicated.
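
Because the exact output structure depends on the API response, it can help to check the extracted data before a downstream step relies on it. The following Python sketch is one hedged way to do that: it only compares the data against the schema's `required` list and is not a full JSON Schema validator. The helper name `missing_required_fields` and the sample field names are hypothetical.

```python
from typing import Any, Dict, List


def missing_required_fields(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return the names under the schema's "required" key that are absent from data."""
    required = schema.get("required", [])
    return [name for name in required if name not in data]


# Hypothetical output schema, similar to what could be entered in Output Schema.
schema = {
    "type": "object",
    "properties": {
        "username": {"type": "string"},
        "remaining_credits": {"type": "number"},
    },
    "required": ["username", "remaining_credits"],
}

# Hypothetical extraction result that lacks one required field.
extracted = {"username": "jdoe"}

missing = missing_required_fields(extracted, schema)
print(missing)  # ['remaining_credits']
```

A dedicated JSON Schema library would also check types and nested objects; this sketch only flags absent top-level fields.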

Dependencies

  • An active connection to the ScrapegraphAI API service.
  • An API key credential configured in n8n for authentication with the ScrapegraphAI API.
  • Internet access to reach the target URLs and the ScrapegraphAI endpoints.

Troubleshooting

  • Error: "At least one browser interaction step is required"
    Occurs if no valid steps are provided. Ensure you add at least one non-empty action step describing the browser interaction.

  • Error: "Invalid JSON in Output Schema"
    Happens if the custom output schema JSON is malformed. Validate your JSON syntax before inputting it.

  • Network or Authentication Errors
    Verify that the API key credential is correctly set up and has proper permissions. Check network connectivity to the ScrapegraphAI API endpoint.

  • Unexpected or empty output
    Confirm that the URL is correct and accessible, and that the steps accurately reflect the necessary browser actions to reach the desired data.
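
The first two errors above can be caught before a workflow runs. The Python sketch below reproduces those checks under stated assumptions: the function name `validate_inputs` and its return shape are hypothetical, and the node's own validation logic is not shown in the source. Only the two error messages are taken from the list above.

```python
import json
from typing import Any, Dict, List, Optional


def validate_inputs(
    steps: List[str], output_schema_text: Optional[str] = None
) -> Dict[str, Any]:
    """Pre-check the Steps and Output Schema values (hypothetical helper)."""
    # Drop empty or whitespace-only steps, since only non-empty steps count.
    cleaned = [s.strip() for s in steps if s and s.strip()]
    if not cleaned:
        raise ValueError("At least one browser interaction step is required")

    schema = None
    if output_schema_text is not None:
        try:
            schema = json.loads(output_schema_text)
        except json.JSONDecodeError as exc:
            # Append the parser's message so the malformed spot is easier to find.
            raise ValueError(f"Invalid JSON in Output Schema: {exc.msg}") from exc

    return {"steps": cleaned, "output_schema": schema}


# Valid input: one real step, one blank step, and well-formed schema JSON.
result = validate_inputs([" Click the Login button ", ""], '{"type": "object"}')
print(result["steps"])

# Malformed schema JSON (missing closing brace) triggers the second error.
try:
    validate_inputs(["Click the Login button"], '{"type": "object"')
except ValueError as err:
    print(err)
```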
