Overview
The node "ScrapegraphAI" provides a powerful interface to extract usable data from webpages by leveraging the ScrapegraphAI API. It supports multiple scraping-related resources and operations, including a general "Scrape" operation that fetches content from a specified website URL.
For the Scrape resource with the Scrape operation, the node sends a request to the ScrapegraphAI service to retrieve structured data from any given webpage. This is useful when you want to quickly extract information from websites without manually parsing HTML or dealing with complex JavaScript rendering.
Common scenarios:
- Extracting product details from e-commerce pages.
- Gathering article content or metadata from news sites.
- Collecting contact information or listings from business directories.
- Automating data collection for research or monitoring purposes.
Practical example:
You provide the URL of a product page on an online store and enable the option to render heavy JavaScript if the page relies on client-side rendering. The node then returns the extracted product information in JSON format, ready for further processing or storage.
Properties
| Name | Meaning |
|---|---|
| Website URL | The URL of the website page you want to scrape data from. |
| Render Heavy JS | Boolean flag indicating whether to render heavy JavaScript content on the page before scraping (useful for dynamic sites). |
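The two properties above map naturally onto a small parameter object. The following is a minimal sketch of validating them before a request is sent. The helper `build_scrape_params` and the keys `website_url` and `render_heavy_js` are names assumed for illustration, not confirmed field names of the node or the API.

```python
from urllib.parse import urlparse


def build_scrape_params(website_url: str, render_heavy_js: bool = False) -> dict:
    """Validate the two node properties and return them as a plain dict.

    The dict keys are hypothetical names chosen for this sketch; check the
    ScrapegraphAI API documentation for the real parameter names.
    """
    parsed = urlparse(website_url.strip())
    # Require an absolute http(s) URL with a host, e.g. https://example.com/page
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Website URL must be an absolute http(s) URL: {website_url!r}")
    if not isinstance(render_heavy_js, bool):
        raise TypeError("Render Heavy JS must be a boolean")
    return {"website_url": parsed.geturl(), "render_heavy_js": render_heavy_js}


# Example: a product page that relies on client-side rendering.
params = build_scrape_params("https://example.com/products/42", render_heavy_js=True)
print(params)
```

Rejecting relative or scheme-less URLs up front avoids spending an API request on input that cannot be fetched.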
Output
The output is a JSON object containing the scraped data that the ScrapegraphAI API returns for the requested webpage. Its structure depends on the content of the target page and on how the API interprets it, but it generally includes the main extracted information in structured form.
If the page relies on heavy JavaScript to load its content, enable the Render Heavy JS option so that the page is rendered before extraction.
No binary data output is produced by this operation.
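Because the output structure varies and may come back empty for JavaScript-heavy pages, one possible pattern is to scrape without rendering first and retry with rendering only when nothing useful is returned. The sketch below assumes a caller-supplied `scrape` function. `scrape_with_js_fallback`, `is_empty`, and the stand-in `fake_scrape` are hypothetical names, and the notion of "empty" used here (every top-level value is falsy) is an assumption rather than documented API behavior.

```python
from typing import Any, Callable, Dict

# A scrape function takes (url, render_heavy_js) and returns the JSON output.
ScrapeFn = Callable[[str, bool], Dict[str, Any]]


def is_empty(result: Dict[str, Any]) -> bool:
    """Treat a result as empty when it has no keys or every top-level value is falsy."""
    return not any(result.values())


def scrape_with_js_fallback(scrape: ScrapeFn, url: str) -> Dict[str, Any]:
    """Try a plain scrape first; retry with heavy JS rendering if nothing came back."""
    result = scrape(url, False)
    if is_empty(result):
        result = scrape(url, True)
    return result


# Stand-in for the real call so the sketch runs offline (hypothetical data).
def fake_scrape(url: str, render_heavy_js: bool) -> Dict[str, Any]:
    return {"title": "Demo product"} if render_heavy_js else {"title": ""}


print(scrape_with_js_fallback(fake_scrape, "https://example.com/products/42"))
```

In an n8n workflow the same idea can be expressed with an IF node that checks the first result and routes to a second ScrapegraphAI node with Render Heavy JS enabled.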
Dependencies
- Requires an active connection to the ScrapegraphAI API.
- Requires an API key credential configured in n8n for authenticating requests to ScrapegraphAI.
- Internet access from the n8n instance to reach the external ScrapegraphAI service endpoint.
Troubleshooting
- Invalid JSON in Output Schema error: If you use an output schema parameter (not applicable here but present in other operations), ensure the JSON is valid. Invalid JSON will cause the node to throw an error.
- Network or authentication errors: Ensure your API key credential is correctly set up and has proper permissions. Also verify network connectivity to the ScrapegraphAI API endpoint.
- Empty or incomplete data: If the target website uses heavy JavaScript rendering and you do not enable the Render Heavy JS option, the scraped data might be incomplete. Enable this option for such sites.
- Rate limiting or quota exceeded: The external API may limit the number of requests. Check your ScrapegraphAI account limits if you encounter errors related to rate limits.
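For transient rate-limit errors, a common mitigation is to retry with exponential backoff. The following is a minimal sketch of that technique, not the node's own behavior. `RateLimitError` is a hypothetical exception standing in for whatever error the API or node actually raises, and `call_with_backoff` is a helper name invented for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical exception standing in for a rate-limit response."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, waiting base_delay * 2**attempt seconds after each rate-limit error."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


# Demo: a call that is rate limited twice, then succeeds.
calls = {"n": 0}
delays: List[float] = []


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("quota exceeded")
    return "ok"


# Record the delays instead of actually sleeping so the demo runs instantly.
result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` keeps the helper testable. In n8n itself, the node-level "Retry On Fail" setting offers a simpler, fixed-interval alternative.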
Links and References
- ScrapegraphAI Official Website
- ScrapegraphAI API Documentation (for detailed API usage and parameters)