Overview
The "Web Unlocker" operation of the "Universal Scraping API" resource in this node is designed to help users bypass common web scraping protections and restrictions on target websites. It enables fetching web pages that might otherwise block automated requests, such as those protected by anti-bot measures or requiring JavaScript rendering.
Typical use cases include:
- Extracting data from websites that use JavaScript heavily for content loading.
- Accessing geo-restricted content by specifying a country location.
- Blocking certain resource types (images, fonts, scripts) to speed up scraping or reduce bandwidth.
- Running custom JavaScript instructions on the page before extraction to manipulate or wait for dynamic content.
For example, a user can enter the URL of a product page that relies on client-side rendering, enable JavaScript rendering with headless browsing, and optionally select a country proxy to simulate access from a specific region.
Properties
| Name | Meaning |
|---|---|
| Target URL | The URL of the webpage to scrape/unlock. This is the main target for the operation. |
| Js Render | Boolean flag to enable or disable JavaScript rendering on the page. When true, the page will be rendered with JavaScript executed, useful for dynamic content. |
| Headless | Boolean flag to run the browser in headless mode (no visible UI). Typically true for automated scraping tasks. |
| Country | Select the country code to route the request through a proxy located in that country. Useful for accessing geo-restricted content. Options include many countries worldwide plus "World Wide" (ANY). |
| Js Instructions | JSON array of instructions for JavaScript execution on the page, e.g., waiting for a certain time or running custom scripts. Default is [{"wait":100}] which waits 100 milliseconds. |
| Block | JSON object specifying resources and URLs to block during page load to optimize performance or avoid unwanted content. For example, it can block images, fonts, scripts, or specific URLs. |
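As a rough illustration of how these properties fit together, the sketch below assembles one set of values in Python and serializes the two JSON-typed fields. The dictionary keys are the display names from the table, not the node's internal parameter names, which this page does not list. The helper `build_web_unlocker_params`, the `"US"` country code, and the `"resources"` key inside Block are all assumptions made for the example; only the `[{"wait": 100}]` default and the `ANY` option come from the table.

```python
import json


def build_web_unlocker_params(url, js_render=True, headless=True, country="ANY",
                              js_instructions=None, block=None):
    """Assemble the six properties; JSON-typed fields are serialized to strings.

    Hypothetical helper for illustration only, not part of the node.
    """
    if js_instructions is None:
        js_instructions = [{"wait": 100}]  # documented default: wait 100 ms
    if block is None:
        block = {}
    # The table describes Js Instructions as a JSON array and Block as a JSON object.
    if not isinstance(js_instructions, list):
        raise TypeError("Js Instructions must be a JSON array")
    if not isinstance(block, dict):
        raise TypeError("Block must be a JSON object")
    return {
        "Target URL": url,
        "Js Render": js_render,
        "Headless": headless,
        "Country": country,
        "Js Instructions": json.dumps(js_instructions),
        "Block": json.dumps(block),
    }


params = build_web_unlocker_params(
    "https://example.com/product/123",
    country="US",                            # assumed country-code format
    block={"resources": ["image", "font"]},  # hypothetical key and values
)
print(json.dumps(params, indent=2))
```

Checking the types before serializing catches the most common mistake, passing an object where an array is expected, before the value reaches the node.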
Output
The node outputs a JSON object containing the scraped/unlocked webpage data. The exact structure depends on the API response but typically includes:
- The full HTML content of the unlocked page.
- Metadata about the request and response.
- Possibly extracted data fields if configured.
No binary output is indicated, so all data is returned as JSON.
Dependencies
- Requires an API key credential for the Scrapeless service to authenticate requests.
- The node relies on the Universal Scraping API endpoint provided by Scrapeless.
- Network access to the target URLs and possibly proxy routing depending on the selected country.
- No additional environment variables are explicitly required beyond the API credential.
Troubleshooting
Common issues:
- Invalid or missing API key credential will cause authentication errors.
- Target URL not reachable or blocked by the API service.
- Incorrect JSON format in the Js Instructions or Block properties may cause parsing errors.
- Selecting a country proxy that is unavailable or restricted could lead to failed requests.
Error messages:
- "Unsupported resource": Occurs if the resource parameter is set incorrectly; ensure it is "universalScrapingApi".
- API errors related to quota limits or invalid parameters will be returned in the error message field.
Resolutions:
- Verify API credentials and permissions.
- Check URL accessibility outside n8n.
- Validate JSON inputs using online validators.
- Try different country options or disable JS rendering if unnecessary.
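The JSON validation step can also be done locally rather than with an online validator. The following is a minimal sketch using Python's standard `json` module; `check_json_field` is a hypothetical helper name, and the expected types (array for Js Instructions, object for Block) are taken from the Properties table.

```python
import json


def check_json_field(name, text, expected_type):
    """Return None if text parses to expected_type, else a readable error string."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno/colno point at the first character the parser could not accept.
        return f"{name}: invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(value, expected_type):
        return f"{name}: expected {expected_type.__name__}, got {type(value).__name__}"
    return None


# Valid array: no error is reported.
print(check_json_field("Js Instructions", '[{"wait": 100}]', list))
# Single quotes are not valid JSON, a common copy-paste mistake.
print(check_json_field("Js Instructions", "[{'wait': 100}]", list))
# Parses fine, but Block should be an object, not an array.
print(check_json_field("Block", "[]", dict))
```

This only confirms that a value is well-formed JSON of the right top-level type; whether the API accepts the individual instructions or block rules still has to be checked against the service itself.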