Overview
The Scrappey node enables advanced web requests with built-in anti-bot protection bypass using the Scrappey API. It supports three main operation modes:
- Request Builder: Create customized HTTP or browser requests with detailed configuration options such as HTTP method, headers, cookies, proxies, and more.
- HTTP Request • Auto-Retry on Protection: Automatically retries HTTP requests when blocked by CAPTCHA, Cloudflare, or similar anti-bot measures, resending the same payload, headers, cookies, and proxy settings.
- Browser Request • Auto-Retry & Anti-Bot: Executes browser-based requests with anti-bot techniques like movement emulation and hCaptcha/Cloudflare bypass, automatically retrying if protection pages are encountered.
This node suits workflows that scrape or interact with websites protected by anti-bot mechanisms, giving reliable data extraction or automation where standard HTTP requests fail.
Practical examples:
- Scraping product details from e-commerce sites that use Cloudflare protection.
- Automating form submissions on websites with hCaptcha challenges.
- Extracting data from APIs or pages that block repeated requests without proper session handling or proxy rotation.
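The auto-retry behavior described above can be pictured with a minimal sketch. It is not the node's actual implementation: the `send` callable, the response shape (`status`, `body`), and the block-detection heuristics are all assumptions made for illustration. Only the idea of resending the same payload, and the 1 to 3 attempt limit, come from this document.

```python
from typing import Any, Callable, Dict

# Hypothetical signals of a protection page; the node's real detection is not documented here.
BLOCK_STATUS_CODES = {403, 429, 503}
BLOCK_MARKERS = ("captcha", "cf-chl", "just a moment")


def looks_blocked(response: Dict[str, Any]) -> bool:
    """Heuristic: treat certain status codes or body markers as an anti-bot block."""
    if response.get("status") in BLOCK_STATUS_CODES:
        return True
    body = str(response.get("body", "")).lower()
    return any(marker in body for marker in BLOCK_MARKERS)


def request_with_retry(
    send: Callable[[Dict[str, Any]], Dict[str, Any]],
    request: Dict[str, Any],
    attempts: int = 3,
) -> Dict[str, Any]:
    """Resend the same request until it is not blocked or attempts run out."""
    attempts = max(1, min(3, attempts))  # the Attempts property is documented as 1 to 3
    response: Dict[str, Any] = {}
    for _ in range(attempts):
        # Copy the request so every attempt carries the same payload, headers, and cookies.
        response = send(dict(request))
        if not looks_blocked(response):
            break
    return response
```

Here `send` stands in for whatever actually performs the call, so the retry logic can be exercised with a stub before wiring it to a real request.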
Properties
| Name | Meaning |
|---|---|
| Scrappey Operations | Operation mode: Request Builder, HTTP Request • Auto-Retry on Protection, or Browser Request • Auto-Retry & Anti-Bot. |
| URL | The target page URL to scrape (required for Request Builder). |
| HTTP Method | HTTP method to use (GET, POST, PUT, DELETE, PATCH, PUBLISH) for Request Builder. |
| Request Type | Type of request in Request Builder: Browser, Request (standard HTTP), or Patched Chrome Browser. |
| Which Proxy To Use | Proxy source: Proxy from credentials, Proxy from HTTP Request Node, or Proxy from Scrappey. |
| Proxy Type | Proxy category when using a Scrappey proxy: Residential proxy, Premium residential proxy, Datacenter proxy, or Mobile proxy. |
| Custom proxy country | Enable to specify a proxy country. |
| Custom Proxy Country | Select the country for the proxy to use (if enabled). |
| Custom proxy | When enabled, uses the proxy defined in credentials for this request (only for Request Builder with Scrappey proxy type "Residential"). |
| Body OR Params? | For methods like POST/PUT/PATCH/DELETE/PUBLISH, choose whether to send parameters in the body or as URL params. |
| Params | Parameters string to send with the request (when using Params option). |
| Body | Raw body content to send with the request (when using Body option). |
| User Session | Identifier for user session to maintain state across requests (optional). |
| Headers Input Method | How to input headers: Using fields below, or Using JSON object. |
| Custom Headers | Key-value pairs for custom headers (when using fields input method). |
| JSON Headers | JSON string representing headers (when using JSON input method). |
| One String Cookie | Use a single string format for cookies instead of key-value pairs. |
| Single String Cookie | Cookie string in name=value;name2=value2 format (used if One String Cookie is true). |
| Custom Cookies | Key-value pairs for cookies (used if One String Cookie is false). |
| Datadome | Enable bypass for Datadome protection (only for Browser request type). |
| Attempts | Number of attempts to make the request if it fails (1 to 3). |
| Antibot | Enable automatic solving of hCaptcha and reCAPTCHA challenges (only for Browser request type). |
| Add Random mouse movement | Simulate human interaction by adding random mouse movements during the browser session (only for Browser request type). |
| Record Video Session | Record a video of the browser session for debugging purposes (only for Browser request type). |
| CSS Selector | CSS selector to target specific elements on the page (only for Browser request type). |
| Href (Optional) | URL to navigate to when the CSS selector is used (only for Browser request type). |
| Intercept XHR/Fetch Request | Intercept and return data from a specific XHR/Fetch request instead of the main page content (only for Browser request type). |
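The two header input methods and the two cookie input styles in the table resolve to the same kind of value. The sketch below shows one way that normalization could look. The function names and the `{"key": ..., "value": ...}` shape of the field entries are hypothetical, chosen only for illustration; the `name=value;name2=value2` cookie format is the one described in the table.

```python
import json
from typing import Dict, List, Optional


def resolve_headers(
    fields: Optional[List[Dict[str, str]]] = None,
    json_headers: Optional[str] = None,
) -> Dict[str, str]:
    """Return headers from a JSON string if given, otherwise from key-value fields."""
    if json_headers:
        parsed = json.loads(json_headers)
        if not isinstance(parsed, dict):
            raise ValueError("JSON Headers must be a JSON object")
        return {str(k): str(v) for k, v in parsed.items()}
    # Skip entries without a key; a missing value becomes an empty string.
    return {f["key"]: f.get("value", "") for f in (fields or []) if f.get("key")}


def resolve_cookies(
    one_string: bool,
    cookie_string: str = "",
    pairs: Optional[List[Dict[str, str]]] = None,
) -> str:
    """Return cookies in name=value;name2=value2 form from either input style."""
    if one_string:
        # Trim whitespace around each segment and drop empty segments.
        parts = [p.strip() for p in cookie_string.split(";") if p.strip()]
        return ";".join(parts)
    return ";".join(
        f'{p["key"]}={p.get("value", "")}' for p in (pairs or []) if p.get("key")
    )
```

For example, `resolve_cookies(False, pairs=[{"key": "a", "value": "1"}])` and `resolve_cookies(True, "a=1")` both yield the same cookie string, which is why either input style can be used interchangeably.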
Output
The node outputs an array of items, each containing a json field with the response data from the Scrappey API. The structure of the JSON depends on the operation and the response from the API but generally includes the scraped content or the result of the HTTP/browser request.
If multiple results are returned, each is emitted as a separate item paired with the index of the input item that produced it.
Binary output is not documented for this node; results are returned as JSON.
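The item pairing described above can be sketched as follows. This is an illustration under assumptions: the helper name is hypothetical, and the `pairedItem` key follows the general n8n item-linking convention rather than anything confirmed for this node.

```python
from typing import Any, Dict, List


def to_output_items(response: Any, input_index: int) -> List[Dict[str, Any]]:
    """Wrap one API response (or a list of results) as output items."""
    results = response if isinstance(response, list) else [response]
    # Each result becomes its own item, linked back to the input that produced it.
    return [{"json": r, "pairedItem": {"item": input_index}} for r in results]
```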
Dependencies
- Requires an active Scrappey API key credential configured in n8n.
- Uses the Scrappey API endpoint at https://api.scrappey.com.
- For proxy usage, requires appropriate proxy credentials or configuration, either via Scrappey or external proxy sources.
- Browser-based operations run on headless browser technology (Puppeteer or similar) managed internally by Scrappey.
Troubleshooting
Common issues:
- Failure due to invalid or missing API key credential.
- Requests blocked by anti-bot protections if not properly configured with retries or browser mode.
- Proxy misconfiguration leading to connection failures.
- Incorrect CSS selectors or URLs causing empty or unexpected responses.
- Exceeding allowed number of attempts or rate limits from Scrappey API.
Error messages:
- Errors during processing an input item will include the error message and the original input data.
- If continueOnFail is disabled, the node will stop execution on the first error with a descriptive message including the item index.
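The two error-handling paths above can be sketched as a simple loop. This is a simplified model, not the node's source: `process_items`, `handler`, and the `error`/`input` field names are hypothetical, and `continue_on_fail` stands in for the continueOnFail setting.

```python
from typing import Any, Callable, Dict, List


def process_items(
    items: List[Dict[str, Any]],
    handler: Callable[[Dict[str, Any]], Any],
    continue_on_fail: bool,
) -> List[Dict[str, Any]]:
    """Run handler on each item, either collecting errors or stopping at the first."""
    output: List[Dict[str, Any]] = []
    for index, item in enumerate(items):
        try:
            output.append({"json": handler(item)})
        except Exception as exc:
            if not continue_on_fail:
                # Stop on the first error and report which item failed.
                raise RuntimeError(f"Item {index} failed: {exc}") from exc
            # Keep going: record the error message with the original input data.
            output.append({"json": {"error": str(exc), "input": item}})
    return output
```

With `continue_on_fail` enabled, a failing item produces an error entry and later items are still processed; with it disabled, the first failure aborts the run.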
Resolutions:
- Verify API key and proxy credentials.
- Use the auto-retry modes for sites with anti-bot protections.
- Adjust the number of attempts property to allow retries.
- Check and correct CSS selectors or URLs.
- Review Scrappey API usage limits and adjust workflow accordingly.