DataForSEO

Overview

The "Parse Page Content" operation of the "On Page" resource fetches and analyzes the content of a specified web page URL. Optionally, it can emulate browser rendering, execute JavaScript on the page, allow XMLHttpRequests, and adjust browser-like parameters (user agent, screen size, language) to approximate real user browsing behavior.

This operation is useful for scenarios such as:

  • Extracting structured data or raw HTML from dynamic web pages that require JavaScript execution.
  • Performing SEO audits by analyzing page content after full rendering.
  • Scraping content from pages that load resources asynchronously or require specific headers or user agents.
  • Testing how a page appears under different device screen sizes or locales.

Practical examples:

  • Fetching the fully rendered HTML of a product page that loads content via JavaScript.
  • Parsing meta tags and visible text after cookie consent popups are disabled.
  • Emulating a mobile device viewport to see mobile-specific content variations.
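
The practical examples above can be written down as node settings. The following is a minimal sketch, not an export from the node: the keys are the property labels listed under Properties, while the URLs, the example names, and the choice of which options to combine are illustrative assumptions.

```python
# Settings for the three practical examples, keyed by the property labels
# shown in the node UI. URLs and option combinations are assumptions.
EXAMPLE_SETTINGS = {
    # Fully rendered HTML of a JavaScript-driven product page.
    # Pages that fetch data asynchronously may also need the XHR option.
    "rendered_product_page": {
        "Target Page URL": "https://example.com/product/123",
        "Load the scripts available on a page?": True,
        "Additional Fields": {"Store HTML of a crawled page?": True},
    },
    # Meta tags and visible text without the cookie consent popup.
    "no_cookie_popup": {
        "Target Page URL": "https://example.com/article",
        "Additional Fields": {
            "Disable the popup requesting cookie consent from the user?": True,
        },
    },
    # Mobile viewport. Turning rendering on here is an assumption.
    "mobile_viewport": {
        "Target Page URL": "https://example.com/",
        "Emulate browser rendering?": True,
        "Additional Fields": {"Preset for browser screen parameters": "Mobile"},
    },
}


def enabled_flags(settings):
    """Return the sorted labels of all options that are switched on."""
    merged = {k: v for k, v in settings.items() if k != "Additional Fields"}
    merged.update(settings.get("Additional Fields", {}))
    return sorted(label for label, value in merged.items() if value is True)


for name, settings in EXAMPLE_SETTINGS.items():
    print(name, "->", enabled_flags(settings))
```

Listing only the switched-on options makes it easy to compare runs when a page parses differently under different settings.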

Properties

| Name | Meaning |
| --- | --- |
| Target Page URL | The URL of the web page to parse. This is required. |
| Load the scripts available on a page? | Whether to enable JavaScript execution on the page, allowing dynamic content loading. Defaults to false. |
| Emulate browser rendering? | Enables full browser rendering including styles, images, fonts, animations, videos, and other resources, simulating a real browser environment. Defaults to false. |
| Enable XMLHttpRequest on a page? | Allows XMLHttpRequests (XHR) to be executed on the page, enabling asynchronous data fetching. Defaults to false. |
| Additional Fields | A collection of optional advanced settings: |
| - Custom User Agent | Specify a custom User-Agent string to send with the request. |
| - Custom Javascript | Inject custom JavaScript code to run on the page after it loads. |
| - Preset for browser screen parameters | Choose a preset device type for screen parameters: Empty, Desktop, Mobile, or Tablet. |
| - Browser Screen Width | Set a custom screen width in pixels for the emulated browser viewport. |
| - Browser Screen Height | Set a custom screen height in pixels for the emulated browser viewport. |
| - Browser Screen Scale Factor | Set the scale factor (pixel density) for the emulated browser viewport. |
| - Store HTML of a crawled page? | If enabled, the raw HTML of the crawled page will be stored in the output. Defaults to false. |
| - Disable the popup requesting cookie consent from the user? | Prevents cookie consent popups from appearing during page load. Defaults to false. |
| - Accept Language | Sets the Accept-Language HTTP header to specify the preferred language/locale for the request (e.g., "en-US", "fr", etc.). |
| - Switch proxy pool? | Enables switching between proxy pools for requests. Defaults to false. |
| - Proxy Pool | Selects the proxy pool to use if switching is enabled. Options include Empty, US, and DE. |
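
To illustrate how these properties might translate into a request body, here is a hedged sketch. The API field names in `FIELD_MAP` (such as `enable_javascript` or `ip_pool_for_scan`) are assumptions and are not stated on this page. `build_payload` is a hypothetical helper, not part of the node. Verify the names against the DataForSEO API documentation before relying on them.

```python
from typing import Any

# Assumed mapping from node property labels to API request fields.
# None of these field names are confirmed by this page.
FIELD_MAP = {
    "Target Page URL": "url",
    "Load the scripts available on a page?": "enable_javascript",
    "Emulate browser rendering?": "enable_browser_rendering",
    "Enable XMLHttpRequest on a page?": "enable_xhr",
    "Custom User Agent": "custom_user_agent",
    "Custom Javascript": "custom_js",
    "Preset for browser screen parameters": "browser_preset",
    "Browser Screen Width": "browser_screen_width",
    "Browser Screen Height": "browser_screen_height",
    "Browser Screen Scale Factor": "browser_screen_scale_factor",
    "Store HTML of a crawled page?": "store_raw_html",
    "Disable the popup requesting cookie consent from the user?": "disable_cookie_popup",
    "Accept Language": "accept_language",
    "Switch proxy pool?": "switch_pool",
    "Proxy Pool": "ip_pool_for_scan",
}


def build_payload(properties: dict[str, Any]) -> dict[str, Any]:
    """Flatten node properties into a request body, skipping unset values."""
    flat = {k: v for k, v in properties.items() if k != "Additional Fields"}
    flat.update(properties.get("Additional Fields", {}))

    if not flat.get("Target Page URL"):
        raise ValueError("Target Page URL is required")

    payload: dict[str, Any] = {}
    for label, value in flat.items():
        if label not in FIELD_MAP:
            raise KeyError(f"Unknown property: {label}")
        # "Empty" is the placeholder option of the preset and proxy pool lists.
        if value is None or value == "" or value == "Empty":
            continue
        payload[FIELD_MAP[label]] = value
    return payload


print(build_payload({
    "Target Page URL": "https://example.com",
    "Load the scripts available on a page?": True,
    "Additional Fields": {
        "Preset for browser screen parameters": "Empty",
        "Accept Language": "en-US",
    },
}))
```

Rejecting unknown labels up front catches typos in property names that would otherwise be dropped silently.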

Output

The node outputs an array of JSON objects representing the parsed page content for each input item.

The json output typically includes:

  • Parsed data extracted from the page after applying the requested options (e.g., rendered HTML, metadata, or other structured information depending on the API response).
  • If "Store HTML of a crawled page?" is enabled, the raw HTML content of the page is included.
  • Other fields depend on the underlying API response but generally relate to the page's content and audit results.
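
Because the exact shape of the output depends on the API response, downstream code is safer when it searches for a field instead of assuming a fixed path. The sketch below shows one way to do that. The key `raw_html` and the structure of `sample_item` are hypothetical and used only for illustration.

```python
from typing import Any, Iterator


def find_values(data: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(data, dict):
        for name, value in data.items():
            if name == key:
                yield value
            yield from find_values(value, key)
    elif isinstance(data, list):
        for element in data:
            yield from find_values(element, key)


# Hypothetical output item; the real structure may differ.
sample_item = {
    "json": {
        "items": [
            {"raw_html": "<html></html>", "page_content": {"title": "Example"}},
        ],
    },
}

html = next(find_values(sample_item, "raw_html"), None)
print("raw HTML present:", html is not None)
```

Using `next(..., None)` returns `None` instead of raising when the field is absent, for example when "Store HTML of a crawled page?" is left disabled.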

No binary data output is documented for this operation.

Dependencies

  • Requires an active connection to the DataForSEO API service.
  • Needs an API key credential configured in n8n for authentication.
  • Network access to the target URLs and possibly proxy configuration if using proxy pools.
  • No additional external libraries beyond those bundled with the node.

Troubleshooting

  • Common issues:

    • Invalid or unreachable URL: Ensure the "Target Page URL" is correct and accessible.
    • JavaScript rendering failures: Some complex pages may fail to render properly; try toggling the "Emulate browser rendering?" or "Load the scripts available on a page?" options.
    • Proxy errors: If using proxy pools, verify proxy availability and credentials.
    • Cookie consent popups interfering with parsing: Enable "Disable the popup requesting cookie consent from the user?" to avoid blocking content.
  • Error messages:

    • "Something went wrong": Generic error indicating the operation function was not found or failed; check that the Resource and Operation are correctly set.
    • Network or timeout errors: May occur if the target page is slow or blocked; consider increasing timeouts or checking network connectivity.
    • Authentication errors: Verify that the API key credential is valid and has necessary permissions.
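
For the "Invalid or unreachable URL" case above, a quick local check can rule out malformed input before the node runs. This is a minimal sketch using only the Python standard library. `check_target_url` is a hypothetical helper, and the restriction to http and https is an assumption about what the service accepts.

```python
from urllib.parse import urlsplit


def check_target_url(url: str) -> list[str]:
    """Return a list of problems found in the URL (empty if none)."""
    problems = []
    stripped = url.strip()
    if stripped != url:
        problems.append("leading or trailing whitespace")
    try:
        parts = urlsplit(stripped)
    except ValueError as exc:  # e.g. a malformed IPv6 host
        return problems + [f"unparseable URL: {exc}"]
    # Assumption: only http and https targets are accepted.
    if parts.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parts.hostname:
        problems.append("missing host name")
    return problems


for candidate in ("https://example.com/page", "example.com/page"):
    print(candidate, "->", check_target_url(candidate) or "ok")
```

This only checks syntax. A URL that passes may still be unreachable, blocked, or slow to respond.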
