BrightData

Interact with Bright Data to scrape websites or use existing datasets from the marketplace to generate adapted snapshots.

Overview

This node retrieves snapshot metadata from the Bright Data platform. The "Get Snapshots" operation under the "Web Scraper" resource lists the snapshots generated from a selected dataset, restricted to a specified date range and filtered by snapshot status. This is useful for monitoring the progress of web scraping tasks and for managing their results.

Common scenarios include:

  • Retrieving recent snapshots of a dataset to analyze changes over time.
  • Filtering snapshots by their processing status (e.g., only those ready or failed).
  • Paginating through large numbers of snapshots using skip and limit parameters.
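
The skip/limit pagination mentioned above can be sketched as a small loop. This is a minimal illustration, not the node's own code: `fetch_page` is a hypothetical callback that stands in for one "Get Snapshots" call, and the sample snapshots (with assumed `id` and `status` fields) are made up so the sketch runs without network access.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical callback: takes (skip, limit) and returns one page of snapshot dicts.
FetchPage = Callable[[int, int], List[Dict]]


def iter_snapshots(fetch_page: FetchPage, limit: int = 50, start: int = 0) -> Iterator[Dict]:
    """Yield snapshots page by page until a short or empty page comes back."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    skip = start
    while True:
        page = fetch_page(skip, limit)
        yield from page
        if len(page) < limit:  # a short or empty page means there is nothing left
            break
        skip += limit


# In-memory stand-in for the real API, so the loop can be exercised offline.
fake_snapshots = [{"id": f"s_{i}", "status": "ready"} for i in range(120)]


def fake_fetch(skip: int, limit: int) -> List[Dict]:
    return fake_snapshots[skip:skip + limit]


all_ids = [snapshot["id"] for snapshot in iter_snapshots(fake_fetch, limit=50)]
print(len(all_ids))  # 120
```

Stopping on a short page avoids one extra request in the common case; when the total is an exact multiple of the limit, the loop ends on the following empty page instead.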

Practical example:
A user wants to get all snapshots marked as "Ready" for a particular dataset collected in the last week, limiting the results to 50 entries starting from the first record.
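
As a rough sketch, the inputs for this example could be assembled as follows. The query parameter names (`dataset_id`, `status`, `skip`, `limit`, `from_date`, `to_date`), the lowercase `ready` value, and the dataset ID `gd_example123` are assumptions made for illustration; the node sets these fields through its properties, and the exact names it sends are not stated in this documentation.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional
from urllib.parse import urlencode


def last_week_ready_params(dataset_id: str, now: Optional[datetime] = None) -> Dict[str, object]:
    """Build filters for "Ready" snapshots from the last 7 days, first 50 records."""
    now = now or datetime.now(timezone.utc)
    week_ago = now - timedelta(days=7)
    return {
        "dataset_id": dataset_id,                  # assumed parameter name
        "status": "ready",                         # assumed API value for "Ready"
        "skip": 0,                                 # start from the first record
        "limit": 50,                               # at most 50 entries
        "from_date": week_ago.date().isoformat(),  # ISO 8601 date
        "to_date": now.date().isoformat(),
    }


# A fixed "now" keeps the example reproducible.
fixed_now = datetime(2024, 5, 15, 12, 0, tzinfo=timezone.utc)
params = last_week_ready_params("gd_example123", now=fixed_now)
query = urlencode(params)
print(query)
# dataset_id=gd_example123&status=ready&skip=0&limit=50&from_date=2024-05-08&to_date=2024-05-15
```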

Properties

Name | Meaning
Dataset | Select the dataset from which to retrieve snapshots.
Status | Filter snapshots by their status. Options include: Building, Canceled, Collecting, Delivering, Digesting, Failed, Pending Developer Review, Pending Discovery Input, Pending Owner Review, Pending PDP Input, Queued For Developer Review, Ready, Rolling Back, Scheduled, Validating.
Skip | Number of snapshots to skip (for pagination).
Limit | Maximum number of snapshot results to return (minimum 1).
From Date | Start date (ISO 8601 format) to filter snapshots by creation or update time.
To Date | End date (ISO 8601 format) to filter snapshots by creation or update time.

Output

The node outputs JSON data representing the list of snapshots matching the query criteria. Each snapshot object typically contains metadata such as its ID, status, timestamps, and possibly other descriptive fields related to the dataset snapshot.
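
To illustrate how such output might be consumed downstream, here is a minimal sketch that counts snapshots per status and picks out the IDs for one status. The sample items and the field names `id`, `status`, and `created` are assumptions; the actual fields depend on what Bright Data returns.

```python
from collections import Counter
from typing import Dict, List

# Made-up sample shaped like the metadata described above (field names are assumed).
items: List[Dict[str, str]] = [
    {"id": "s_1", "status": "ready", "created": "2024-05-10T08:00:00Z"},
    {"id": "s_2", "status": "failed", "created": "2024-05-11T09:30:00Z"},
    {"id": "s_3", "status": "ready", "created": "2024-05-12T14:15:00Z"},
]


def count_by_status(snapshots: List[Dict[str, str]]) -> Counter:
    """Count snapshots per status; items without a status are counted as 'unknown'."""
    return Counter(snapshot.get("status", "unknown") for snapshot in snapshots)


def ids_with_status(snapshots: List[Dict[str, str]], status: str) -> List[str]:
    """Return the IDs of snapshots whose status matches exactly."""
    return [s["id"] for s in snapshots if s.get("status") == status]


print(dict(count_by_status(items)))     # {'ready': 2, 'failed': 1}
print(ids_with_status(items, "ready"))  # ['s_1', 's_3']
```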

No binary output is indicated for this operation. Binary data, if any were returned, would represent raw scraped content or files associated with snapshots; "Get Snapshots" focuses on metadata retrieval.

Dependencies

  • Requires an API key credential for authenticating with the Bright Data platform.
  • The node communicates with the Bright Data API endpoint at https://api.brightdata.com.
  • No additional external dependencies are indicated.
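
For orientation, a request to the endpoint above might be prepared as in the sketch below, which builds the request object without sending it. Only the base URL comes from this documentation. The path `/datasets/v3/snapshots`, the `Authorization: Bearer` scheme, the environment variable name `BRIGHTDATA_API_KEY`, and the dataset ID are assumptions for illustration; inside n8n the credential is supplied by the node itself.

```python
import os
from typing import Dict
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.brightdata.com"    # stated in the dependencies above
SNAPSHOTS_PATH = "/datasets/v3/snapshots"  # assumed path, not confirmed here


def build_snapshots_request(api_key: str, params: Dict[str, object]) -> Request:
    """Prepare (but do not send) an authenticated GET request for snapshot metadata."""
    if not api_key:
        raise ValueError("API key is missing")
    url = f"{BASE_URL}{SNAPSHOTS_PATH}?{urlencode(params)}"
    # Assumed auth scheme: the API key passed as a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {api_key}"})


# Hypothetical environment variable; falls back to a dummy key for the demo.
api_key = os.environ.get("BRIGHTDATA_API_KEY") or "demo-key"
request = build_snapshots_request(api_key, {"dataset_id": "gd_example123"})
print(request.get_method(), request.full_url)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose so the sketch has no network side effects.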

Troubleshooting

  • Invalid or missing API credentials: Ensure that a valid API key credential is configured in n8n for the Bright Data service.
  • Date range errors: The "From Date" and "To Date" must be valid ISO 8601 dates; invalid formats may cause request failures.
  • No snapshots returned: Check if the dataset ID is correct and if snapshots exist for the given filters and date range.
  • API rate limits or network issues: May cause request failures or incomplete data; verify network connectivity and API usage limits.
  • Status filter misuse: Using an incorrect status value will result in no matches; use one of the predefined status options.
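
Several of the issues above can be caught before the node runs. The following is a minimal pre-flight sketch, assuming the inputs arrive as plain strings and integers. The status labels are copied from the Properties section; the rule that "From Date" must not be later than "To Date" is an assumption rather than something this documentation states. Note that `datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions.

```python
from datetime import datetime
from typing import List

# Status labels as listed in the Properties section.
STATUS_OPTIONS = {
    "Building", "Canceled", "Collecting", "Delivering", "Digesting", "Failed",
    "Pending Developer Review", "Pending Discovery Input", "Pending Owner Review",
    "Pending PDP Input", "Queued For Developer Review", "Ready", "Rolling Back",
    "Scheduled", "Validating",
}


def parse_iso8601(value: str) -> datetime:
    """Parse an ISO 8601 string; a trailing 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def check_inputs(from_date: str, to_date: str, skip: int, limit: int, status: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the inputs look fine."""
    problems = []
    parsed = {}
    for name, value in (("From Date", from_date), ("To Date", to_date)):
        try:
            parsed[name] = parse_iso8601(value)
        except ValueError:
            problems.append(f"{name} is not a valid ISO 8601 date: {value!r}")
    if len(parsed) == 2:
        try:
            # Assumed rule: the range must not be inverted.
            if parsed["From Date"] > parsed["To Date"]:
                problems.append("From Date is later than To Date")
        except TypeError:
            problems.append("From Date and To Date mix timezone-aware and naive values")
    if skip < 0:
        problems.append("Skip must be 0 or greater")
    if limit < 1:
        problems.append("Limit must be at least 1")
    if status not in STATUS_OPTIONS:
        problems.append(f"Unknown status: {status!r}")
    return problems


print(check_inputs("2024-05-08", "2024-05-15", 0, 50, "Ready"))  # []
print(check_inputs("08/05/2024", "2024-05-15", 0, 0, "Done"))
```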
