BrightData

Interact with Bright Data to scrape websites, or use existing datasets from the marketplace to generate adapted snapshots.

Overview

This operation of the Marketplace Dataset resource lists the snapshots of a selected Bright Data marketplace dataset. It returns metadata for each snapshot, filtered by status and, optionally, by a specific view ID. This is useful for workflows that monitor or process data snapshots, such as data analysis pipelines, automated reporting, or integrations with other systems that consume snapshot data.

Practical examples include:

  • Automatically fetching all "ready" snapshots of a dataset to trigger downstream processing.
  • Filtering snapshots by status to handle only those that are completed or in progress.
  • Using the view ID to narrow down snapshots relevant to a particular perspective or subset of the dataset.
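The first two examples can be sketched as a small filtering step that runs on the node's output. This is an illustration only: the field names `id` and `status`, the lowercase status values, and the sample records are assumptions, not confirmed by this page.

```python
from typing import Any


def select_by_status(
    snapshots: list[dict[str, Any]], wanted: set[str]
) -> list[dict[str, Any]]:
    """Keep only the snapshots whose status is in `wanted`."""
    return [s for s in snapshots if s.get("status") in wanted]


# Hypothetical snapshot records; real output may use other field names.
sample = [
    {"id": "snap_a", "status": "ready"},
    {"id": "snap_b", "status": "building"},
    {"id": "snap_c", "status": "failed"},
]

ready = select_by_status(sample, {"ready"})
in_progress = select_by_status(sample, {"building", "collecting"})
```

In a workflow, the `ready` list would be what gets handed to downstream processing, while `in_progress` could feed a monitoring branch.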

Properties

Name | Meaning
Dataset | Select the dataset from the marketplace to list snapshots for. Supports searching datasets by name.
View | (Optional) The ID of the view used to filter snapshots within the selected dataset.
Status | Filter snapshots by their current status. Options: Building, Canceled, Collecting, Delivering, Digesting, Failed, Pending Developer Review, Pending Discovery Input, Pending Owner Review, Pending PDP Input, Queued For Developer Review, Ready, Rolling Back, Scheduled, Validating. Default: "Ready".

Output

The node outputs JSON data representing the list of snapshots matching the specified filters. Each snapshot object typically contains metadata such as snapshot ID, status, creation date, and possibly other descriptive fields related to the snapshot's state and content.
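As a rough sketch of consuming this output, the snippet below picks the most recently created snapshot from a list of items. The field names `id`, `status`, and `created`, and the ISO 8601 timestamp format, are assumptions made for illustration; check the actual output of the node before relying on them.

```python
from datetime import datetime
from typing import Any, Optional


def latest_snapshot(items: list[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Return the item with the newest `created` timestamp, or None."""
    # Skip items without a creation date rather than failing on them.
    dated = [s for s in items if s.get("created")]
    if not dated:
        return None
    return max(dated, key=lambda s: datetime.fromisoformat(s["created"]))


# Hypothetical node output; the real fields may differ.
output_items = [
    {"id": "snap_old", "status": "ready", "created": "2024-01-10T08:00:00+00:00"},
    {"id": "snap_new", "status": "ready", "created": "2024-03-02T12:30:00+00:00"},
    {"id": "snap_undated", "status": "ready"},
]

newest = latest_snapshot(output_items)
```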

Binary output is not indicated for this operation. If it were supported, it would carry snapshot files or related downloadable content.

Dependencies

  • Requires an API key credential for authenticating with the Bright Data platform.
  • The node makes HTTP requests to the Bright Data API endpoint https://api.brightdata.com.
  • No additional external dependencies are indicated beyond the API access.
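For orientation, the request the node sends might look roughly like the sketch below, which only builds the request and does not send it. Only the base URL comes from this page. The path `/datasets/snapshots`, the query parameter names `dataset_id`, `view_id`, and `status`, and the Bearer authorization scheme are assumptions that should be checked against the Bright Data API reference.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.brightdata.com"  # from the Dependencies list above


def build_snapshot_request(
    api_key: str,
    dataset_id: str,
    status: str = "ready",
    view_id: Optional[str] = None,
) -> Request:
    """Build (but do not send) a GET request that lists snapshots."""
    # Assumed parameter names; verify against the official API reference.
    params = {"dataset_id": dataset_id, "status": status}
    if view_id is not None:
        params["view_id"] = view_id
    url = f"{BASE_URL}/datasets/snapshots?{urlencode(params)}"  # assumed path
    # Assumed auth scheme: API key passed as a Bearer token.
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


# Placeholder values, not real identifiers.
req = build_snapshot_request("YOUR_API_KEY", "gd_example")
```

Passing `req` to `urllib.request.urlopen` would perform the call; it is left out here so the sketch runs without network access or credentials.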

Troubleshooting

  • Common issues:
    • Invalid or missing API credentials will cause authentication failures.
    • Selecting a non-existent dataset or view ID may result in empty results or errors.
    • Filtering by a status that does not match any snapshots will return no data.
  • Error messages:
    • HTTP errors from the API (e.g., 401 Unauthorized) indicate credential problems.
    • 404 Not Found may occur if the dataset or view ID is invalid.
  • Resolutions:
    • Verify that the API key credential is correctly configured and has necessary permissions.
    • Confirm dataset and view IDs exist and are accessible.
    • Adjust status filters to valid values as per the options list.
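The error-to-resolution mapping above can be expressed as a small lookup, for example in a step that post-processes failures. The function name and the message wording are hypothetical; only the 401 and 404 cases come from this page.

```python
# Hints for the HTTP status codes named in the troubleshooting list.
HINTS = {
    401: "Check that the API key credential is configured and has the necessary permissions.",
    404: "Confirm that the dataset and view IDs exist and are accessible.",
}


def explain_http_error(status_code: int) -> str:
    """Return a troubleshooting hint for an HTTP status code."""
    return HINTS.get(
        status_code,
        f"Unexpected HTTP status {status_code}; inspect the API response body.",
    )


hint = explain_http_error(401)
```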
