npm-crawler

An n8n node that crawls the npm registry and extracts information about n8n community nodes (or any npm packages)

Package Information

Released: 11/19/2025
Downloads: 325 weekly / 325 monthly
Latest Version: 2.1.2
Author: Your Name

Documentation

n8n-nodes-npm-crawler


This is an n8n community node that allows you to crawl and extract information about n8n community nodes (or any npm packages) from the npm registry.

n8n is a fair-code licensed workflow automation platform.

Installation

Follow the installation guide in the n8n community nodes documentation.

Using npm

npm install n8n-nodes-npm-crawler

In n8n

  1. Go to Settings > Community Nodes
  2. Click Install
  3. Enter n8n-nodes-npm-crawler in the Enter npm package name field
  4. Click Install
  5. Agree to the risks of using community nodes

How It Works

This node crawls the npm registry search API to extract package information. It's designed to be simple and flexible:

  1. Enter a search query (e.g., "n8n-nodes-" for all n8n community nodes)
  2. Set Total Pages to "~" to crawl all pages, or a number to crawl that many pages
  3. Configure page size (how many results per page, max 250)
  4. Run - the node will crawl page by page and return all collected data
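The loop above can be sketched against the public npm registry search endpoint (`https://registry.npmjs.org/-/v1/search`, which accepts `text`, `size`, and `from` parameters). The function names below are illustrative, not the node's actual internals:

```typescript
// Sketch of the crawl loop, assuming the public npm search endpoint.
const REGISTRY = "https://registry.npmjs.org/-/v1/search";

export function buildSearchUrl(query: string, page: number, pageSize: number): string {
  const params = new URLSearchParams({
    text: query,
    size: String(pageSize),
    from: String(page * pageSize), // offset-based pagination
  });
  return `${REGISTRY}?${params}`;
}

// Crawl page by page and collect every result object.
export async function crawl(query: string, totalPages: number, pageSize = 20) {
  const objects: unknown[] = [];
  for (let page = 0; page < totalPages; page++) {
    const res = await fetch(buildSearchUrl(query, page, pageSize));
    const data = await res.json();
    objects.push(...data.objects);
    if (objects.length >= data.total) break; // reached the last page
  }
  return objects;
}
```

For example, `buildSearchUrl("n8n-nodes-", 2, 20)` requests results 40-59 for the third page.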

Parameters

Search Query (required)

  • Default: n8n-nodes-
  • Description: What to search for in the npm registry
  • Examples:
    • n8n-nodes- - All n8n community nodes
    • n8n-nodes-google - Google-related n8n nodes
    • react - All React packages
    • @types/ - All TypeScript type definitions

Total Pages (required)

  • Default: ~ (all pages)
  • Description: How many pages to crawl
  • Options:
    • ~ - Automatically crawl all available pages (recommended)
    • 10 - Crawl exactly 10 pages
    • 1 - Crawl only the first page

How "~" works:

  1. Node fetches the first page
  2. Calculates total pages from result count
  3. Crawls all remaining pages
  4. Stops when reaching the last page
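The page-count calculation in step 2 is a simple ceiling division over the `total` field returned with the first page (function name is illustrative):

```typescript
// Sketch of the "~" auto-detection: derive the page count from the
// total result count reported by the registry's first response.
export function totalPagesFor(totalCount: number, pageSize: number): number {
  if (pageSize <= 0) throw new Error("pageSize must be positive");
  return Math.ceil(totalCount / pageSize);
}
```

With 560 results and the default page size of 20, this yields 28 pages.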

Page Size (optional)

  • Default: 20
  • Min: 1
  • Max: 250
  • Description: Number of results per page

Request Delay (optional)

  • Default: 1000 ms
  • Description: Delay between requests to avoid rate limiting
  • Recommended: 1000-2000ms for reliability

Options (optional)

Timeout

  • Default: 30000ms (30 seconds)
  • Description: Request timeout in milliseconds

Retry on Error

  • Default: false
  • Description: Whether to retry failed requests

Max Retries

  • Default: 3
  • Description: Maximum number of retries (only when "Retry on Error" is enabled)
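The interplay of Retry on Error, Max Retries, and Request Delay can be sketched as a retry wrapper; this is a hedged illustration of the behaviour, not the node's actual implementation:

```typescript
// Re-run a request up to `maxRetries` extra times, waiting
// `delayMs` between attempts (mirrors Request Delay).
const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

export async function withRetry<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) await sleep(delayMs);
    }
  }
  throw lastError; // all attempts exhausted
}
```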

Output

The node returns a JSON object with the following structure:

{
  "totalCount": 560,
  "searchQuery": "n8n-nodes-",
  "crawledAt": "2025-01-19T10:30:00.000Z",
  "nodes": [
    {
      "name": "n8n-nodes-example",
      "version": "1.0.0",
      "description": "Example node description",
      "keywords": ["n8n", "automation"],
      "author": {
        "name": "Developer Name"
      },
      "publisher": {
        "username": "publisher",
        "email": "email@example.com"
      },
      "maintainers": [...],
      "links": {
        "npm": "https://www.npmjs.com/package/n8n-nodes-example",
        "homepage": "...",
        "repository": "..."
      },
      "publishedDate": "2024-01-01T00:00:00.000Z",
      "score": {
        "final": 0.75,
        "detail": {
          "quality": 0.8,
          "popularity": 0.7,
          "maintenance": 0.75
        }
      },
      "searchScore": 0.85
    }
    // ... more nodes
  ]
}
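For TypeScript consumers, the JSON above can be described with interfaces like these (field optionality is an assumption; not every npm package populates every field):

```typescript
// Hedged TypeScript shape of the node's output, mirroring the JSON above.
export interface PackageInfo {
  name: string;
  version: string;
  description?: string;
  keywords?: string[];
  author?: { name: string };
  publisher?: { username: string; email: string };
  maintainers?: Array<{ username: string; email?: string }>;
  links?: { npm?: string; homepage?: string; repository?: string };
  publishedDate?: string;
  score?: {
    final: number;
    detail: { quality: number; popularity: number; maintenance: number };
  };
  searchScore?: number;
}

export interface CrawlResult {
  totalCount: number;
  searchQuery: string;
  crawledAt: string; // ISO 8601 timestamp
  nodes: PackageInfo[];
}
```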

Use Cases

1. Build a Node Directory

Crawl all n8n nodes and store them in a database:

[Manual Trigger]
  → [N8N Nodes Crawler]
      Search Query: n8n-nodes-
      Total Pages: ~
  → [Function (Transform Data)]
  → [Airtable/Database (Store)]

2. Monitor New Nodes

Daily check for new community nodes:

[Schedule Trigger (Daily)]
  → [N8N Nodes Crawler]
      Search Query: n8n-nodes-
      Total Pages: ~
  → [Filter (Compare with Previous)]
  → [Slack (Notify New Nodes)]

3. Search Specific Categories

Find all Google-related n8n nodes:

[Manual Trigger]
  → [N8N Nodes Crawler]
      Search Query: n8n-nodes-google
      Total Pages: ~
  → [Google Sheets (Export)]

4. Analyze Popular Packages

Find the most popular React packages:

[Manual Trigger]
  → [N8N Nodes Crawler]
      Search Query: react
      Total Pages: 5
  → [Function (Sort by Score)]
  → [Email (Send Report)]
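The "Sort by Score" step above could be implemented along these lines (a hypothetical sketch; adapt it to the n8n Function/Code node's item format):

```typescript
// Order packages by their final score, highest first;
// entries without a score sort to the end.
interface Scored { name: string; score?: { final: number } }

export function sortByScore<T extends Scored>(nodes: T[]): T[] {
  return [...nodes].sort(
    (a, b) => (b.score?.final ?? 0) - (a.score?.final ?? 0),
  );
}
```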

Tips & Best Practices

Getting Complete Data

  • Always use "~" for Total Pages when you need all results
  • The node automatically detects the last page
  • No need to guess or manually count pages

Avoiding Rate Limits

  • Set Request Delay to at least 1000ms (1 second)
  • For large crawls, use 2000ms or higher
  • Enable Retry on Error for reliability

Performance

  • Page Size: Larger = fewer requests, but may hit limits
    • Recommended: 20-50 for general use
    • Max 250 for faster crawls
  • Total Pages: Using "~" is recommended for completeness
    • Use specific numbers only for testing or partial data

Search Tips

  • Be specific: n8n-nodes-google vs google
  • Use prefixes: @types/ for TypeScript definitions
  • Combine with n8n filters for advanced queries

Example Workflows

Complete Node Catalog

[Schedule: Weekly]
  → [N8N Nodes Crawler]
      Search Query: n8n-nodes-
      Total Pages: ~
      Page Size: 50
  → [Set (Transform)]
  → [MongoDB (Upsert by name)]
  → [Slack (Send Summary)]

Quality Analysis

[Manual Trigger]
  → [N8N Nodes Crawler]
      Search Query: n8n-nodes-
      Total Pages: ~
  → [Function (Calculate Stats)]
      - Average score
      - Category distribution
      - Top maintainers
  → [Google Sheets (Create Report)]
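The "Calculate Stats" step could compute the average score like this (a hypothetical sketch of one of the listed statistics):

```typescript
// Average the final scores across all crawled packages,
// skipping entries that have no score at all.
export function averageScore(nodes: Array<{ score?: { final: number } }>): number {
  const scores = nodes.flatMap((n) => (n.score ? [n.score.final] : []));
  if (scores.length === 0) return 0;
  return scores.reduce((sum, s) => sum + s, 0) / scores.length;
}
```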

Troubleshooting

"Failed to fetch first page"

  • Cause: Network issue or npm registry down
  • Solution: Check internet connection, retry later

Crawling stops early

  • Cause: Invalid Total Pages value
  • Solution: Use "~" for all pages or a valid number

Rate limiting errors

  • Cause: Too many requests too quickly
  • Solution: Increase Request Delay parameter

Incomplete results

  • Cause: npm API pagination limits
  • Solution: Use default Page Size (20) for best results

Version History

2.0.0

  • Breaking Change: Unified interface - no more separate operations
  • Universal Search Query parameter
  • Simplified Total Pages - works for all searches
  • Cleaner, more intuitive UI

1.1.0

  • Smart Total Pages with "~" symbol
  • Automatic page detection
  • Enhanced progress logging

1.0.0

  • Initial release
  • Basic crawling functionality
  • Separate Get All Nodes and Search Nodes operations

Development

See DEVELOPMENT.md for detailed development instructions.

Quick Start

# Install dependencies
npm install

# Build
npm run build

# Test locally
npm link
cd ~/.n8n
npm link n8n-nodes-npm-crawler
n8n start

License

MIT

Support

If you have any issues or questions, please:

  1. Check QUICKSTART.md for common usage patterns
  2. Review CHANGELOG.md for recent changes
  3. Open an issue on GitHub

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.