scrapeninja

Skill from vm0-ai/vm0-skills

What it does

Scrapes websites with advanced anti-bot protection using Chrome TLS fingerprint, rotating proxies, and optional JavaScript rendering.

Part of vm0-ai/vm0-skills (138 items)

Installation

Add the marketplace to Claude Code:

/plugin marketplace add vm0-ai/vm0-skills

Install the plugin from the marketplace:

/plugin install scrapeninja@vm0-skills

Or clone the repository:

git clone https://github.com/vm0-ai/vm0-skills.git
Extracted from docs: vm0-ai/vm0-skills
13 installs · Added Feb 4, 2026

Skill Details

SKILL.md

High-performance web scraping API with Chrome TLS fingerprint and JS rendering

Overview

# ScrapeNinja

High-performance web scraping API with Chrome TLS fingerprint, rotating proxies, smart retries, and optional JavaScript rendering.

> Official docs: https://scrapeninja.net/docs/

---

When to Use

Use this skill when you need to:

  • Scrape websites with anti-bot protection (Cloudflare, Datadome)
  • Extract data without running a full browser (fast /scrape endpoint)
  • Render JavaScript-heavy pages (/scrape-js endpoint)
  • Use rotating proxies with geo selection (US, EU, Brazil, etc.)
  • Extract structured data with Cheerio extractors
  • Intercept AJAX requests
  • Take screenshots of pages

---

Prerequisites

  1. Get an API key from RapidAPI or APIRoad:
     • RapidAPI: https://rapidapi.com/restyler/api/scrapeninja
     • APIRoad: https://apiroad.net/marketplace/apis/scrapeninja
  2. Set the environment variable:

```bash
# For RapidAPI
export SCRAPENINJA_API_KEY="your-rapidapi-key"

# For APIRoad (use X-Apiroad-Key header instead)
export SCRAPENINJA_API_KEY="your-apiroad-key"
```

---

> Important: When using $VAR in a command that pipes to another command, wrap the command containing $VAR in bash -c '...'. Due to a Claude Code bug, environment variables are silently cleared when pipes are used directly.

> ```bash
> bash -c 'curl -s "https://api.example.com" -H "Authorization: Bearer $API_KEY"'
> ```
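The pattern can be verified locally without any API call. In this hypothetical snippet (`DEMO_KEY` is illustrative, not a real key), the variable survives the pipe because the command that reads it runs inside `bash -c`:

```shell
# Local demonstration only, no network involved.
export DEMO_KEY="abc123"
bash -c 'printf "key=%s\n" "$DEMO_KEY"' | tr 'a-z' 'A-Z'
# prints KEY=ABC123
```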

How to Use

1. Basic Scrape (Non-JS, Fast)

High-performance scraping with Chrome TLS fingerprint, no JavaScript:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com"
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq '{status: .info.statusCode, url: .info.finalUrl, bodyLength: (.body | length)}'
```

With custom headers and retries:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "headers": ["Accept-Language: en-US"],
  "retryNum": 3,
  "timeout": 15
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json'
```

2. Scrape with JavaScript Rendering

For JavaScript-heavy sites (React, Vue, etc.):

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "waitForSelector": "h1",
  "timeout": 20
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq '{status: .info.statusCode, bodyLength: (.body | length)}'
```

With screenshot:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "screenshot": true
}
```

Then run:

```bash
# Extract the screenshot field from the response
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq -r '.info.screenshot'
```
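If the field contains base64-encoded PNG data, as shown in the Response Format section, it can be decoded straight to a file. The response below is hypothetical; a real one comes from the curl call above:

```shell
# Hypothetical response; "iVBORw0KGgo=" is the base64 of the 8-byte PNG file signature.
RESPONSE='{"info":{"screenshot":"iVBORw0KGgo="}}'
echo "$RESPONSE" | jq -r '.info.screenshot' | base64 -d > /tmp/screenshot.png
```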

3. Geo-Based Proxy Selection

Use proxies from specific regions:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "geo": "eu"
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq .info
```

Available geos: us, eu, br (Brazil), fr (France), de (Germany), 4g-eu

4. Smart Retries

Retry on specific HTTP status codes or text patterns:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "retryNum": 3,
  "statusNotExpected": [403, 429, 503],
  "textNotExpected": ["captcha", "Access Denied"]
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json'
```

5. Extract Data with Cheerio

Extract structured JSON using Cheerio extractor functions:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://news.ycombinator.com",
  "extractor": "function(input, cheerio) { let $ = cheerio.load(input); return $(\".titleline > a\").slice(0,5).map((i,el) => ({title: $(el).text(), url: $(el).attr(\"href\")})).get(); }"
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq '.extractor'
```
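Escaping the extractor's quotes inside the JSON by hand is error-prone. As a sketch, `jq --arg` can build the request file safely from a readable multi-line function (the variable names here are illustrative):

```shell
# Build the request JSON with jq so the extractor's quotes are escaped for us.
EXTRACTOR='function(input, cheerio) {
  let $ = cheerio.load(input);
  return $(".titleline > a").slice(0, 5)
    .map((i, el) => ({ title: $(el).text(), url: $(el).attr("href") }))
    .get();
}'
jq -n --arg url "https://news.ycombinator.com" --arg ex "$EXTRACTOR" \
  '{url: $url, extractor: $ex}' > /tmp/scrapeninja_request.json
```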

6. Intercept AJAX Requests

Capture XHR/fetch responses:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "catchAjaxHeadersUrlMask": "api/data"
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json' | jq '.info.catchedAjax'
```

7. Block Resources for Speed

Speed up JS rendering by blocking images and media:

Write to /tmp/scrapeninja_request.json:

```json
{
  "url": "https://example.com",
  "blockImages": true,
  "blockMedia": true
}
```

Then run:

```bash
bash -c 'curl -s -X POST "https://scrapeninja.p.rapidapi.com/scrape-js" --header "Content-Type: application/json" --header "X-RapidAPI-Key: ${SCRAPENINJA_API_KEY}" -d @/tmp/scrapeninja_request.json'
```

---

API Endpoints

| Endpoint | Description |
|----------|-------------|
| /scrape | Fast non-JS scraping with Chrome TLS fingerprint |
| /scrape-js | Full Chrome browser with JS rendering |
| /v2/scrape-js | Enhanced JS rendering for protected sites (APIRoad only) |

---

Request Parameters

Common Parameters (all endpoints)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| url | string | required | URL to scrape |
| headers | string[] | - | Custom HTTP headers |
| retryNum | int | 1 | Number of retry attempts |
| geo | string | us | Proxy geo: us, eu, br, fr, de, 4g-eu |
| proxy | string | - | Custom proxy URL (overrides geo) |
| timeout | int | 10/16 | Timeout per attempt in seconds |
| textNotExpected | string[] | - | Text patterns that trigger retry |
| statusNotExpected | int[] | [403, 502] | HTTP status codes that trigger retry |
| extractor | string | - | Cheerio extractor function |

JS Rendering Parameters (`/scrape-js`, `/v2/scrape-js`)

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| waitForSelector | string | - | CSS selector to wait for |
| postWaitTime | int | - | Extra wait time after load (1-12s) |
| screenshot | bool | true | Take page screenshot |
| blockImages | bool | false | Block image loading |
| blockMedia | bool | false | Block CSS/fonts loading |
| catchAjaxHeadersUrlMask | string | - | URL pattern to intercept AJAX |
| viewport | object | 1920x1080 | Custom viewport size |
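A request combining several of the JS-rendering parameters might look like the sketch below. The `{width, height}` shape of the viewport object is an assumption here, not confirmed by this document; check the official docs before relying on it:

```shell
# Write a combined JS-rendering request; viewport shape is assumed, verify in the docs.
cat > /tmp/scrapeninja_request.json <<'EOF'
{
  "url": "https://example.com",
  "waitForSelector": ".content",
  "postWaitTime": 2,
  "screenshot": false,
  "blockImages": true,
  "viewport": { "width": 1366, "height": 768 }
}
EOF
# Validate the JSON before sending it
jq -e . /tmp/scrapeninja_request.json > /dev/null && echo "valid"
```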

---

Response Format

```json
{
  "info": {
    "statusCode": 200,
    "finalUrl": "https://example.com",
    "headers": ["content-type: text/html"],
    "screenshot": "base64-encoded-png",
    "catchedAjax": {
      "url": "https://example.com/api/data",
      "method": "GET",
      "body": "...",
      "status": 200
    }
  },
  "body": "...",
  "extractor": { "extracted": "data" }
}
```
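These fields can be picked apart with jq, the same pattern used in the commands throughout this document. The response below is a hypothetical sample for illustration; a real one comes from the API:

```shell
# Hypothetical response, trimmed to the fields being demonstrated.
RESPONSE='{"info":{"statusCode":200,"finalUrl":"https://example.com"},"body":"<html></html>"}'
echo "$RESPONSE" | jq -c '{status: .info.statusCode, url: .info.finalUrl, bodyLength: (.body | length)}'
# prints {"status":200,"url":"https://example.com","bodyLength":13}
```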

---

Guidelines

  1. Start with /scrape: use the fast non-JS endpoint first; switch to /scrape-js only if needed
  2. Retries: Set retryNum to 2-3 for unreliable sites
  3. Geo Selection: Use eu for European sites, us for American sites
  4. Extractors: Test extractors at https://scrapeninja.net/cheerio-sandbox/
  5. Blocked Sites: For Cloudflare/Datadome protected sites, use /v2/scrape-js via APIRoad
  6. Screenshots: Set screenshot: false to speed up JS rendering
  7. Rate Limits: Check your plan limits on RapidAPI/APIRoad dashboard

---

Tools

  • Playground: https://scrapeninja.net/scraper-sandbox
  • Cheerio Sandbox: https://scrapeninja.net/cheerio-sandbox
  • cURL Converter: https://scrapeninja.net/curl-to-scraper
