Website Intel

Company Research

Scrape any website and extract structured data using a custom schema.

Tags: crawler, scraping, website, structured-data

Overview

Website Intel is an MCP server that scrapes any public website and returns structured JSON data based on a schema you define. It handles JavaScript-heavy SPAs, dynamic content, and can crawl multiple pages following links. Under the hood, it uses a headless browser with crawl4ai for intelligent data extraction — you describe what you want, and it pulls exactly that from any page. Whether you need pricing tables, team directories, product feature lists, or blog metadata, Website Intel transforms unstructured web pages into clean, typed data ready for your sales workflows.

Currently macOS only. Windows and Linux support coming soon.

Use Cases

Extract Pricing Page Data for Competitive Analysis

Your sales team needs to understand how a competitor structures their pricing tiers. Instead of manually copying pricing details from their website, you point Website Intel at their pricing page with a schema that defines fields like tier name, price, features included, and limits. The MCP renders the JavaScript-heavy pricing page, extracts every tier into structured JSON, and returns it in seconds.

Expected outcome: A clean JSON object with each pricing tier, its monthly and annual cost, feature list, and usage limits — ready to paste into a competitive battle card or feed into a comparison spreadsheet.
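A request for this use case might look like the following sketch. The argument names (url, schema, prompt, mode) come from the Parameters table below; the schema field names (tiers, price_monthly, and so on) and the URL are illustrative, not prescribed by the tool.

```python
# JSON Schema describing the pricing data we want back.
# Field names here are illustrative; define whatever your battle card needs.
pricing_schema = {
    "type": "object",
    "properties": {
        "tiers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "price_monthly": {"type": "string"},
                    "price_annual": {"type": "string"},
                    "features": {"type": "array", "items": {"type": "string"}},
                    "limits": {"type": "string"},
                },
            },
        }
    },
}

# Tool arguments as the Parameters table defines them.
request = {
    "url": "https://example.com/pricing",
    "schema": pricing_schema,
    "prompt": "Extract every pricing tier with its name, monthly and "
              "annual price, included features, and usage limits.",
    "mode": "scrape",  # single page, full JS rendering
}
```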

Build a Prospect List from a Conference Speaker Page

A major industry conference publishes its speaker lineup on a dynamic webpage. You want to build a prospect list from the speakers — names, titles, companies. You define a schema with those fields and set Website Intel to crawl the speaker directory, following pagination links across multiple pages.

Expected outcome: A structured list of every speaker with their name, job title, and company — typically 50 to 200 contacts from a single conference page, ready for outreach sequencing.
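In crawl mode the same request shape applies, with mode and limit set so the crawler follows pagination links. A hedged sketch, with an illustrative URL and schema:

```python
# Schema for the speaker directory; field names are illustrative.
speaker_schema = {
    "type": "object",
    "properties": {
        "speakers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "title": {"type": "string"},
                    "company": {"type": "string"},
                },
            },
        }
    },
}

request = {
    "url": "https://example.com/conference/speakers",
    "schema": speaker_schema,
    "prompt": "Extract each speaker's name, job title, and company.",
    "mode": "crawl",  # follow links to additional pages
    "limit": 10,      # maximum pages allowed per request
}
```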

Scrape Product Feature Lists for Qualification

Before reaching out to a prospect, you need to understand what their product actually does. You point Website Intel at their product or features page and ask it to extract the feature categories, individual features, and any integration mentions. This tells you whether they are a fit for your solution.

Expected outcome: A categorized list of the prospect's product features and integrations, enabling your rep to write a personalized first email that references specific capabilities.

Capabilities

  • Scrapes any public URL including JavaScript-rendered single-page applications
  • Crawls multiple pages by following links with configurable page limits (1–10 pages)
  • Accepts user-defined JSON schemas to extract exactly the data you need
  • Renders dynamic content using a headless browser (Playwright under the hood)
  • Uses LLM-powered extraction to intelligently map page content to your schema
  • Handles pagination, tabs, accordions, and other interactive UI elements
  • Returns clean, typed JSON output ready for downstream processing

Data Sources

  • Any Website: Scrapes and extracts structured data from any public URL

Tools

Scrapes a webpage or crawls multiple pages and extracts structured data as JSON using a custom schema. Supports single-page scraping with full JS rendering and multi-page crawling that follows links.

Parameters

  • url (string, required): Full URL to process (must include the http/https protocol)
  • schema (object, required): JSON Schema defining the desired output data structure
  • prompt (string, required): Natural-language extraction instructions describing what data to pull
  • mode (string, optional): "scrape" for a single page with JS rendering, or "crawl" for multi-page link following. Default: "scrape"
  • limit (integer, optional): Maximum pages to crawl in crawl mode. Range: 1–10. Default: 5
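The constraints above can be checked client-side before calling the tool. The helper below is a hypothetical sketch, not part of Website Intel itself; it simply encodes the rules stated in the Parameters table:

```python
def validate_args(args: dict) -> list[str]:
    """Check tool arguments against the documented parameter constraints."""
    errors = []
    for field in ("url", "schema", "prompt"):
        if field not in args:
            errors.append(f"{field} is required")
    if not str(args.get("url", "")).startswith(("http://", "https://")):
        errors.append("url must include the http/https protocol")
    if args.get("mode", "scrape") not in ("scrape", "crawl"):
        errors.append('mode must be "scrape" or "crawl"')
    if not 1 <= args.get("limit", 5) <= 10:
        errors.append("limit must be between 1 and 10")
    return errors
```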

Response Fields

  • JSON structured according to the user-provided schema

Dependencies

  • macOS (Windows and Linux support coming soon)
  • Python 3.10+
  • Node.js 20+
  • LLM API key (OpenAI, Anthropic, or compatible provider)
  • crawl4ai (installed automatically during setup)

Works With

Run Website Intel first to extract company details from a prospect's website (pricing pages, team pages, product features), then chain Techstack Intel to detect their tools and Social Intel to research key contacts on LinkedIn.

Quick Setup

git clone https://github.com/ekas-io/open-sales-stack.git
cd open-sales-stack
./scripts/setup.sh
./scripts/add-to-claude.sh --website-intel
See full setup instructions on GitHub →

Frequently Asked Questions

Does Website Intel work on websites that require JavaScript to render?
Yes. Website Intel uses a headless browser (Playwright via crawl4ai) to fully render pages before extraction. This means single-page applications built with React, Vue, Angular, or any other JavaScript framework are fully supported. The page is rendered just like a real browser would see it.
What happens if the website blocks scraping or requires login?
Website Intel works with publicly accessible pages. If a website requires authentication or blocks automated access, the extraction will fail gracefully. It does not support logging into websites or bypassing access controls. For gated content, you would need to provide an alternative data source.
How accurate is the data extraction compared to manual copy-paste?
The LLM-powered extraction is highly accurate for structured content like pricing tables, team directories, feature lists, and product catalogs. For less structured content like blog posts or marketing copy, accuracy depends on how specific your schema and extraction prompt are. More specific prompts yield better results.
Can I crawl an entire website?
You can crawl up to 10 pages per request in crawl mode. The crawler follows links from the starting page and extracts data from each page it visits. For larger sites, you can run multiple crawl requests targeting different sections (e.g., /pricing, /team, /blog).
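Splitting a larger site into one crawl request per section can be sketched as follows; the base URL, section paths, and placeholder schema are illustrative:

```python
# One crawl request per site section, each within the 10-page limit.
base = "https://example.com"
sections = ["/pricing", "/team", "/blog"]

requests_batch = [
    {
        "url": base + path,
        "schema": {"type": "object"},  # use a section-specific schema in practice
        "prompt": f"Extract structured data from the {path} section.",
        "mode": "crawl",
        "limit": 10,
    }
    for path in sections
]
```

Each request's results can then be merged downstream, since every response is JSON shaped by the schema you supplied.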
Do I need to pay for an API key to use Website Intel?
Yes. Website Intel requires an LLM API key (OpenAI, Anthropic, or compatible provider) for the intelligent extraction step. The scraping and crawling itself is free — you only pay for the LLM calls used to interpret page content against your schema.
What is the difference between scrape mode and crawl mode?
Scrape mode processes a single URL — it loads the page, renders JavaScript, and extracts data. Crawl mode starts at a URL and follows links to discover additional pages, extracting data from each one. Use scrape for a single page like a pricing table, and crawl for multi-page content like a blog archive or team directory.

Need help with this MCP?

This MCP is open source. Need help integrating it into your sales stack, or want us to build something custom?

Book a Call →