Does agentic_web_crawl support JavaScript-rendered pages?

No. It performs a standard HTTP fetch and does not execute JavaScript, so it works with server-rendered HTML. For pages that need JavaScript to render their content (SPAs, React apps without SSR), it returns the raw HTML shell without the dynamic content.
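One way to spot an empty shell is to check how little readable text came back. The sketch below is a rough heuristic, not part of the tool: the helper name `looks_like_js_shell` and the 50-word threshold are hypothetical, and the field names are taken from the example output on this page.

```python
def looks_like_js_shell(result: dict, min_words: int = 50) -> bool:
    """Heuristic: a 200 response with very little text and no headings
    often means the content is rendered client-side."""
    seo = result.get("seo") or {}
    headings = result.get("headings") or {}
    word_count = seo.get("word_count") or 0
    has_headings = any(headings.get(level) for level in ("h1", "h2", "h3"))
    return (
        result.get("status_code") == 200
        and word_count < min_words
        and not has_headings
    )


# Hypothetical results shaped like the example output.
shell = {"status_code": 200, "seo": {"word_count": 4}, "headings": {}}
article = {
    "status_code": 200,
    "seo": {"word_count": 1842},
    "headings": {"h1": ["Example Article Title"]},
}

print(looks_like_js_shell(shell))
print(looks_like_js_shell(article))
```

A positive result is only a hint. Short pages with real content can trip the threshold, so tune `min_words` for the sites you crawl.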

Does it follow redirects?

Yes. agentic_web_crawl follows HTTP redirects automatically and returns data for the final destination URL, including the resolved canonical URL.
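If you need to know whether a redirect happened, you can compare the URL you requested with the one in the result. This is a minimal sketch that assumes the `url` field in the output holds the final destination URL; the helper names are hypothetical.

```python
from urllib.parse import urlsplit


def _normalize(url: str) -> tuple:
    """Reduce a URL to comparable parts, ignoring case and trailing slashes."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return (parts.scheme.lower(), parts.netloc.lower(), path, parts.query)


def was_redirected(requested_url: str, result: dict) -> bool:
    """True when the result's URL differs from the requested one."""
    final_url = result.get("url", requested_url)
    return _normalize(requested_url) != _normalize(final_url)


# Hypothetical result: the site upgraded http to https.
result = {"url": "https://example.com/article"}
print(was_redirected("http://example.com/article", result))
```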

Is there a rate limit or cost for using it?

No. agentic_web_crawl is free and has no built-in rate limit. It makes standard HTTP requests from your machine — the same as curl. Be mindful of the target site's own rate limiting policies.
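Because the tool does not throttle itself, a caller that crawls many URLs on one site may want its own delay. The following is a sketch of a per-host throttle, not something the tool provides; `HostThrottle` and its one-second default are assumptions for illustration.

```python
import time
from urllib.parse import urlsplit


class HostThrottle:
    """Enforce a minimum interval between requests to the same host."""

    def __init__(self, min_interval: float = 1.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = {}  # host -> time of the most recent request

    def wait(self, url: str) -> float:
        """Sleep if needed before requesting `url`; return the delay used."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        last = self._last_call.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        self._last_call[host] = now + delay
        return delay


throttle = HostThrottle(min_interval=0.01)
for url in ("https://example.com/a", "https://example.com/b"):
    throttle.wait(url)
    # Call agentic_web_crawl for `url` here through your MCP client.
```

The clock and sleep functions are injectable so the throttle can be tested without real waiting.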

How is this different from agentic_web_search?

agentic_web_crawl fetches a specific URL you provide and returns its full content and metadata. agentic_web_search takes a search query and returns ranked results from the web — like a search engine. Use crawl when you know the URL; use search when you need to discover relevant pages.

Why isn't the og:image or Twitter card showing up in the output?

Some pages don't set these tags. If a field is missing from the output, the tag wasn't present in the HTML of the page at the time of the fetch. Tags injected by client-side JavaScript won't appear either, because the tool does not execute scripts.
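To see at a glance which social tags a page lacks, you can compare the output against the tags listed under "What it extracts". This is a small sketch; `missing_social_tags` is a hypothetical helper, and it assumes absent tags are simply omitted from the `open_graph` and `twitter_card` objects.

```python
EXPECTED_TAGS = {
    "open_graph": ("og:title", "og:description", "og:image", "og:type"),
    "twitter_card": ("twitter:card", "twitter:site", "twitter:creator"),
}


def missing_social_tags(result: dict) -> list:
    """Return the expected tags that are absent or empty in the result."""
    missing = []
    for section, tags in EXPECTED_TAGS.items():
        found = result.get(section) or {}
        missing.extend(tag for tag in tags if not found.get(tag))
    return missing


# Trimmed copy of the example output from this page.
result = {
    "open_graph": {
        "og:title": "Example Article Title",
        "og:description": "A concise description of the page content.",
        "og:image": "https://example.com/og-image.jpg",
        "og:type": "article",
    },
    "twitter_card": {
        "twitter:card": "summary_large_image",
        "twitter:site": "@example",
    },
}

print(missing_social_tags(result))  # prints ['twitter:creator']
```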

AgenticData

Web Crawl & Page Extractor MCP Tool

CLI Tool Name: agentic_web_crawl

Fetches any publicly accessible URL and returns structured page content with a full SEO signal report. Designed for AI agents that need to read web pages, audit content, or extract structured metadata — without relying on search APIs or browser automation.

Parameters

Parameter	Type	Required	Description
url	string	yes	The full URL to fetch and analyze, including the protocol (https://).
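Since the `url` parameter must include the protocol, it can help to validate input before calling the tool. The check below is a sketch under assumptions: `is_valid_crawl_url` is a hypothetical helper, and accepting `http://` alongside `https://` is a guess rather than documented behavior.

```python
from urllib.parse import urlsplit


def is_valid_crawl_url(url: str) -> bool:
    """True if the URL has an http(s) scheme and a host part."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some malformed inputs, such as bad IPv6 hosts.
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


print(is_valid_crawl_url("https://example.com/article"))  # prints True
print(is_valid_crawl_url("example.com/article"))          # prints False
```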

What it extracts

Page text & content

Full readable text extracted from the page body, plus a word count.

SEO metadata

Title, meta description, canonical URL, robots directives.

Open Graph tags

og:title, og:description, og:image, og:type — for social sharing previews.

Twitter Card

twitter:card, twitter:site, twitter:creator.

Heading structure

All H1–H6 headings extracted in order — reveals content architecture.

Link breakdown

Internal and external link counts, plus the number of nofollow links.

Common use cases

→ Read and summarize any web page without leaving your AI client

→ Audit competitor landing pages for SEO gaps

→ Verify that your own pages have correct meta tags and canonical URLs

→ Extract structured content from web pages for research

→ Check that Open Graph tags are set correctly before publishing

→ Monitor pages for changes in title, description, or heading structure
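The monitoring use case can be sketched as a comparison of two saved results for the same URL. The helper names below are hypothetical, the field names follow the example output on this page, and how you store snapshots between runs is left to you.

```python
def watched_fields(result: dict) -> dict:
    """Pick out the fields worth monitoring from a crawl result."""
    seo = result.get("seo") or {}
    return {
        "title": seo.get("title"),
        "meta_description": seo.get("meta_description"),
        "headings": result.get("headings") or {},
    }


def diff_snapshots(old: dict, new: dict) -> dict:
    """Map each changed field to a (before, after) pair."""
    before, after = watched_fields(old), watched_fields(new)
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


# Hypothetical snapshots from two crawls of the same page.
yesterday = {"seo": {"title": "Old Title"}, "headings": {"h1": ["Old Title"]}}
today = {"seo": {"title": "New Title"}, "headings": {"h1": ["Old Title"]}}

print(diff_snapshots(yesterday, today))  # prints {'title': ('Old Title', 'New Title')}
```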

Example output

{
  "url": "https://example.com/article",
  "status_code": 200,
  "seo": {
    "title": "Example Article Title",
    "meta_description": "A concise description of the page content.",
    "canonical_url": "https://example.com/article",
    "robots": "index, follow",
    "word_count": 1842
  },
  "open_graph": {
    "og:title": "Example Article Title",
    "og:description": "A concise description of the page content.",
    "og:image": "https://example.com/og-image.jpg",
    "og:type": "article"
  },
  "twitter_card": {
    "twitter:card": "summary_large_image",
    "twitter:site": "@example"
  },
  "headings": {
    "h1": ["Example Article Title"],
    "h2": ["Introduction", "Key Concepts", "Conclusion"],
    "h3": ["Subpoint A", "Subpoint B"]
  },
  "links": {
    "internal": 12,
    "external": 5,
    "nofollow": 2
  }
}
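An agent can turn a result like this into a short list of findings. The checks below are a sketch, not the tool's own report: `audit_page` is a hypothetical helper, and rules such as "exactly one H1" are common conventions rather than requirements.

```python
def audit_page(result: dict) -> list:
    """Return human-readable issues found in a crawl result."""
    issues = []
    seo = result.get("seo") or {}
    h1s = (result.get("headings") or {}).get("h1") or []

    if not seo.get("title"):
        issues.append("missing title")
    if not seo.get("meta_description"):
        issues.append("missing meta description")
    if len(h1s) != 1:
        issues.append(f"expected exactly one h1, found {len(h1s)}")

    canonical = seo.get("canonical_url")
    if canonical and canonical != result.get("url"):
        issues.append("canonical URL differs from fetched URL")
    if "noindex" in (seo.get("robots") or "").lower():
        issues.append("page is marked noindex")
    return issues


# Trimmed copy of the example output above.
example = {
    "url": "https://example.com/article",
    "seo": {
        "title": "Example Article Title",
        "meta_description": "A concise description of the page content.",
        "canonical_url": "https://example.com/article",
        "robots": "index, follow",
    },
    "headings": {"h1": ["Example Article Title"]},
}

print(audit_page(example))  # prints []
```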

This tool makes an outbound HTTP request to the URL you provide. No external dependencies or API keys are required, and it works immediately after you install the MCP server.

Client integrations

Learn how to connect the agentic_web_crawl tool to your AI agent:

→ Setup for Claude Desktop

→ Setup for Cursor

→ Setup for Windsurf

→ Setup for VS Code

Explore other AgenticStore MCP tools

→ Web Search MCP Tool (query-based, SearXNG)

→ Python Code Linter MCP Tool

→ Secret & Repo Scanner MCP Tool

→ Dependency Vulnerability Scanner MCP Tool

Frequently asked questions