Web Crawl & Page Extractor MCP Tool
CLI Tool Name: agentic_web_crawl
Fetches any publicly accessible URL and returns structured page content with a full SEO signal report. Designed for AI agents that need to read web pages, audit content, or extract structured metadata — without relying on search APIs or browser automation.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | yes | The full URL to fetch and analyze, including the protocol (https://). |
What it extracts
Page text & content
Full readable text extracted from the page body, word count.
SEO metadata
Title, meta description, canonical URL, robots directives.
Open Graph tags
og:title, og:description, og:image, og:type — for social sharing previews.
Twitter Card
twitter:card, twitter:site, twitter:creator.
Heading structure
All H1–H6 headings extracted in order — reveals content architecture.
Link breakdown
Internal vs. external link counts, nofollow links.
Common use cases
Example output
{
"url": "https://example.com/article",
"status_code": 200,
"seo": {
"title": "Example Article Title",
"meta_description": "A concise description of the page content.",
"canonical_url": "https://example.com/article",
"robots": "index, follow",
"word_count": 1842
},
"open_graph": {
"og:title": "Example Article Title",
"og:description": "A concise description of the page content.",
"og:image": "https://example.com/og-image.jpg",
"og:type": "article"
},
"twitter_card": {
"twitter:card": "summary_large_image",
"twitter:site": "@example"
},
"headings": {
"h1": ["Example Article Title"],
"h2": ["Introduction", "Key Concepts", "Conclusion"],
"h3": ["Subpoint A", "Subpoint B"]
},
"links": {
"internal": 12,
"external": 5,
"nofollow": 2
}
}Client integrations
Learn how to connect the agentic_web_crawl tool to your AI agent: