glim_web_fetch
Active tool of glim.sh
Fetches a single web page and extracts clean content. The fetch tier is selected automatically on the server side: it handles SSR pages (Next.js, Nuxt, TikTok, Pinterest, YouTube), SPA shells, PDFs, paywall detection, residential-proxy escalation, and stealth profiles for TikTok / Instagram tags / Pinterest / Threads-search / YouTube.
Returns clean markdown (default) with a YAML frontmatter header (url, outcome, total_chars). Read 'outcome' to classify the result (success | teaser | thin_content | paywall | bot_challenge | consent_wall | login_wall | rate_limited | timeout | transient_upstream | unsupported_target | not_found | error). Large pages (>80k chars) are truncated inline, with truncated_chars and a download_full_url that points to the complete extraction (the link expires after about 1 hour).
Permanently unsupported (outcome=unsupported_target, cost=0 upstream): Bluesky searchPosts, IG profile/post pages, Threads profile/post pages, Truth Social, Xiaohongshu.
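A minimal sketch of how a caller might read the frontmatter and act on 'outcome'. It assumes the header is delimited by '---' lines and holds simple 'key: value' pairs, which the description does not spell out. The helper names (split_frontmatter, classify) and the grouping of outcomes into ok / partial / retry / blocked / give_up are illustrative choices, not part of the tool.

```python
# Hypothetical grouping of the documented outcome values; adjust to taste.
PARTIAL = {"teaser", "thin_content"}
RETRYABLE = {"rate_limited", "timeout", "transient_upstream"}
BLOCKED = {"paywall", "bot_challenge", "consent_wall", "login_wall"}


def split_frontmatter(text):
    """Split a response into (meta, body).

    Assumes a '---' delimited header of flat 'key: value' lines.
    Returns ({}, text) unchanged when no complete header is found.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the page content.
            return meta, "\n".join(lines[i + 1:])
        # Split on the first colon only, so URLs keep their 'https://' part.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # header was never closed


def classify(outcome):
    """Map an 'outcome' value to a coarse action for the caller."""
    if outcome == "success":
        return "ok"
    if outcome in PARTIAL:
        return "partial"
    if outcome in RETRYABLE:
        return "retry"
    if outcome in BLOCKED:
        return "blocked"
    # unsupported_target, not_found, error, or anything unrecognized
    return "give_up"


sample = (
    "---\n"
    "url: https://example.com/post\n"
    "outcome: teaser\n"
    "total_chars: 412\n"
    "---\n"
    "# Title\n\nBody text"
)
meta, body = split_frontmatter(sample)
action = classify(meta.get("outcome", "error"))
```

Treating unsupported_target, not_found, and error as non-retryable is a judgment call in this sketch; a caller with different retry policy would move values between the sets.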
Parameters schema
{
  "type": "object",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "required": [
    "url"
  ],
  "properties": {
    "url": {
      "type": "string",
      "description": "URL to fetch"
    },
    "format": {
      "enum": [
        "markdown",
        "html",
        "json"
      ],
      "type": "string",
      "default": "markdown",
      "description": "Output format. 'markdown' (default) clean article text; 'html' raw cleaned HTML; 'json' the structured SSR blob (TikTok / Pinterest / YouTube) instead of article text."
    },
    "selector": {
      "type": "string",
      "description": "CSS selector to scope extraction"
    }
  }
}
No endpoints wrapped at confidence ≥ 0.70.
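As an illustration of the parameters schema, the sketch below builds an arguments object and checks it by hand against the three declared properties. The function name build_arguments is hypothetical, and the non-empty check on 'url' is stricter than the schema itself, which only requires a string. How the arguments are sent to the server depends on the MCP client in use and is not shown.

```python
# Values taken from the "format" enum in the schema.
ALLOWED_FORMATS = ("markdown", "html", "json")


def build_arguments(url, fmt="markdown", selector=None):
    """Return an arguments dict for glim_web_fetch (hypothetical helper).

    Raises ValueError or TypeError when a value would not satisfy the schema.
    """
    if not isinstance(url, str) or not url:
        raise ValueError("'url' is required and must be a non-empty string")
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"'format' must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    args = {"url": url, "format": fmt}
    if selector is not None:
        if not isinstance(selector, str):
            raise TypeError("'selector' must be a string (a CSS selector)")
        # Optional: only included when the caller wants to scope extraction.
        args["selector"] = selector
    return args


# Scope extraction to the <article> element, keeping the default format.
article_args = build_arguments("https://example.com/article", selector="article")

# Ask for the structured SSR blob instead of article text.
blob_args = build_arguments("https://example.com/video", fmt="json")
```

Sending "format" explicitly even when it equals the default keeps the request self-describing; omitting it would be equally valid under the schema.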
Parent server
glim.sh
https://github.com/glim-sh/glim-mcp
1/7 registries