glim_web_fetch

Active

declared in 1.0.0

Fetch a single web page and extract clean content. Fetch tiering is automatic and server-side: it handles SSR (Next.js, Nuxt, TikTok, Pinterest, YouTube), SPA shells, PDFs, paywall detection, residential-proxy escalation, and stealth profiles for TikTok / Instagram tags / Pinterest / Threads-search / YouTube.

Returns clean markdown (default) with a YAML frontmatter header (url, outcome, total_chars). Read 'outcome' to classify the result (success | teaser | thin_content | paywall | bot_challenge | consent_wall | login_wall | rate_limited | timeout | transient_upstream | unsupported_target | not_found | error).

Large pages (>80k chars) are truncated inline, with truncated_chars and a download_full_url pointing to the complete extraction (expires in ~1h).

Permanently unsupported (outcome=unsupported_target, cost=0 upstream): Bluesky searchPosts, IG profile/post pages, Threads profile/post pages, Truth Social, Xiaohongshu.
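As a rough illustration of reading 'outcome' from the frontmatter, here is a minimal Python sketch. It assumes the header is delimited by '---' lines and holds flat 'key: value' pairs; the exact header layout is not documented above, so treat that as a guess. The grouping of outcomes into ok / partial / retry / blocked / failed is likewise a hypothetical client-side policy, not something the tool defines, and the helper names split_frontmatter and classify are invented for this example.

```python
# Hypothetical client-side grouping of the documented 'outcome' values.
RETRYABLE = {"rate_limited", "timeout", "transient_upstream"}
BLOCKED = {"paywall", "bot_challenge", "consent_wall", "login_wall"}
PARTIAL = {"teaser", "thin_content"}


def split_frontmatter(text):
    """Return (metadata dict, body) for text that starts with a '---' header.

    Assumes flat 'key: value' lines between two '---' delimiters.
    Text without a complete header is returned unchanged with empty metadata.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        # Split on the first colon only, so URLs keep their 'https://' part.
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}, text  # no closing delimiter found


def classify(outcome):
    """Map an outcome string to a coarse action category (assumed policy)."""
    if outcome == "success":
        return "ok"
    if outcome in PARTIAL:
        return "partial"
    if outcome in RETRYABLE:
        return "retry"
    if outcome in BLOCKED:
        return "blocked"
    # unsupported_target, not_found, error, and anything unrecognized
    return "failed"


sample = (
    "---\n"
    "url: https://example.com/post\n"
    "outcome: paywall\n"
    "total_chars: 412\n"
    "---\n"
    "# Title\n\nFirst paragraph."
)
meta, body = split_frontmatter(sample)
print(meta["outcome"], classify(meta["outcome"]))
```

Values are kept as strings here; a caller that needs total_chars as a number would convert it explicitly.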

Parameters schema

{
  "type": "object",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "required": [
    "url"
  ],
  "properties": {
    "url": {
      "type": "string",
      "description": "URL to fetch"
    },
    "format": {
      "enum": [
        "markdown",
        "html",
        "json"
      ],
      "type": "string",
      "default": "markdown",
      "description": "Output format. 'markdown' (default) clean article text; 'html' raw cleaned HTML; 'json' the structured SSR blob (TikTok / Pinterest / YouTube) instead of article text."
    },
    "selector": {
      "type": "string",
      "description": "CSS selector to scope extraction"
    }
  }
}
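The schema above can be exercised with a small Python sketch that builds an arguments object and wraps it in an MCP 'tools/call' JSON-RPC request. The helper names build_arguments and build_tool_call are hypothetical, the checks only mirror the constraints visible in the schema (required string url, enumerated format, optional string selector), and how the request is actually sent to the server is left out.

```python
import json

# Allowed values copied from the 'format' enum in the schema.
ALLOWED_FORMATS = ("markdown", "html", "json")


def build_arguments(url, format="markdown", selector=None):
    """Build the arguments object, enforcing the schema's visible constraints."""
    if not isinstance(url, str):
        raise ValueError("url is required and must be a string")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}")
    arguments = {"url": url, "format": format}
    if selector is not None:
        if not isinstance(selector, str):
            raise ValueError("selector must be a string")
        arguments["selector"] = selector
    return arguments


def build_tool_call(arguments, request_id=1):
    """Wrap arguments in a JSON-RPC 2.0 'tools/call' request for this tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "glim_web_fetch", "arguments": arguments},
    }


request = build_tool_call(
    build_arguments("https://example.com/article", selector="article")
)
print(json.dumps(request, indent=2))
```

Validating on the client before sending is optional; the server is the authority on what it accepts.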

What this tool wraps · 0 endpoints


No endpoints wrapped at confidence ≥ 0.70.

Parent server

glim.sh

https://github.com/glim-sh/glim-mcp

1/7 registries
