framefetch_extract

Active

Tool of io.github.MarvinRey7879/framefetch

declared in 1.1.0

Extract structured data from ONE public social-video URL (YouTube incl. Shorts, TikTok, Instagram Reels, Pinterest, Reddit).

Purpose: turn a video link into metadata (title, author, duration, date), insights (views/likes/comments), a transcript (captions, or Whisper when there are none; this works on TikTok/Reddit too), parametrically sampled video frames, and/or the on-screen text burned into those frames (OCR: captions, price tags, signage, lower-thirds).

When to use: you have a video URL and need its text, stats, frames, or on-screen text for analysis, summarization, or grounding a model.

When NOT to use: non-video pages, private/login-walled content, or bulk crawling (one URL per call).

Returns: one JSON object containing only the requested fields plus a `cost` block (micro-USD). Frames come back as time-limited signed image URLs; text_overlay returns one entry per frame with the OCR text, per-line confidence, and bounding boxes.

Cost/latency: metadata is sub-cent and fast; transcript is billed per audio-minute, frames per frame, and text_overlay per frame on top of that (all three also incur bandwidth for frames). Request only the fields you need and downscale frames via `width` to control cost.

Billing: a free tier covers light use; agents can also pay per call with x402 (USDC) with no account.

Example: { "url": "https://www.youtube.com/watch?v=...", "fields": ["metadata","transcript"], "frames": { "mode": "fps", "fps": 1, "width": 480 } }
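The field rules stated in the description and in the schema can be checked on the client before a call is made. The following is a minimal sketch in Python, not part of framefetch: `build_extract_args` is a hypothetical helper name, and it only assembles the arguments object; how that object is sent to the tool depends on the MCP client in use. Dropping an unneeded `frames` spec is a choice made for this sketch, since the description does not say how the tool treats a spec when no frame-based field is requested.

```python
from typing import Any, Optional

# Field names taken from the "fields" enum in the parameters schema.
VALID_FIELDS = {"metadata", "insights", "transcript", "frames", "text_overlay"}


def build_extract_args(
    url: str,
    fields: Optional[list[str]] = None,
    frames: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble an arguments object for framefetch_extract (hypothetical helper)."""
    # The documented default when "fields" is omitted is ["metadata"].
    requested = list(fields) if fields is not None else ["metadata"]

    unknown = set(requested) - VALID_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    # Documented rule: "text_overlay" requires "frames" to also be requested.
    if "text_overlay" in requested and "frames" not in requested:
        raise ValueError('"text_overlay" requires "frames" to also be requested')

    # Documented rule: a frames spec (with "mode") is required for frame-based fields.
    needs_frames = "frames" in requested
    if needs_frames and (frames is None or "mode" not in frames):
        raise ValueError('a "frames" spec with a "mode" is required')

    args: dict[str, Any] = {"url": url}
    if fields is not None:
        args["fields"] = requested
    if needs_frames:
        args["frames"] = dict(frames)  # copy so the caller's dict is not shared
    # Assumption of this sketch: a frames spec that no field needs is left out.
    return args
```

For instance, `build_extract_args("https://www.youtube.com/watch?v=...", ["metadata", "frames"], {"mode": "fps", "fps": 1, "width": 480})` yields an object with all three keys, while requesting `["text_overlay"]` alone raises a `ValueError`.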

Parameters schema

{
  "type": "object",
  "required": [
    "url"
  ],
  "properties": {
    "url": {
      "type": "string",
      "description": "Public video URL from a supported platform."
    },
    "fields": {
      "type": "array",
      "items": {
        "enum": [
          "metadata",
          "insights",
          "transcript",
          "frames",
          "text_overlay"
        ],
        "type": "string"
      },
      "description": "Which data to return. Default [\"metadata\"]. Request the minimum you need. \"text_overlay\" (on-screen text via OCR) requires \"frames\" to also be requested, and is auto-skipped (with a warning, not an error) above 200 estimated frames — OCR runs a subprocess per frame, so keep the frames spec narrow (fps/range) when you want it."
    },
    "frames": {
      "type": "object",
      "required": [
        "mode"
      ],
      "properties": {
        "n": {
          "type": "number",
          "description": "every_n: take every Nth source frame"
        },
        "to": {
          "type": "number",
          "description": "range: end second"
        },
        "fps": {
          "type": "number",
          "description": "fps/range: frames per second (max 60)"
        },
        "from": {
          "type": "number",
          "description": "range: start second"
        },
        "mode": {
          "enum": [
            "all",
            "every_n",
            "fps",
            "range"
          ],
          "type": "string"
        },
        "width": {
          "type": "number",
          "description": "downscale width in px (16–7680)"
        },
        "format": {
          "enum": [
            "jpg",
            "png",
            "webp"
          ],
          "type": "string"
        }
      },
      "description": "Frame-sampling spec; required when \"frames\" or \"text_overlay\" is requested."
    },
    "verbosity": {
      "enum": [
        "concise",
        "detailed"
      ],
      "type": "string"
    }
  }
}
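The numeric limits in the schema (fps at most 60, width between 16 and 7680 px) and the 200-frame OCR threshold can also be checked before calling the tool. The sketch below is illustrative only: `check_frames_spec` and `estimate_frames` are hypothetical names, the "fps must be positive" and "to must exceed from" checks are assumptions not stated in the schema, and the frame estimate is a rough client-side guess that may differ from the server's own estimate.

```python
from typing import Any, Optional

# From the "fields" description: text_overlay is auto-skipped above this estimate.
OCR_FRAME_LIMIT = 200


def check_frames_spec(spec: dict[str, Any]) -> list[str]:
    """Return a list of problems found in a frames spec (empty if none found)."""
    problems: list[str] = []
    mode = spec.get("mode")
    if mode not in {"all", "every_n", "fps", "range"}:
        problems.append("mode must be one of: all, every_n, fps, range")
    fps = spec.get("fps")
    # The schema gives only the upper bound; fps > 0 is an assumption.
    if fps is not None and not (0 < fps <= 60):
        problems.append("fps must be positive and at most 60")
    width = spec.get("width")
    if width is not None and not (16 <= width <= 7680):
        problems.append("width must be between 16 and 7680 px")
    fmt = spec.get("format")
    if fmt is not None and fmt not in {"jpg", "png", "webp"}:
        problems.append("format must be one of: jpg, png, webp")
    start, end = spec.get("from"), spec.get("to")
    # Assumption: a range must run forward in time.
    if mode == "range" and start is not None and end is not None and end <= start:
        problems.append("to must be greater than from")
    return problems


def estimate_frames(spec: dict[str, Any], duration_s: float) -> Optional[float]:
    """Rough frame-count estimate; None when it depends on the source frame rate."""
    mode = spec.get("mode")
    if mode == "fps" and "fps" in spec:
        return duration_s * spec["fps"]
    if mode == "range" and "fps" in spec:
        start = spec.get("from", 0.0)
        end = spec.get("to", duration_s)
        return max(0.0, end - start) * spec["fps"]
    return None  # "all" and "every_n" need the source frame rate, unknown here
```

Under these assumptions, a 300-second video sampled at 1 fps gives an estimate of 300 frames, above `OCR_FRAME_LIMIT`, so a `text_overlay` request would likely be skipped; narrowing to a 30-second range at 2 fps gives 60.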

What this tool wraps · 2 endpoints

Parent server

io.github.MarvinRey7879/framefetch

https://github.com/MarvinRey7879/framefetch

1/7 registries
